2.66: You can  define a function for this problem with dummy variable fun(dummy) and use sapply(1:200,fun).
But a natural way would be to use "flow control" such as "for". Look at >help("for") # notice that "? for" will not work...
For part d) you may need "if".
So, write a function for one trial. Then include for(i in 1:200) and make appropriate changes... Store all output
data in a data frame (read chapters 5.1 and 5.2 here)
Something like this...

function(n)
{
X=c(4,8,12,16,20)
EY=20+4*X
Yh10=0
left.limit=0
right.limit=0

for(i in 1:n)
 {
 e=rnorm(5,sd=5)
 Y=EY+e
 lr=lm(Y~X)
 b=coef(lr)
 Yh10[i]=b[[1]]+b[[2]]*10
 sde=sqrt(deviance(lr)/df.residual(lr)) #this computes standard deviation of the residuals
 left.limit[i]=....
 right.limit[i]=....


 #plot(X,Y)
 #abline(lr)

 #points(10,Yh10[i],pch="*")
 #segments(10,left.limit[i],10,right.limit[i])
  }
data.frame(Yh10,left.limit,right.limit)
}


I give you my version of the function which works with matrices and do not use for or if. If you decide
to use this version as a basis for your solution, then you must clearly comment all lines (just try them and see what they do).

p266 = function(n,a)
{
  X=c(4,8,12,16,20)
  EY=20+4*X
  EY10=20+4*10

    e=rnorm(5*n,sd=5)
    dim(e)=c(5,n)
    Y=EY+e
    lr=lm(Y~X)
    b=coef(lr)
    Yh10=b[1,]+b[2,]*10
    dev.e=deviance(lr)/df.residual(lr)
    dev.Yh10 = dev.e*( 1/length(X)+(10-mean(X))^2/sum( (X-mean(X))^2 ) )
    s.Yh10=sqrt(dev.Yh10)
    left.limit=Yh10-qt(1-a/2,df.residual(lr))*s.Yh10
    right.limit=Yh10+qt(1-a/2,df.residual(lr))*s.Yh10
    inside = ( (left.limit<EY10) & (EY10<right.limit) )
    
data.frame(Yh10,left.limit,right.limit,b[2,],inside)

# To get answer for part d): > p = p266(200,0.05)     > sum(p$inside)/length(p$inside)

#  The following can be used if you want to get an answer for part d) inside the function
#  list( conf.int.data = data.frame(Yh10,left.limit,right.limit,b[2,],inside),
#       prop.inside   = sum(inside)/length(inside)  )
}

 

Note: When you write a function (say, fun() ) and need to "fix" it, use fix(fun). This way you do not need to re-define it every time
you want make a change.



 3.4: When you create several plots of the same data, you may find useful par(mfrow=c(2,3)). Run it and plot several graphs.
Try c(2,1)....
a) You can built a histogram instead of a dot plot: hist. Read a help of hist on the argument breaks and use an "appropriate"
number of "breaks". Also, use dotchart. This plots a data as dots with a "time" order on a vertical axis.
c) stem is what you need. Recall that you do not have to compute residuals once you regressed your data
(lr$res - how to find it out if you do not remember?). Try stem(lr$res, scale=2)... A look at sorted residuals sort(lr$res)
can help you to figure out how stem-and-leaf diagram works in R.
d) Again, do not compute Y^hat_i. It is a part of lm-object.
e) cor(x,y) gives correlation coefficient between x and y. Note: ordered residuals should be used in the test!
f) The plot is straightforward. acf gives a formal mean to check time correlation... but more on this in "time series" part of the course.
g) Conduct Brown-Forsythe instead of Breusch-Pagan.