STAT 627 UCLA Multicollinearity Scatterplots Variables Relationship Worksheet
Questions from the textbook ISLR
Multicollinearity
1. (Page 125, chap. 3, #14). This problem focuses on multicollinearity.
- (a) Perform the following commands in R:
> set.seed(1)
> x1 = runif(100)
> x2 = 0.5*x1 + rnorm(100)/10
> y = 2 + 2*x1 + 0.3*x2 + rnorm(100)
The last line corresponds to creating a linear model in which y is a function of x1 and x2. Write out the form of the linear model. What are the regression coefficients?
- (b) What is the correlation between x1 and x2? Create a scatterplot displaying the relationship between the variables.
- (c) Using this data, fit a least squares regression to predict y using x1 and x2. Describe the results obtained. What are β̂0, β̂1, and β̂2? What are the true β0, β1, and β2? Can you reject the null hypothesis H0 : β1 = 0? How about the null hypothesis H0 : β2 = 0?
- (d) Now fit a least squares regression to predict y using only x1. Comment on your results. Can you reject the null hypothesis H0 : β1 = 0?
- (e) Now fit a least squares regression to predict y using only x2. Comment on your results. Can you reject the null hypothesis H0 : β2 = 0?
- (f) Do the results obtained in (c)-(e) contradict each other? Explain your answer.
- (g) Now suppose we obtain one additional observation, which was unfortunately mismeasured. Use the following R code.
> x1 = c(x1, 0.1)
> x2 = c(x2, 0.8)
> y = c(y, 6)
Re-fit the linear models from (c) to (e) using this new data. What effect does this new observation have on each of the models? In each model, is this observation an outlier? A high-leverage point? Both? Explain your answers. How do the slopes from all the considered models react to the newly added data point?
- (h) What are the standard errors of the estimated regression slopes in (a), (d), and (e)? Which models produce more stable, and therefore more reliable, estimates?
- (i) Compute both VIFs in question (a) and relate them to your answer to question (h).
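The computations asked for in parts (a) through (i) can be sketched in base R roughly as follows. This is only one possible setup, not a worked solution: the object names (fit_both, fit_x1, fit_x2, x1_new, and so on) are illustrative choices, and the VIFs are computed by hand as 1/(1 - R^2) from auxiliary regressions rather than with a package function such as vif() from the car package. The new observation is stored in separate vectors so that the original 100 points stay untouched.

```r
# (a) Simulate the data exactly as specified in the problem.
set.seed(1)
x1 <- runif(100)
x2 <- 0.5 * x1 + rnorm(100) / 10
y  <- 2 + 2 * x1 + 0.3 * x2 + rnorm(100)

# (b) Correlation between the predictors and a scatterplot.
r12 <- cor(x1, x2)
plot(x1, x2, main = "x1 versus x2")

# (c)-(e) Fit the joint model and the two single-predictor models.
fit_both <- lm(y ~ x1 + x2)
fit_x1   <- lm(y ~ x1)
fit_x2   <- lm(y ~ x2)

# Coefficient tables: estimates, standard errors, t values, p values.
tab_both <- coef(summary(fit_both))
tab_x1   <- coef(summary(fit_x1))
tab_x2   <- coef(summary(fit_x2))

# (h) Standard errors of the slopes in each model.
se_both <- tab_both[c("x1", "x2"), "Std. Error"]
se_x1   <- tab_x1["x1", "Std. Error"]
se_x2   <- tab_x2["x2", "Std. Error"]

# (i) VIF for each predictor: regress it on the other predictor
# and use 1 / (1 - R^2). With two predictors both VIFs coincide.
vif_x1 <- 1 / (1 - summary(lm(x1 ~ x2))$r.squared)
vif_x2 <- 1 / (1 - summary(lm(x2 ~ x1))$r.squared)

# (g) Append the mismeasured observation and refit all three models.
x1_new <- c(x1, 0.1)
x2_new <- c(x2, 0.8)
y_new  <- c(y, 6)
fit_both_new <- lm(y_new ~ x1_new + x2_new)
fit_x1_new   <- lm(y_new ~ x1_new)
fit_x2_new   <- lm(y_new ~ x2_new)

# Leverage (hat value) and studentized residual of observation 101
# in each refitted model, for judging leverage and outlyingness.
lev_101   <- sapply(list(fit_both_new, fit_x1_new, fit_x2_new),
                    function(m) hatvalues(m)[101])
rstud_101 <- sapply(list(fit_both_new, fit_x1_new, fit_x2_new),
                    function(m) rstudent(m)[101])
```

Printing tab_both, tab_x1, and tab_x2 gives the estimates and p-values needed for (c) through (f); comparing coef(fit_both) with coef(fit_both_new), and likewise for the other two models, shows how the slopes react in (g). The values in lev_101 can be set against the average leverage (p + 1)/n for each model, and plot(fit_both_new) gives the usual diagnostic plots as a visual check.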
Multicollinearity (Sec. 3.3.3). Classification methods: logistic regression and K-nearest neighbor (Sec. 2.2.3, 3.5, 4.1-4.3)
Due February 15. Quiz #3 is on Feb 15.