Linear Regression: Studying the Relationship Between Two Variables (Questions)

Y       X
0.07    9
0.09    9
0.08    9
0.16    7
0.17    7
0.21    7
0.49    5
0.58    5
0.53    5
1.22    3
1.15    3
1.07    3
2.84    1
2.57    1
3.1     1
1. A researcher studies the relationship between two variables X and Y. The data are shown below.
Y       X
5.5     1
5.1     1
6.9     2
5.8     2
6.2     3
6.1     3
6.9     4
5.4     4
5.6     5
a) Use R to fit the linear regression model to the data and attach the code. Give a
summary of the result.
Then determine the following items and fill in the blanks.
The intercept of the linear model is ________, and the slope is _______. The standard error for the
intercept estimate is ________, and for the slope it is _________. The linear correlation coefficient
between X and Y is ___________, and judging by the sign of ______, X and Y are
_____(positively/negatively) related. The MSE=_________. SSE=_________.
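The quantities asked for above can all be read off a single lm fit. The following is a minimal sketch, assuming the nine (Y, X) pairs listed in the table for question 1; the object names (dat, fit, coef_table and so on) are illustrative and not prescribed by the question.

```r
# Question 1 data, typed in by hand (assumption: the nine pairs as listed).
dat <- data.frame(
  y = c(5.5, 5.1, 6.9, 5.8, 6.2, 6.1, 6.9, 5.4, 5.6),
  x = c(1, 1, 2, 2, 3, 3, 4, 4, 5)
)

# Simple linear regression of Y on X.
fit <- lm(y ~ x, data = dat)
fit_summary <- summary(fit)

# Coefficient table: estimates, standard errors, t values, p-values.
coef_table   <- fit_summary$coefficients
intercept    <- coef_table["(Intercept)", "Estimate"]
slope        <- coef_table["x", "Estimate"]
se_intercept <- coef_table["(Intercept)", "Std. Error"]
se_slope     <- coef_table["x", "Std. Error"]

# Linear correlation coefficient between X and Y.
r <- cor(dat$x, dat$y)

# SSE is the residual sum of squares; MSE divides it by the residual df (n - 2).
sse <- sum(residuals(fit)^2)
mse <- sse / df.residual(fit)

print(fit_summary)
print(c(intercept = intercept, slope = slope,
        se_intercept = se_intercept, se_slope = se_slope,
        r = r, SSE = sse, MSE = mse))
```

In simple linear regression the sign of r always matches the sign of the slope estimate, so either one answers the positively/negatively blank.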
Consider the following hypothesis test.
𝐻0 : 𝛽1 = 0
𝐻𝑎 : 𝛽1 ≠ 0
If a t-test is performed, the test statistic is ______, and the p-value is _______.
If an F-test is performed, the test statistic is _______, and the p-value is _______.
How are the t-test statistic and the F statistic related? _________________.
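Both test statistics can be pulled from the same summary object and compared numerically. This is a minimal sketch, again assuming the nine pairs listed for question 1; the variable names are illustrative.

```r
# Question 1 data (assumption: the nine pairs as listed in the table).
dat <- data.frame(
  y = c(5.5, 5.1, 6.9, 5.8, 6.2, 6.1, 6.9, 5.4, 5.6),
  x = c(1, 1, 2, 2, 3, 3, 4, 4, 5)
)
s <- summary(lm(y ~ x, data = dat))

# t-test for the slope: statistic and two-sided p-value.
t_stat <- s$coefficients["x", "t value"]
t_pval <- s$coefficients["x", "Pr(>|t|)"]

# Overall F-test: summary() stores the statistic and its degrees of freedom;
# the p-value is the upper tail of the F distribution.
f_stat <- s$fstatistic[["value"]]
f_pval <- pf(f_stat, s$fstatistic[["numdf"]], s$fstatistic[["dendf"]],
             lower.tail = FALSE)

cat("t =", t_stat, " t^2 =", t_stat^2, " F =", f_stat, "\n")
cat("p (t-test) =", t_pval, " p (F-test) =", f_pval, "\n")
```

With a single predictor the squared t statistic for the slope equals the F statistic, and the two p-values coincide, which the printed values should confirm.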
The R output gives the p-value for the non-directional (two-sided) test only. Discuss how to modify the p-value given in the output
if the hypotheses are changed to
𝐻0 : 𝛽1 = 0
𝐻𝑎 : 𝛽1 > 0
and
𝐻0 : 𝛽1 = 0
𝐻𝑎 : 𝛽1 < 0
b) Using the linear correlation coefficient found in a), find a 90% confidence interval for the correlation coefficient.
The Fisher z transformation value z′ is ___________.
Compute the confidence interval for z′:
Then back-transform to get the confidence interval for the linear correlation coefficient:
7 5
According to the confidence interval, do you think X and Y have a _________(significant/insignificant) linear association?
c) Perform a lack-of-fit test for this question. Compute SSPE, SSLF, dfPE and dfLF from the data set. Verify that SSPE + SSLF = SSE. Compute the F test statistic = MSLF/MSPE, then state the conclusion of the lack-of-fit test.
d) Use R to define the full model and the reduced model, then verify your computation in c).

2. Use R to simulate a data set with a violation of linearity,
𝑌 = 𝑋³ + 10𝑋 + 20 + 𝑁(0, 25),
where X is a sequence ranging from 1 to 100, each value repeated twice, i.e., x is rep(seq(1:100), 2).
a) Draw and comment on the scatter plot and the residual plot. In which plot is the violation of linearity easier to see?
b) Show the lm regression result. 𝑅² = _______, and the F test for the linear effect is _________(significant/insignificant).
c) Perform a lack-of-fit test of the model on the data. The F test for lack of fit is ______(significant/insignificant), meaning the model ______(fits/doesn't fit).
d) In your own words, discuss the consequences of fitting a linear model to nonlinear data.

3. This question is on the model diagnostic procedure and requires R. Highlight and interpret the relevant R outputs when answering the question.
A chemist studied the concentration of a solution (Y) over time (X). Fifteen identical solutions were prepared. The 15 solutions were randomly divided into five sets of three, and the five sets were measured, respectively, after 1, 3, 5, 7 and 9 hours. The results are in concentration.xlxs. Perform the diagnostics on the data.
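As a starting point for the diagnostics, the data can be entered and the straight-line fit compared with a full model that has one mean per time point (the same full/reduced comparison as in question 1 d). This is a minimal sketch, assuming the fifteen (Y, X) pairs in the table at the top of this sheet are the contents of the concentration file; the object names conc, reduced and full are illustrative.

```r
# ASSUMPTION: these 15 pairs (table at the top of the sheet) are the
# concentration data; y = concentration, x = time in hours.
conc <- data.frame(
  y = c(0.07, 0.09, 0.08, 0.16, 0.17, 0.21, 0.49, 0.58, 0.53,
        1.22, 1.15, 1.07, 2.84, 2.57, 3.10),
  x = rep(c(9, 7, 5, 3, 1), each = 3)
)

# Reduced model: a straight line in x.
reduced <- lm(y ~ x, data = conc)

# Full model: a separate mean at each distinct x (x treated as a factor).
full <- lm(y ~ factor(x), data = conc)

# Sums of squares and degrees of freedom for the lack-of-fit decomposition.
sse   <- deviance(reduced)        # residual SS of the straight-line fit
sspe  <- deviance(full)           # pure-error SS
sslf  <- sse - sspe               # lack-of-fit SS
df_pe <- df.residual(full)
df_lf <- df.residual(reduced) - df_pe

# F statistic = MSLF / MSPE, with its upper-tail p-value.
f_lof <- (sslf / df_lf) / (sspe / df_pe)
p_lof <- pf(f_lof, df_lf, df_pe, lower.tail = FALSE)

# anova() on the nested pair reports the same F test in one table.
lof_table <- anova(reduced, full)
print(lof_table)
```

The hand-computed f_lof should agree with the F column of the anova table, which gives a quick check on the arithmetic.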
Hint: you can use residualPlots(Model) to check the residual plots, and obtain the residuals via Model$residuals, where Model is the result of the linear regression function, lm(y~x).
a) List all the assumptions we need to check for a linear regression model.
b) Draw the scatter plot of X and Y. Do they appear linearly related to you?
c) Plot the dependent variable versus the explanatory variable and comment on the shape and any unusual points. Plot the residuals versus the explanatory variable and briefly describe the plot, noting any unusual patterns or points.
d) Perform a Brown-Forsythe test to assess whether the random errors have constant variance. State Ho and Ha for the Brown-Forsythe test, then conclude whether the random errors have constant variance at the 0.05 level.
e) Examine the distribution of the random errors via the residuals. Produce a QQ plot and perform a Shapiro test on the residuals, state Ho and Ha for the Shapiro test, and then conclude whether the residuals follow a Normal distribution.
f) Perform a lack-of-fit test on the data and fill in the following blanks based on the R output: SSPE=________, SSLF=________, MSLF=_________, MSPE=________, Fs=___________, P-value=________. The model _____(fits/does not fit) the data at the 0.05 level.

4. Refer to question 3. This question is on the model remedy procedure and requires R. Highlight and interpret the relevant R outputs when answering the question.
a) When assumptions are violated, we could consider transforming X or Y, or both. For non-constant error variance or non-normal errors, we will consider transforming Y with a Box-Cox procedure. Use boxcox() from the MASS package and boxcox.sse() to determine the Box-Cox parameter, λ. The boxcox() procedure suggests using λ = ______ and the boxcox.sse() procedure suggests using λ = ________. The transformation function on Y should therefore be Y′ = __________.
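The boxcox() step can be sketched as follows. This is a minimal example, assuming the fifteen (Y, X) pairs in the table at the top of this sheet are the concentration data; the grid of λ values and the names conc, bc and lambda_hat are illustrative choices. boxcox.sse() is not part of MASS, so it is left out here.

```r
library(MASS)  # provides boxcox()

# ASSUMPTION: concentration data as listed at the top of the sheet.
conc <- data.frame(
  y = c(0.07, 0.09, 0.08, 0.16, 0.17, 0.21, 0.49, 0.58, 0.53,
        1.22, 1.15, 1.07, 2.84, 2.57, 3.10),
  x = rep(c(9, 7, 5, 3, 1), each = 3)
)

# Profile log-likelihood over a grid of lambda values.
# plotit = FALSE returns the grid (bc$x) and log-likelihoods (bc$y)
# without drawing; set plotit = TRUE to see the curve and its 95% interval.
bc <- boxcox(y ~ x, data = conc, lambda = seq(-2, 2, by = 0.05),
             plotit = FALSE)

# Grid value of lambda that maximizes the log-likelihood.
lambda_hat <- bc$x[which.max(bc$y)]
print(lambda_hat)
```

In practice the maximizing value is usually rounded to a convenient nearby power (for example 0 for a log transformation or 0.5 for a square root) when that value lies inside the plotted confidence interval.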
b) Create a variable for Y′ in the data set, then perform the diagnostic procedure to verify that a linear model suits the data.
c) Use R to fit a linear regression on the transformed response variable Y′. Then compute the point estimate and standard error for predicting the mean of this response when 𝑋ℎ = 7.5, and compute a 90% confidence interval for the mean response when 𝑋ℎ = 7.5. Conclude that we are 90% confident that the mean response value Y′ is at least ______ and at most ______.
d) Back-transform the confidence interval to get a 90% confidence interval for the (original) mean response value Y.
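Parts c) and d) can be sketched with predict(). This is a minimal example under two assumptions that are not given in the question: the data are the fifteen pairs in the table at the top of this sheet, and the transformation is the natural log (λ = 0), used here purely for illustration. Substitute whichever transformation the Box-Cox step in a) suggests, together with its matching inverse.

```r
# ASSUMPTION: concentration data as listed at the top of the sheet.
conc <- data.frame(
  y = c(0.07, 0.09, 0.08, 0.16, 0.17, 0.21, 0.49, 0.58, 0.53,
        1.22, 1.15, 1.07, 2.84, 2.57, 3.10),
  x = rep(c(9, 7, 5, 3, 1), each = 3)
)

# ASSUMPTION: log transformation, for illustration only.
conc$y_prime <- log(conc$y)

# Linear regression on the transformed response.
fit_t <- lm(y_prime ~ x, data = conc)

# Mean response at Xh = 7.5: point estimate, standard error and 90% CI.
pred <- predict(fit_t, newdata = data.frame(x = 7.5),
                se.fit = TRUE, interval = "confidence", level = 0.90)
point_est <- pred$fit[1, "fit"]
se_mean   <- pred$se.fit
ci_prime  <- pred$fit[1, c("lwr", "upr")]

# Back-transform the interval endpoints with the inverse of the
# transformation (exp() undoes log()).
ci_orig <- exp(ci_prime)

print(c(estimate = point_est, se = se_mean))
print(ci_prime)
print(ci_orig)
```

Because the inverse transformation is monotone increasing, the back-transformed endpoints keep their order and the 90% coverage carries over to the original scale.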
