University California San Diego R Studio Computer Science Questions

Stat 311 Homework 8
This assignment uses the file PenguinsHW8.csv. This is a popular data set; the data were collected and
made available by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER, a member of the Long
Term Ecological Research Network (Gorman, Williams, and Fraser 2014). This version of the data set was
downloaded from here (on GitHub Gist).
The goal is to predict penguin bill length from penguin bill depth and perhaps species. Bill length and bill
depth are shown in the image below (Image copied from here.)
Regression Model
1. In the HW8 template, we provide code that creates a scatterplot of bill length on bill depth. Describe the
joint relationship between bill length and depth.
2. In the HW8 template, we provide code that creates a second scatterplot that adds color coded plotting
symbols by species. Describe the joint relationship(s) taking species into account.
3. Run a linear regression of bill length on bill depth. Show the regression summary (no interpretations
needed).
4. Write out the regression equation using information from the regression summary.
5. Interpret the estimated slope of the regression equation in the context of the problem.
6. Report and interpret $R^2$.
7. In the HW8 template, we provide ggplot code to create model diagnostic plots (residual plot, histogram
of residuals, normal QQ plot of residuals). Do you think that the assumptions for inference are met?
Explain addressing specific assumptions.
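The steps in problems 3 to 6 can be sketched in R as follows. This is only an illustration under stated assumptions: the column names in PenguinsHW8.csv are not shown here, so the sketch uses simulated data with hypothetical column names (`bill_depth`, `bill_length`). The simulated coefficients are arbitrary and say nothing about the real penguin data.

```r
# Simulated stand-in for PenguinsHW8.csv (assumption: the real column
# names and values differ; these are hypothetical and arbitrary).
set.seed(311)
n <- 100
sim <- data.frame(bill_depth = runif(n, min = 13, max = 21))
sim$bill_length <- 30 + 0.8 * sim$bill_depth + rnorm(n, sd = 4)

# Simple linear regression of bill length on bill depth
fit <- lm(bill_length ~ bill_depth, data = sim)
fit_summary <- summary(fit)
print(fit_summary)

# Pull out the pieces needed for the regression equation and R^2
b0 <- coef(fit)[["(Intercept)"]]
b1 <- coef(fit)[["bill_depth"]]
r_squared <- fit_summary$r.squared

cat(sprintf("predicted bill_length = %.2f + (%.2f) * bill_depth\n", b0, b1))
cat(sprintf("R-squared = %.3f\n", r_squared))
```

With the real data, the same calls would apply after reading the file with `read.csv()` and substituting the actual column names.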
Inference for Regression (assume all assumptions for inference are met)
8. Find a 95% confidence interval for the slope parameter using “by-hand” calculations in a code chunk.
Interpret the interval. [Hint: pull the numbers you need from the regression summary]
9. Perform a hypothesis test to determine if the slope parameter differs from zero. State the hypotheses
using symbols, report the test statistic and degrees of freedom, and the p-value. Include your decision and
an interpretation in the context of the problem. Use a 5% significance level. [Hint: no code needed]
10. Observation 194 is a Gentoo penguin with a bill length of 49.6 mm and a bill depth of 16.0 mm. Using
the model from problem 4, what is the residual for this penguin? Does the model overestimate or underestimate
the bill length for this penguin?
11. Find the 99% confidence interval for the mean bill length when the bill depth is 16.0 mm. Interpret this
interval.
12. Find the 99% prediction interval for a penguin that has a bill depth of 16.0 mm. Interpret this interval.
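The inference steps in problems 8 to 12 can be sketched as below. This is a hedged illustration, not the template's code. It uses simulated data with hypothetical column names (`bill_depth`, `bill_length`) and arbitrary coefficients, so the numbers it prints are not answers for the real data. The by-hand interval uses the usual form, estimate plus or minus t* times the standard error, with n - 2 degrees of freedom.

```r
# Simulated stand-in data (assumption: hypothetical names, arbitrary values)
set.seed(311)
n <- 100
sim <- data.frame(bill_depth = runif(n, min = 13, max = 21))
sim$bill_length <- 30 + 0.8 * sim$bill_depth + rnorm(n, sd = 4)
fit <- lm(bill_length ~ bill_depth, data = sim)

# Slope estimate and standard error from the coefficient table
coefs <- summary(fit)$coefficients
b1 <- coefs["bill_depth", "Estimate"]
se_b1 <- coefs["bill_depth", "Std. Error"]
df <- fit$df.residual  # n - 2 for simple linear regression

# 95% confidence interval for the slope, computed "by hand"
t_star <- qt(0.975, df)
ci_by_hand <- c(b1 - t_star * se_b1, b1 + t_star * se_b1)

# Test of H0: beta1 = 0 against Ha: beta1 != 0
t_stat <- b1 / se_b1
p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

# Residual = observed - predicted, for a penguin with bill depth 16.0 mm
# and observed bill length 49.6 mm (values from problem 10)
new_obs <- data.frame(bill_depth = 16.0)
residual_obs <- 49.6 - unname(predict(fit, newdata = new_obs))

# 99% confidence interval for the mean response and
# 99% prediction interval for a single new penguin
ci_mean <- predict(fit, newdata = new_obs, interval = "confidence", level = 0.99)
pi_new <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.99)

print(ci_by_hand)
cat(sprintf("t = %.3f on %d df, p-value = %.4g\n", t_stat, df, p_value))
cat(sprintf("residual = %.2f\n", residual_obs))
print(ci_mean)
print(pi_new)
```

A positive residual means the model underestimates the observed value; a negative residual means it overestimates. The prediction interval is always wider than the confidence interval at the same bill depth because it also accounts for variation among individual penguins.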
Consider Species
13. In the HW8 template, we include code to run a model that includes bill depth and species. We start by
allowing the slopes to vary by species (using an asterisk between bill depth and species). Since there are
three species, the first species (Adelie) will be included in the intercept term and the other two species
will have their own coefficients. You should see that both additional interaction terms are statistically
significant. What does this mean?
14. Write out the equation for the model for Gentoo penguins, rounding all coefficients to two decimal
places.
15. Calculate the residual for observation 194 (see Problem 10) based on the new model. Based on the
residual, does this new model do a better job of predicting the bill length for the Gentoo penguin in
observation 194?
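How a species-specific equation falls out of an interaction model can be sketched as below. The data are simulated and all names (`bill_depth`, `bill_length`, `species`) and coefficient values are hypothetical assumptions, so the output is not the answer for the real data. The point is only the bookkeeping: the Gentoo intercept is the baseline intercept plus the Gentoo coefficient, and the Gentoo slope is the baseline slope plus the Gentoo interaction coefficient.

```r
# Simulated stand-in data with three species (assumption: arbitrary values)
set.seed(311)
species <- rep(c("Adelie", "Chinstrap", "Gentoo"), each = 40)
bill_depth <- runif(120, min = 13, max = 21)
true_int <- c(Adelie = 20, Chinstrap = 15, Gentoo = 10)
true_slope <- c(Adelie = 1.0, Chinstrap = 1.8, Gentoo = 2.4)
bill_length <- unname(true_int[species]) +
  unname(true_slope[species]) * bill_depth + rnorm(120, sd = 2)
sim <- data.frame(bill_length, bill_depth, species = factor(species))

# The asterisk adds main effects and the depth-by-species interaction;
# Adelie is the baseline because it is the first factor level.
fit_int <- lm(bill_length ~ bill_depth * species, data = sim)
b <- coef(fit_int)

# Gentoo line = baseline terms + Gentoo adjustments
gentoo_intercept <- b[["(Intercept)"]] + b[["speciesGentoo"]]
gentoo_slope <- b[["bill_depth"]] + b[["bill_depth:speciesGentoo"]]
cat(sprintf("Gentoo: predicted bill_length = %.2f + (%.2f) * bill_depth\n",
            gentoo_intercept, gentoo_slope))

# Predicted bill length for a Gentoo penguin with bill depth 16.0 mm
gentoo_pred <- gentoo_intercept + gentoo_slope * 16.0
cat(sprintf("prediction at 16.0 mm = %.2f\n", gentoo_pred))
```

Comparing the absolute residuals from the two models for the same observation shows which model predicts that penguin more closely.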
Inference for Regression: Using R
Lesson 8, Lecture 2
Recap and Going Forward
• In Lecture 1, we covered inference for the regression
slope, as well as confidence intervals for mean values
and prediction intervals for new individual values of
y given x.
• In this lecture we will show you how to use R for
linear regression inference.
Stat 311 – Cardoso
Berkeley Study Data
Data from the Berkeley guidance study of children
born in 1928-29 in Berkeley, CA. Built-in data from the
alr4 package.
r = 0.64
Fit a Regression Model for Height on Weight
lm.out
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   70.142      1.849  37.937
