Stat 311 Homework 8

This assignment uses the file PenguinsHW8.csv. This is a popular data set; the data were collected and made available by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER, a member of the Long Term Ecological Research Network (Gorman, Williams, and Fraser 2014). This version of the data set was downloaded from GitHub Gist.

The goal is to predict penguin bill length from penguin bill depth and perhaps species. Bill length and bill depth are shown in the image below. (Image copied from here.)

Regression Model

1. In the HW8 template, we provide code that creates a scatterplot of bill length on bill depth. Describe the joint relationship between bill length and depth.

2. In the HW8 template, we provide code that creates a second scatterplot that adds color coded plotting symbols by species. Describe the joint relationship(s) taking species into account.

3. Run a linear regression of bill length on bill depth. Show the regression summary (no interpretations needed).

4. Write out the regression equation using information from the regression summary.

5. Interpret the estimated slope of the regression equation in the context of the problem.

6. Report and interpret 𝑅².

7. In the HW8 template, we provide ggplot code to create model diagnostic plots (residual plot, histogram of residuals, normal QQ plot of residuals). Do you think that the assumptions for inference are met? Explain, addressing specific assumptions.
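
The fitting and reporting steps in Problems 3 through 6 can be sketched as follows. This is only an illustration under stated assumptions: it uses R's built-in cars data (columns speed and dist) in place of PenguinsHW8.csv, and the object names fit, b, and r2 are placeholders, not names from the HW8 template.

```r
# Illustration only: built-in `cars` data, not PenguinsHW8.csv.
fit <- lm(dist ~ speed, data = cars)    # regress stopping distance on speed

summary(fit)                            # coefficient table, residual SE, R-squared

# Pull the estimated intercept and slope to write out the fitted equation.
b <- coef(fit)                          # named vector: (Intercept), speed
cat(sprintf("predicted dist = %.2f + %.2f * speed\n", b[1], b[2]))

# R-squared: proportion of the variation in dist explained by the model.
r2 <- summary(fit)$r.squared
r2

# In simple linear regression, R-squared equals the squared correlation.
all.equal(r2, cor(cars$speed, cars$dist)^2)
```

For the homework, the same calls would presumably be applied to the penguin data frame with bill length as the response and bill depth as the predictor, using whatever column names the file provides.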

Inference for Regression (assume all assumptions for inference are met)

8. Find a 95% confidence interval for the slope parameter using “by-hand” calculations in a code chunk. Interpret the interval. [Hint: pull the numbers you need from the regression summary.]

9. Perform a hypothesis test to determine if the slope parameter is different from zero. State the hypotheses using symbols, and report the test statistic, degrees of freedom, and p-value. Include your decision and an interpretation in the context of the problem. Use a 5% significance level. [Hint: no code needed.]

10. Observation 194 is a Gentoo penguin with a bill length of 49.6 mm and a bill depth of 16.0 mm. Using the model from Problem 4, what is the residual for this penguin? Does the model overestimate or underestimate the bill length for this penguin?

11. Find the 99% confidence interval for the mean bill length when the bill depth is 16.0 mm. Interpret this interval.

12. Find the 99% prediction interval for a penguin that has a bill depth of 16.0 mm. Interpret this interval.
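
The calculations behind Problems 8 through 12 might look like the sketch below. It again assumes the built-in cars data rather than the penguin file; the value speed = 15 and the choice of row 1 are arbitrary stand-ins, and all object names are hypothetical.

```r
# Illustration only: built-in `cars` data stands in for the penguin data.
fit   <- lm(dist ~ speed, data = cars)
coefs <- summary(fit)$coefficients      # columns: Estimate, Std. Error, t value, Pr(>|t|)

b1  <- coefs["speed", "Estimate"]
se1 <- coefs["speed", "Std. Error"]
df  <- df.residual(fit)                 # n - 2 for simple linear regression

# 95% CI for the slope "by hand": estimate +/- t* x SE
t_star     <- qt(0.975, df)
ci_by_hand <- c(b1 - t_star * se1, b1 + t_star * se1)
ci_by_hand
confint(fit, "speed", level = 0.95)     # built-in check of the by-hand interval

# Test statistic and two-sided p-value for H0: slope = 0
t_stat <- b1 / se1
p_val  <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

# Residual = observed - predicted, shown here for row 1 of the data
resid_1 <- cars$dist[1] - predict(fit, newdata = cars[1, ])

# 99% intervals at a chosen x value
new_x   <- data.frame(speed = 15)
ci_mean <- predict(fit, newdata = new_x, interval = "confidence", level = 0.99)
pi_new  <- predict(fit, newdata = new_x, interval = "prediction", level = 0.99)
ci_mean                                 # interval for the mean response
pi_new                                  # interval for one new observation
```

A positive residual means the observed value sits above the fitted line, so the model underestimates that observation. The prediction interval is always wider than the confidence interval at the same x because it also accounts for the variability of an individual observation around the mean.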

Consider Species

13. In the HW8 template, we include code to run a model that includes bill depth and species. We start by allowing the slopes to vary by species (using an asterisk between bill depth and species). Since there are three species, the first species (Adelie) will be included in the intercept term and the other two species will have their own coefficients. You should see that both additional interaction terms are statistically significant. What does this mean?

14. Write out the equation for the model for Gentoo penguins, rounding all coefficients to two decimal places.

15. Calculate the residual for observation 194 (see Problem 10) based on the new model. Based on the residual, does this new model do a better job of predicting the bill length for the Gentoo penguin in observation 194?
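
To see how a species-specific equation is assembled from an interaction model like the one in Problems 13 and 14, here is a minimal sketch. It assumes the built-in iris data (three species, with setosa as the baseline level) instead of the penguin data, and the object names are hypothetical.

```r
# Illustration only: built-in `iris` data, not the penguin data.
# The asterisk fits main effects plus the Sepal.Width-by-Species interaction.
fit_int <- lm(Sepal.Length ~ Sepal.Width * Species, data = iris)
b <- coef(fit_int)
names(b)   # baseline terms plus intercept and slope shifts for the other species

# The baseline species (setosa) uses the unadjusted intercept and slope.
# Any other species adds its own intercept shift and its own slope shift.
int_virginica   <- b["(Intercept)"] + b["Speciesvirginica"]
slope_virginica <- b["Sepal.Width"] + b["Sepal.Width:Speciesvirginica"]
round(c(intercept = unname(int_virginica), slope = unname(slope_virginica)), 2)

# Check: the same line results from fitting virginica on its own.
fit_v <- lm(Sepal.Length ~ Sepal.Width,
            data = subset(iris, Species == "virginica"))
all.equal(unname(coef(fit_v)), unname(c(int_virginica, slope_virginica)))

# Residual for a single observation under the interaction model (row 101 here).
resid(fit_int)[101]
```

Comparing the absolute value of a residual under this model with the residual from the single-line model is one way to judge which model predicts a given observation more closely.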

Inference for Regression: Using R

Lesson 8, Lecture 2

Recap and Going Forward

• In Lecture 1, we covered inference for the regression slope, confidence intervals for mean values of 𝑦 given 𝑥, and prediction intervals for new individual values of 𝑦 given 𝑥.

• In this lecture, we will show you how to use R for linear regression inference.

Stat 311 – Cardoso

Berkeley Study Data

Data from the Berkeley guidance study of children born in 1928-29 in Berkeley, CA. Built-in data from the alr4 package.

𝑟 = 0.64

Fit a Regression Model for Height on Weight

lm.out |t|)

(Intercept)   70.142   1.849   37.937
