Exam 3 Practice Questions

1. True or false: The deterministic model is more practically useful.

2. Epsilon is:

a. What the p-value is called in multiple regression

b. The Greek symbol representing the predictor variable

c. The error term

d. Used in simple linear regression but not multiple linear regression

3. True or false: when we add at least one additional predictor variable to a simple linear regression model, it is called multiple linear regression.

4. True or false: a t-test is the statistic used to determine if a simple linear regression equation is statistically useful.

5. True or false: we first look at the results of a global F-test to determine if a multiple regression equation is statistically useful.

6. How can we best determine if a model is practically useful:

a. The R² is high

b. 2s is low

c. The prediction interval is narrow

d. All of the above

7. When do we use a global F?

a. To test the interaction term in a multiple regression

b. To test the quadratic term

c. To test the levels of a qualitative variable

d. To test the overall fit of a model with multiple predictors

8. True or false: The dependent variable is the predictor variable.

9. If the p-value associated with the t-statistic in a hypothesis test for a simple linear regression is below alpha, it means:

a. The model is statistically useful

b. The model is practically useful

c. This is the best model

d. This model works very well

10. If the relationship between the temperature and crowd levels at the beach depends on whether it is a weekday or a weekend, there is:

a. A curvilinear relationship

b. An interaction

c. A quadratic model

d. A causal relationship

11. If the relationship between temperature and crowds at the beach is stronger the higher the temperature gets, there is:

a. A curvilinear relationship

b. An interaction

c. A causal relationship

d. A qualitative variable

12. To create the quadratic term, we need to:

a. Square the variable

b. Multiply the variable by itself

c. Divide the variable by itself

d. Both A and B are correct

13. True or false: After we find that a model containing two predictors and the interaction between them is statistically useful, the next step is to run a global F test to see if the interaction term is useful.

14. The y-intercept can be practically interpreted when:

a. It makes sense for x to equal zero

b. An x of zero is within the range of your data set

c. Either A or B

d. Both A and B

15. True or false: a quadratic term and an interaction term are the same thing.

16. True or false: in multiple linear regression, the goal is to use two or more variables to predict the dependent variable.

17. Which of the following fit indices would indicate the best-fitting model:

a. An R² of .767 and a standard deviation of 289

b. An R² of .622 and a standard deviation of 310

c. An R² of .799 and a standard deviation of 270

d. An R² of .699 and a standard deviation of 320

18. If we have a qualitative variable with 3 dummy variables, which of the following could be the qualitative variable:

a. Season of the year

b. Month of the year

c. Days of the week

d. Grade in elementary school

19. If we can reject the null hypothesis for the global F test for a model containing two predictors and their interaction, the next step is to:

a. Divide the p-value by 2

b. Test each of the predictors to see which work

c. Stop. You are done.

d. Look at the t-test for the interaction

20. Studies show that the amount of money people earn is correlated with a higher degree of happiness but that the strength of this relationship weakens as income increases. This is an example of a(n):

a. Downward curvature

b. Upward curvature

c. Negative interaction

d. Positive interaction

ANSWERS ON NEXT PAGE

1. True or false: The deterministic model is more practically useful.

FALSE

2. Epsilon is:

a. What the p-value is called in multiple regression

b. The Greek symbol representing the predictor variable

c. The error term

d. Used in simple linear regression but not multiple linear regression

3. True or false: when we add at least one additional predictor variable to a simple linear regression model, it is called multiple linear regression.

TRUE

4. True or false: a t-test is the statistic used to determine if a simple linear regression equation is statistically useful.

TRUE

5. True or false: we look at the results of a global F-test to determine if a multiple regression equation is statistically useful.

TRUE

6. How can we best determine if a model is practically useful:

a. The R² is high

b. 2s is low

c. The prediction interval is narrow

d. All of the above

7. When do we use a global F?

a. To test the interaction term in a multiple regression

b. To test the quadratic term

c. To test the levels of a qualitative variable

d. To test the overall fit of a model with multiple predictors

8. True or false: The dependent variable is the predictor variable.

FALSE

9. If the p-value associated with the t-statistic in a hypothesis test for a simple linear regression is below alpha, it means:

a. The model is statistically useful

b. The model is practically useful

c. This is the best model

d. This model works very well

10. If the relationship between the temperature and crowd levels at the beach depends on whether it is a weekday or a weekend, there is:

a. A curvilinear relationship

b. An interaction

c. A quadratic model

d. A causal relationship

11. If the relationship between temperature and crowds at the beach is stronger the higher the temperature gets, there is:

a. A curvilinear relationship

b. An interaction

c. A causal relationship

d. A qualitative variable

12. To create the quadratic term, we need to:

a. Square the variable

b. Multiply the variable by itself

c. Divide the variable by itself

d. Both A and B are correct

13. True or false: After we find that a model containing two predictors and the interaction between them is statistically useful, the next step is to run a global F test to see if the interaction term is useful.

FALSE

14. The y-intercept can be practically interpreted when:

a. It makes sense for x to equal zero

b. An x of zero is within the range of your data set

c. Either A or B

d. Both A and B

15. True or false: a quadratic term and an interaction term are the same thing.

FALSE

16. True or false: in multiple linear regression, the goal is to use two or more variables to predict the dependent variable.

TRUE

17. Which of the following fit indices would indicate the best-fitting model:

a. An R² of .767 and a standard deviation of 289

b. An R² of .622 and a standard deviation of 310

c. An R² of .799 and a standard deviation of 270

d. An R² of .699 and a standard deviation of 320

18. If we have a qualitative variable with 3 dummy variables, which of the following could be the qualitative variable:

a. Season of the year

b. Month of the year

c. Days of the week

d. Grade in elementary school

19. If we can reject the null hypothesis for the global F test for a model containing two predictors and their interaction, the next step is to:

a. Divide the p-value by 2

b. Test each of the predictors to see which work

c. Stop. You are done.

d. Look at the t-test for the interaction

20. Studies show that the amount of money people earn is correlated with a higher degree of happiness but that the strength of this relationship weakens as income increases. This is an example of a(n):

a. Downward curvature

b. Upward curvature

c. Negative interaction

d. Positive interaction

Lecture 18: Exam 3 Review

Linear Regression

Regression Goal: Predict the value of one quantitative (QN) variable from values of related variables.

Think inputs and outputs!

Dependent variable (DV): QN variable to be predicted (y)

Independent variables (IVs): predictor variables (x1, x2, …)

Experimental unit: object upon which measurements (y, x) are taken

Linear Regression

“Linear” → use a straight-line model to relate y to x

Two types of linear models:

Deterministic: y = β0 + β1x

β0 = y-intercept

β1 = slope

Probabilistic: y = β0 + β1x + ε

ε = random error

Steps to follow in regression

1) Hypothesize the model: E(y) = β0 + β1x……

2) Assumptions on random error

3) Collect data; estimate betas

Yields prediction equation: ŷ = β̂0 + β̂1x

e.g., ŷ = 10 + 2x

4) Test model utility (Is model useful for predicting y?)

5) If yes, use model for prediction/inferences

Note: Steps 2 and 3 are interchangeable
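Step 3 above (estimating the betas) can be sketched with the usual least-squares formulas for a straight line. This is only an illustration: the function name fit_line and the five data points are invented here and do not come from the lecture.

```python
def fit_line(xs, ys):
    """Return least-squares estimates (b0, b1) for the model y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ss_xy / ss_xx          # estimated slope
    b0 = y_bar - b1 * x_bar     # estimated y-intercept
    return b0, b1


# Hypothetical data, invented for illustration only
xs = [1, 2, 3, 4, 5]
ys = [12.1, 13.9, 16.2, 17.8, 20.0]

b0, b1 = fit_line(xs, ys)
print(f"y-hat = {b0:.2f} + {b1:.2f}x")
```

The resulting equation is close to the slide's example ŷ = 10 + 2x because the made-up points were chosen to scatter around that line.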

Multiple Regression

Multiple independent variables (IVs): x1, x2, x3, x4, … , xk

Independent variables can be quantitative or qualitative

Model: E(y) = β0 + β1x1 + β2x2 + … + βkxk

Example 12.1: (p. 687): Predict auction price of grandfather (GF) clock

Dependent Variable

y = Auction Price (Experimental unit = a single clock)- QN

Independent Variables

x1 = Age of clock (in years) – QN

x2 = Number of bidders on the clock – QN

Multiple Regression

Theory: Price increases linearly with Age & # Bidders

Step 1: E(y) = β0 + β1x1 + β2x2 (1st-order model)

Model proposes 2 straight-lines:

(1) relating y to x1 (with slope β1)

(2) relating y to x2 (with slope β2)

[Figure: two straight-line graphs of y, one against Age (x1) with slope β1 and one against Bidders (x2) with slope β2]

STATISTIX software results:

Least Squares Linear Regression of PRICE

Predictor
Variables    Coefficient    Std Error        T         P     VIF
Constant        -1338.95      173.809    -7.70    0.0000     0.0
AGE              12.7406      0.90474    14.08    0.0000     1.1
NUMBIDS          85.9530      8.72852     9.85    0.0000     1.1

R²             0.8923     Mean Square Error (MSE)    17818.2
Adjusted R²    0.8849     Standard Deviation         133.485
AICc           319.55
PRESS          646070

Source        DF         SS         MS         F         P
Regression     2    4283063    2141531    120.19    0.0000
Residual      29     516727    17818.2
Total         31    4799790

Cases Included 32     Missing Cases 0

Least Squares prediction equation:

ŷ = -1339 + 12.74x1 + 85.95x2
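As a quick check, the prediction equation can be evaluated directly. The sketch below uses the unrounded coefficients from the STATISTIX printout; the function name predict_price is made up for this illustration.

```python
def predict_price(age, bidders):
    """First-order prediction equation for auction price (coefficients from the printout)."""
    return -1338.95 + 12.7406 * age + 85.9530 * bidders


# A 150-year-old clock with 5 bidders
price = predict_price(150, 5)
print(round(price, 1))

# Raising Age by 1 year with Number of bidders held fixed changes
# the prediction by the AGE coefficient
age_effect = predict_price(151, 5) - predict_price(150, 5)
print(round(age_effect, 4))
```

The first value agrees, after rounding, with the predicted value of 1001.9 that the software reports for AGE=150 and NUMBIDS=5.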

MR Example

Interpreting estimated betas: E(y) = β0 + β1x1 + β2x2

β̂1 = 12.74:

For every 1-year increase in Age (x1), we estimate Price (y) to increase $12.74, holding Number of bidders (x2) fixed (i.e., after accounting for x2)

β̂2 = 85.95:

For every 1-bidder increase in Number of bidders (x2), we estimate Price (y) to increase $85.95, holding Age (x1) fixed (i.e., after accounting for x1)

Is the model statistically useful?

❖Conduct “global” F-test

E(y) = β0 + β1x1 + β2x2

Test: H0: β1 = β2 = 0 (model is not useful)

Ha: At least 1 β ≠ 0 (model is “statistically useful”)

Test statistic: F = 120.19 (from the printout)

P-value: p = 0.0000 (from the printout)

❖Individual t-tests for x1 and x2

Is the model practically useful?

Look at R² = .892 (SX printout)

Interpretation:

89.2% of the sample variation in auction prices (y) can be explained by the 1st-order model with Age (x1) and Number of Bidders (x2).

Look at 2s = 2(133.5) = 267 (SX printout)

Interpretation:

About 95% of the sampled auction prices will fall within $267 of their predicted values using the 1st-order model with Age (x1) and Number of Bidders (x2).
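The F statistic, R², adjusted R², and 2s quoted above can all be recomputed from the ANOVA portion of the printout. The sketch below does that arithmetic with the standard formulas; the variable names are chosen here for illustration.

```python
import math

# Values read from the STATISTIX ANOVA table for the first-order model
n = 32                    # cases included
k = 2                     # number of predictors (AGE, NUMBIDS)
ss_regression = 4283063
ss_residual = 516727
ss_total = ss_regression + ss_residual

ms_regression = ss_regression / k
mse = ss_residual / (n - k - 1)          # mean square error, df = 29

f_stat = ms_regression / mse             # global F statistic
r2 = ss_regression / ss_total            # coefficient of determination
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
s = math.sqrt(mse)                       # model standard deviation

print(f"F = {f_stat:.2f}, R2 = {r2:.4f}, adj R2 = {adj_r2:.4f}, 2s = {2 * s:.0f}")
```

Working through the table this way shows where each number on the printout comes from: F compares the two mean squares, and s is the square root of MSE.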

Numerical Measures of Model Fit

1) Coefficient of Determination (R²)

2) Coefficient of Correlation (r)

Coefficient of Determination – measures percentage of variation in y “explained” by the model

Coefficient of Correlation – measures the strength and the direction of the linear relationship between y and x

Final Step: Using the model

1) Predict Price (y) for a GF clock with …

Age(x1)=150 years and # Bidders (x2) =5 bidders

2) Estimate mean Price for all GF clocks with …

Age(x1)=150 years and # Bidders (x2) =5 bidders

STATISTIX software results:

Predicted/Fitted Values of PRICE

Lower Predicted Bound     713.61     Lower Fitted Bound    909.29
Predicted Value           1001.9     Fitted Value          1001.9
Upper Predicted Bound     1290.2     Upper Fitted Bound    1094.5
SE (Predicted Value)      140.95     SE (Fitted Value)     45.279

Unusualness (Leverage)    0.1151
Percent Coverage          95
Corresponding T           2.05

Predictor Values: AGE=150.00, NUMBIDS=5.0000

MR Step 5

95% PI for y: (714, 1290)

Interpretation: We are 95% confident that the price of a single 150-year-old GF clock with 5 bidders will fall between $714 & $1,290.

95% CI for E(y): (909, 1095)

Interpretation: We are 95% confident that the average price of all 150-year-old GF clocks with 5 bidders will fall between $909 & $1,095.
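Both intervals have the form "fitted value ± t × standard error"; the prediction interval is wider because its standard error also includes the model standard deviation s. The sketch below rebuilds the two intervals from the printout values. The critical value 2.045 (t for 95% confidence with 29 degrees of freedom) is an assumption supplied here; the printout shows it rounded to 2.05.

```python
import math

s = 133.485        # model standard deviation (first-order model printout)
fitted = 1001.9    # fitted/predicted value at AGE=150, NUMBIDS=5
se_fit = 45.279    # SE (Fitted Value) from the printout
t_crit = 2.045     # assumed: t critical value, 95% confidence, 29 df

# Standard error for predicting a single new y combines both sources of error
se_pred = math.sqrt(s ** 2 + se_fit ** 2)

pi = (fitted - t_crit * se_pred, fitted + t_crit * se_pred)   # 95% PI for y
ci = (fitted - t_crit * se_fit, fitted + t_crit * se_fit)     # 95% CI for E(y)

print(f"SE(pred) = {se_pred:.2f}")
print(f"95% PI: ({pi[0]:.0f}, {pi[1]:.0f})")
print(f"95% CI: ({ci[0]:.0f}, {ci[1]:.0f})")
```

Small differences from the printed bounds are rounding in the inputs.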

Interaction Model

• The relationship between y and x1 depends on x2

• The relationship between y and x2 depends on x1

E(y) = β0 + β1x1 + β2x2 + β3x1x2

Slope of y vs. x1 line = (β1 + β3x2)

Slope of y vs. x2 line = (β2 + β3x1)

No interaction:

• parallel

• same slope

Interaction:

• Non-parallel

• different slopes

STATISTIX software results:

Least Squares Linear Regression of PRICE

Predictor
Variables    Coefficient    Std Error        T         P     VIF
Constant         320.458      295.141     1.09    0.2868     0.0
AGE              0.87814      2.03216     0.43    0.6690    12.2
NUMBIDS         -93.2648      29.8916    -3.12    0.0042    28.3
AGEBIDS          1.29785      0.21233     6.11    0.0000    30.5

R²             0.9539     Mean Square Error (MSE)    7905.79
Adjusted R²    0.9489     Standard Deviation         88.9145
AICc           295.25
PRESS          288487

Source        DF         SS         MS         F         P
Regression     3    4578427    1526142    193.04    0.0000
Residual      28     221362    7905.79
Total         31    4799789

Cases Included 32     Missing Cases 0

Least Squares prediction equation:

ŷ = 321 + .878x1 - 93.27x2 + 1.298x1x2
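With an interaction term, the slope of Price against Age is no longer one number; it is β1 + β3x2 and so changes with the number of bidders. The sketch below evaluates that slope with the unrounded printout estimates. The function names are illustrative, not part of the lecture.

```python
def predict_price(age, bidders):
    """Interaction-model prediction equation (coefficients from the printout)."""
    return 320.458 + 0.87814 * age - 93.2648 * bidders + 1.29785 * age * bidders


def slope_vs_age(bidders):
    """Estimated change in price per extra year of age, at a fixed number of bidders."""
    return 0.87814 + 1.29785 * bidders


slope_5 = slope_vs_age(5)     # slope of the price-vs-age line with 5 bidders
slope_15 = slope_vs_age(15)   # slope of the price-vs-age line with 15 bidders
print(round(slope_5, 2), round(slope_15, 2))

# The same slope appears as a one-year difference in predictions
diff = predict_price(151, 5) - predict_price(150, 5)
print(round(diff, 2))
```

The slope is steeper with more bidders, which is the non-parallel-lines picture of a positive interaction.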

Statistically Useful?

Model: E(y) = β0 + β1x1 + β2x2 + β3x1x2

Test H0: β1 = β2 = β3 = 0 (nothing in the model works)

Ha: At least one βi is not 0 (something works)

Global F = 193, p-value ≈ 0

Conclusion: α = .05 > p-value ≈ 0 → Reject H0

Test H0: β3 = 0 (no interaction)

Ha: β3 > 0 (positive interaction; slope increases)

t-value = 6.11, p-value = 0.0000/2 ≈ 0

Conclusion: α = .05 > p-value ≈ 0 → Reject H0
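The printout reports two-tailed p-values, so a one-tailed test halves the printed value, provided the sign of the t statistic agrees with the direction in Ha. A minimal sketch of that rule follows; the function name and the second example's numbers are hypothetical.

```python
def one_tailed_p(t_stat, two_tailed_p, alternative):
    """Convert a two-tailed p-value to a one-tailed p-value.

    alternative: "greater" for Ha: beta > 0, "less" for Ha: beta < 0.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    agrees = t_stat > 0 if alternative == "greater" else t_stat < 0
    half = two_tailed_p / 2
    # If the estimate points the opposite way from Ha, the evidence is weak
    return half if agrees else 1 - half


# Interaction term from the printout: t = 6.11, printed p = 0.0000
p_interaction = one_tailed_p(6.11, 0.0000, "greater")

# Hypothetical values, for illustration only
p_example = one_tailed_p(2.10, 0.04, "greater")
p_wrong_direction = one_tailed_p(-2.10, 0.04, "greater")

print(p_interaction, p_example, p_wrong_direction)
```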

Interaction Model

Caveat #1: Avoid interpreting other t-tests

Caveat #2: Be careful when interpreting beta estimates (just don’t do it)

Adjusted R² = .949 (goes up with the interaction included)

94.9% of the sample variation in auction prices (y) can be explained by the interaction model with Age (x1) and Number of Bidders (x2).

2s = 2(88.91) ≈ 178 (goes down with the interaction included)

About 95% of the sampled auction prices will fall within $178 of their predicted values using the interaction model with Age (x1) and Number of Bidders (x2).

Quadratic (Curvilinear) Model

E(y) = β0 + β1x + β2x² (2nd-order model)

Graphs as a “quadratic” or curve relating y to x

Quadratic Model

Experimental unit = home (sample n = 15 homes)

Dependent Variable: Monthly Electric Usage (QN)

Independent Variable: Size of Home (QN)

Quadratic Model

Theory: Rate of increase of usage (y) with size (x) is slower for larger homes

[Figure: Scatter Plot of USAGE vs SIZE; USAGE axis from 1100 to 2100, SIZE axis from 1200 to 3600]

STATISTIX software results:

Least Squares Linear Regression of USAGE

Predictor
Variables    Coefficient    Std Error         T         P     VIF
Constant        -806.717      166.872     -4.83    0.0004     0.0
SIZE             1.96162      0.15252     12.86    0.0000    74.2
SIZESQ        -3.404E-04    3.212E-05    -10.60    0.0000    74.2

R²             0.9773     Mean Square Error (MSE)    2520.02
Adjusted R²    0.9735     Standard Deviation         50.1998
AICc           126.13
PRESS          56695

Source        DF         SS         MS         F         P
Regression     2    1300900     650450    258.11    0.0000
Residual      12      30240    2520.02
Total         14    1331140

Cases Included 15     Missing Cases 0

Quadratic Model

Estimate betas: E(y) = β0 + β1x + β2x²

β̂0 = -806.7, β̂1 = 1.96, β̂2 = -.00034

Interpretations:

β0: y-intercept of curve

No practical interpretation since Size (x) = 0 is nonsensical

β1: not a slope, but a shift parameter

Shifts parabola right or left along the x-axis; no practical interpretation

β2: Rate of curvature; the larger the magnitude, the faster the rate; negative sign indicates downward curvature
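One way to see the downward curvature is to compute the rate of change of the fitted curve, β̂1 + 2β̂2x, at a few home sizes: with a negative β̂2 it shrinks as size grows. The sketch below uses the printout coefficients; the function names and the chosen sizes (all inside the plotted SIZE range) are illustrative.

```python
B0, B1, B2 = -806.717, 1.96162, -3.404e-04   # estimates from the printout


def predict_usage(size):
    """Quadratic prediction equation for monthly electric usage."""
    return B0 + B1 * size + B2 * size ** 2


def rate_of_change(size):
    """Instantaneous slope of the fitted curve at a given home size."""
    return B1 + 2 * B2 * size


usage_2000 = predict_usage(2000)
rate_small = rate_of_change(1500)   # slope for a smaller home
rate_large = rate_of_change(2500)   # slope for a larger home

print(round(usage_2000, 1), round(rate_small, 3), round(rate_large, 3))
```

The slope is still positive at both sizes but is smaller for the larger home, matching the theory that usage rises more slowly as homes get bigger.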

Quadratic Model

Model: E(y) = β0 + β1x + β2x²

Test: H0: β1 = β2 = 0

Ha: At least one β is not zero

Global-F Test Statistic = 258, p-value ≈ 0

Conclusion: α = .05 > p-value ≈ 0 → Reject H0

Test: H0: β2 = 0 (no curvature)

Ha: β2 < 0 (downward curvature)

t-value = -10.60, p-value = 0.0000/2 ≈ 0

Conclusion: α = .05 > p-value ≈ 0 → Reject H0

Quadratic Model

Caveat #1: Avoid interpreting other t-tests

Caveat #2: Be careful when predicting y

Avoid “extrapolation” – selecting x outside range of the sample

Adjusted R² = .973

97.3% of the sample variation in Usage (y) values can be explained by the quadratic model with Size (x)

2s = 2(50.2) ≈ 100

About 95% of the sampled Usage (y) values will fall within 100 kilowatt-hours of their predicted values using the quadratic model with Size (x)

Modeling Qualitative Data (2 levels)

Discrimination in the workplace example:

Model Salary of USF professor based on Gender (M,F)

Dependent Var: y=salary; Independent Var: Gender (M,F)

Experimental Unit = a single USF professor

QL Var with 2 levels: Model using x = {1 if Female, 0 if Male}

– referred to as a “dummy” variable

– the value assigned zero is called the “base” level

Model: E(y) = β0 + β1x

-where E(y) is the mean salary

Modeling Qualitative Data (2 levels)

Interpreting betas in dummy variable model:

E(y) = β0 + β1x , where x = {1 if Female, 0 if Male}

x=0: E(y) = β0 = Mean salary for Males (𝝁𝑴 )

x=1: E(y) = β0 + β1 = Mean salary for Females (𝝁𝑭 )

β1 = 𝝁𝑭 – 𝝁𝑴 = Difference between mean salary of

Females and mean salary of Males
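The interpretation above can be checked numerically: fitting a line to a 0/1 dummy variable returns the base-level mean as the intercept and the difference in group means as the slope. The salaries below are hypothetical numbers invented for this sketch, and fit_line is an illustrative helper using the standard least-squares formulas.

```python
def fit_line(xs, ys):
    """Return least-squares estimates (b0, b1) for the model y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ss_xy / ss_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical salaries in $1000s, invented for illustration only
male_salaries = [80, 90, 100]
female_salaries = [70, 80, 90]

# Dummy coding: x = 0 for Male (base level), x = 1 for Female
xs = [0] * len(male_salaries) + [1] * len(female_salaries)
ys = male_salaries + female_salaries

b0, b1 = fit_line(xs, ys)
mean_m = sum(male_salaries) / len(male_salaries)
mean_f = sum(female_salaries) / len(female_salaries)

print(b0, b1)                  # intercept and slope from the regression
print(mean_m, mean_f - mean_m) # base-level mean and difference in means
```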

Test to Conduct:

H0: β1 = 0 (μF = μM; no discrimination)

Ha: β1 < 0 (μF < μM)
