PS 5841 Ashford University MSE Bias Variance of Statistical Learning Questions

ACTU PS5841 Data Science in Finance and Insurance – Autumn 2019
Dr. Yubo Wang
Assignment-1
Assigned 9/5/19, Due 9/17/19 (Tue)
Problem 1. Statistical Learning
Suppose the observed data are generated by
$y = 1 + 2x + \epsilon$,
$x \in [-50, 50]$,
$\epsilon \sim N(\mu = 0, \sigma^2 = 10^2)$
Using your preferred data analysis tool (a spreadsheet can be useful to many at this stage), demonstrate
numerically that a simple linear regression model $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ is able to learn.
[a] Specifically, using a test set of size 100 and training sets of various sizes (30, 100, 200, 300),
numerically estimate the corresponding expected test MSE and complete the following table.
| Training set size | Expected test MSE |
| 30                |                   |
| 100               |                   |
| 200               |                   |
| 300               |                   |
[b] Please also provide a plot of the expected test MSE against the training set size.
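For readers working outside a spreadsheet, one possible way to set up the numerical experiment in [a] is sketched here in Python with NumPy. The sketch rests on assumptions the problem does not state: x is drawn uniformly on [-50, 50], a fresh test set of size 100 is drawn for every training set, and the expectation is approximated by averaging over 2000 replications. The function name `simulate_expected_test_mse` and its parameters are illustrative only.

```python
import numpy as np

def simulate_expected_test_mse(n_train, n_test=100, n_reps=2000, seed=0):
    """Monte Carlo estimate of the expected test MSE of simple linear regression."""
    rng = np.random.default_rng(seed)
    mse = np.empty(n_reps)
    for r in range(n_reps):
        # Draw a training set and a test set from y = 1 + 2x + eps, eps ~ N(0, 10^2).
        # Assumption: x is uniform on [-50, 50].
        x_tr = rng.uniform(-50, 50, n_train)
        y_tr = 1 + 2 * x_tr + rng.normal(0, 10, n_train)
        x_te = rng.uniform(-50, 50, n_test)
        y_te = 1 + 2 * x_te + rng.normal(0, 10, n_test)
        # Least-squares fit; for deg=1, polyfit returns [slope, intercept].
        b1, b0 = np.polyfit(x_tr, y_tr, 1)
        # Test MSE of this fitted model on this test set.
        mse[r] = np.mean((y_te - (b0 + b1 * x_te)) ** 2)
    # Averaging over replications approximates the expected test MSE.
    return mse.mean()

results = {n: simulate_expected_test_mse(n) for n in (30, 100, 200, 300)}
for n, m in results.items():
    print(f"{n:>4d}  {m:8.2f}")
```

Under these assumptions the estimates should fall toward the noise variance of 100 as the training set grows, which is the sense in which the model "learns". Plotting `results` against the training set size gives one version of the graph requested in [b].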
Problem 2. Bias-Variance Trade-off
Suppose the observed data are generated by
$y = \frac{x}{\sqrt{1 + x^2}} + \epsilon$,
$x \in [-25, 25]$,
$\epsilon \sim N(\mu = 0, \sigma^2 = 0.5^2)$
Suppose you use polynomial regressions $\hat{y} = \sum_{i=0}^{n} \hat{\beta}_i x^i$, $n = 1, 2, \ldots, 6$, to learn from data and make
predictions.
Using your preferred data analysis tool (a spreadsheet can be useful to many at this stage), numerically
demonstrate the trade-off between bias and variance.
Specifically, use 300 training sets and test the fitted models on the test set associated with $x = -20, -10, 0, 10, 20$.
[a] Please complete the following table with your estimates to demonstrate that the variance-bias
decomposition roughly holds for each model.
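| degree n | Expected test MSE | squared bias | variance | variance of error term | LHS - RHS |
| 1        |                   |              |          |                        |           |
| 2        |                   |              |          |                        |           |
| 3        |                   |              |          |                        |           |
| 4        |                   |              |          |                        |           |
| 5        |                   |              |          |                        |           |
| 6        |                   |              |          |                        |           |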
[b] Please provide a graph based on your estimates demonstrating the bias-variance trade-off.
Please see the notes on the linear model and on Excel below.
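A rough sketch of how the table in [a] might be estimated in Python with NumPy follows. Several choices here are assumptions, not part of the problem: each training set has 100 points, x is drawn uniformly on [-25, 25], and each test response is the true function value plus fresh noise. The decomposition being checked is expected test MSE (LHS) against squared bias + variance + variance of the error term (RHS). The names `f`, `rows`, `N_TRAIN`, and so on are illustrative.

```python
import numpy as np
from numpy.polynomial import Polynomial

def f(x):
    """True regression function x / sqrt(1 + x^2)."""
    return x / np.sqrt(1 + x**2)

rng = np.random.default_rng(0)
SIGMA = 0.5                      # standard deviation of the error term
N_SETS, N_TRAIN = 300, 100       # 300 training sets; size 100 is an assumption
x_test = np.array([-20.0, -10.0, 0.0, 10.0, 20.0])

rows = []
for degree in range(1, 7):
    preds = np.empty((N_SETS, x_test.size))
    y_test = np.empty((N_SETS, x_test.size))
    for s in range(N_SETS):
        # Assumption: training x values are uniform on [-25, 25].
        x = rng.uniform(-25, 25, N_TRAIN)
        y = f(x) + rng.normal(0, SIGMA, N_TRAIN)
        # Polynomial.fit rescales x to [-1, 1] internally, which keeps
        # the higher-degree least-squares fits well conditioned.
        model = Polynomial.fit(x, y, degree, domain=[-25, 25])
        preds[s] = model(x_test)
        # Noisy test responses at the five test points.
        y_test[s] = f(x_test) + rng.normal(0, SIGMA, x_test.size)
    mse = np.mean((y_test - preds) ** 2)                      # LHS
    bias2 = np.mean((preds.mean(axis=0) - f(x_test)) ** 2)    # squared bias
    variance = np.mean(preds.var(axis=0))                     # variance of predictions
    gap = mse - (bias2 + variance + SIGMA**2)                 # LHS - RHS
    rows.append((degree, mse, bias2, variance, SIGMA**2, gap))

for row in rows:
    print("{:d}  {:.4f}  {:.4f}  {:.4f}  {:.4f}  {:+.4f}".format(*row))
```

Squared bias and variance are averaged over the five test points here; reporting them per test point is an equally reasonable reading of the problem. Plotting the squared bias and variance columns of `rows` against the degree is one way to produce the graph in [b].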
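Notes on linear model
For a linear model $\hat{\boldsymbol{y}} = \hat{\beta}_0 + \boldsymbol{x}^T \hat{\boldsymbol{\beta}}$, the coefficients based on least squares estimation are
$\hat{\boldsymbol{\beta}} = (\boldsymbol{X}^T \boldsymbol{X})^{-1} \boldsymbol{X}^T \boldsymbol{y}$
where $\hat{\boldsymbol{\beta}} = (\hat{\beta}_0, \hat{\boldsymbol{\beta}}^T)^T$, $\boldsymbol{X} = (\boldsymbol{1}, \boldsymbol{x}_1, \ldots, \boldsymbol{x}_p)$ where $\boldsymbol{x}_j = (x_{1j}, \ldots, x_{nj})^T$, and $\boldsymbol{y} = (y_1, \ldots, y_n)^T$.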
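As an illustration of the least squares formula, the following minimal Python sketch builds a design matrix with a leading column of ones and solves the normal equations. The simulated data (200 points from y = 1 + 2x + noise, with x assumed uniform on [-50, 50]) and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-50, 50, n)              # assumption: uniform predictor values
y = 1 + 2 * x + rng.normal(0, 10, n)

# Design matrix X = (1, x): a column of ones for the intercept, then the predictor.
X = np.column_stack([np.ones(n), x])

# beta_hat = (X^T X)^{-1} X^T y, obtained by solving the normal equations
# (X^T X) beta = X^T y instead of forming the inverse explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's general least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)    # [intercept estimate, slope estimate]
```

Solving the linear system gives the same coefficients as the explicit inverse in the formula and is usually the numerically safer choice.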
Notes on Excel
Transposition $A^T$ = TRANSPOSE(A)
Matrix multiplication $AB$ = MMULT(A, B)
Inverse matrix $A^{-1}$ = MINVERSE(A)
RAND() returns a number randomly sampled from [0, 1).
NORM.INV(probability, mean, stddev) returns the inverse of the normal cumulative distribution for
the specified mean and standard deviation.
Data->What-if analysis->Data Table is a convenient tool for automating repetitive tasks.
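One common way to combine the two functions above is NORM.INV(RAND(), mean, stddev), which draws a normal random number by inverse-transform sampling. A minimal Python analogue using only the standard library is sketched here; the helper name `norm_inv_rand` is hypothetical, and the mean of 0 and standard deviation of 10 are chosen only to match the error term in Problem 1.

```python
import random
from statistics import NormalDist, fmean, pstdev

def norm_inv_rand(mean, stddev, rng):
    """Mimic NORM.INV(RAND(), mean, stddev) via inverse-transform sampling."""
    p = rng.random()              # uniform on [0, 1), like RAND()
    while p == 0.0:               # inv_cdf requires 0 < p < 1
        p = rng.random()
    # Inverse of the normal cumulative distribution evaluated at p.
    return NormalDist(mean, stddev).inv_cdf(p)

rng = random.Random(0)
draws = [norm_inv_rand(0.0, 10.0, rng) for _ in range(10_000)]
sample_mean, sample_sd = fmean(draws), pstdev(draws)
print(f"mean = {sample_mean:.3f}, sd = {sample_sd:.3f}")
```

With this many draws, the sample mean and standard deviation should land close to the requested 0 and 10.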
[Figure: Bias vs Variance. Prediction error against model complexity (low to high) for the training sample and the test sample. Low complexity corresponds to high bias and low variance; high complexity corresponds to low bias and high variance.]
