2. The cost of unemployment is a major issue at both the state and federal levels. The cost is driven not only by the number of unemployed people but also by how long each person remains unemployed. The longer a person is unemployed, the higher the cost to governments and businesses. To examine the factors driving the length of unemployment in the manufacturing sector, data were collected on the number of weeks a person was unemployed due to a layoff, along with the independent variables listed below.
Educ: The number of years of education.
Married: An indicator of whether (Y) or not (N) a person is married.
Head: An indicator of whether (Y) or not (N) the person is the head of a household.
Tenure: The number of years on the most recent job.
Manager: An indicator of whether (Y) or not (N) the person was in a management position.
Sales: An indicator of whether (Y) or not (N) the person was in a sales occupation.
Age: The current age (in years) of the individual worker.
The table shown below contains data for 50 displaced workers. The data set is also on Canvas (Exam folder) in Minitab format (LAYOFFS.MTW), so you don’t have to re-enter the data.
(a) Develop an estimated regression model to predict length of unemployment using all of the variables.
Regression Analysis: Weeks versus Age, Educ, ...
The regression equation is
Weeks = 22.9 + 1.51 Age - 0.613 Educ - 10.7 Married - 19.8 Head + 0.426 Tenure
- 26.7 Manager - 18.6 Sales
Predictor      Coef  SE Coef      T      P
Constant      22.85    18.87   1.21  0.233
Age          1.5093   0.3040   4.96  0.000
Educ        -0.6133   0.9362  -0.66  0.516
Married     -10.743    6.012  -1.79  0.081
Head        -19.779    5.837  -3.39  0.002
Tenure       0.4265   0.4669   0.91  0.366
Manager     -26.742    8.326  -3.21  0.003
Sales       -18.561    6.281  -2.96  0.005
S = 16.3497 R-Sq = 59.1% R-Sq(adj) = 52.3%
Analysis of Variance
Source           DF        SS       MS      F      P
Regression        7   16250.4   2321.5   8.68  0.000
Residual Error   42   11227.1    267.3
Total            49   27477.5
Source    DF   Seq SS
Age        1   9161.4
Educ       1     71.6
Married    1      7.0
Head       1   2064.6
Tenure     1    666.8
Manager    1   1944.4
Sales      1   2334.6
Unusual Observations
Obs   Age  Weeks    Fit  SE Fit  Residual  St Resid
 10  33.0  13.00  46.37    5.51    -33.37    -2.17R
 24  23.0   7.00  38.47    5.81    -31.47    -2.06R
R denotes an observation with a large standardized residual.
(b) State the hypotheses one needs to test for significance of the model developed in part (a).
(c) Is the model developed in (a) significant at a 5% significance level? Explain.
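As a check on part (c), the overall F statistic can be recomputed from the Analysis of Variance table in part (a). The sketch below is illustrative only: the variable names are our own, and the 5% critical value of about 2.24 for F(7, 42) is an assumed table look-up, not part of the Minitab output.

```python
# Overall F test for the full model, using the ANOVA table from part (a).
ss_regression = 16250.4   # Regression SS
ss_error = 11227.1        # Residual Error SS
df_regression = 7         # number of predictors
df_error = 42             # n - k - 1 = 50 - 7 - 1

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_statistic = ms_regression / ms_error

# Assumption: approximate upper 5% point of F(7, 42) read from an F table.
f_critical_approx = 2.24

print(f"MSR = {ms_regression:.1f}, MSE = {ms_error:.1f}, F = {f_statistic:.2f}")
print("Reject H0 at the 5% level:", f_statistic > f_critical_approx)
```

The recomputed F agrees with the Minitab value of 8.68, and the reported P-value of 0.000 leads to the same decision without needing a table.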
(d) State and interpret the R² value.
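The R-Sq and R-Sq(adj) figures in the part (a) output follow directly from the sums of squares in the Analysis of Variance table. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
# R-squared and adjusted R-squared from the part (a) ANOVA table.
ss_regression = 16250.4
ss_error = 11227.1
ss_total = 27477.5
n_obs = 50          # displaced workers in the sample
n_predictors = 7    # Age, Educ, Married, Head, Tenure, Manager, Sales

# Share of the variation in Weeks explained by the model.
r_sq = ss_regression / ss_total

# Adjusted version penalises each additional predictor.
adj_r_sq = 1 - (ss_error / (n_obs - n_predictors - 1)) / (ss_total / (n_obs - 1))

print(f"R-Sq = {r_sq:.1%}, R-Sq(adj) = {adj_r_sq:.1%}")
```

Read this way, about 59% of the variation in weeks of unemployment is explained by the seven predictors together.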
(e) Conduct a residual analysis and comment on the results.
(f) Are all of the variables in the full model significant? If not, which ones are not significant? Explain.
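For part (f), each T value in the part (a) output is the coefficient divided by its standard error, and a predictor is significant at the 5% level when |T| exceeds the two-sided critical value. The sketch below reproduces that screening; the critical value 2.018 for 42 degrees of freedom is an assumed t-table look-up, and the dictionary layout is our own.

```python
# Individual t tests for the full model: (Coef, SE Coef) from part (a).
coefficients = {
    "Age": (1.5093, 0.3040),
    "Educ": (-0.6133, 0.9362),
    "Married": (-10.743, 6.012),
    "Head": (-19.779, 5.837),
    "Tenure": (0.4265, 0.4669),
    "Manager": (-26.742, 8.326),
    "Sales": (-18.561, 6.281),
}

# Assumption: two-sided 5% critical value of t with 42 degrees of freedom.
t_critical = 2.018

t_ratios = {name: coef / se for name, (coef, se) in coefficients.items()}
not_significant = [name for name, t in t_ratios.items() if abs(t) < t_critical]

for name, t in t_ratios.items():
    print(f"{name:8s} T = {t:6.2f}")
print("Not significant at 5%:", not_significant)
```

The same conclusion can be read straight from the P column: Educ, Married, and Tenure have P-values above 0.05.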
(g) Is there any problem with multicollinearity? Explain. If so, what should one do about it?
Correlations: Age, Educ, Married, Head, Tenure, Manager, Sales
             Age    Educ Married    Head  Tenure Manager
Educ       0.100
           0.490
Married   -0.209  -0.151
           0.145   0.296
Head       0.027  -0.156  -0.449
           0.854   0.280   0.001
Tenure     0.459   0.174  -0.057  -0.046
           0.001   0.228   0.692   0.750
Manager    0.097   0.160   0.073  -0.200  -0.113
           0.504   0.266   0.616   0.164   0.435
Sales      0.137   0.124  -0.148  -0.013   0.097  -0.156
           0.343   0.393   0.306   0.926   0.504   0.279
Cell Contents: Pearson correlation
P-Value
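Pairwise correlations only show two-variable relationships. One common supplement, not shown in the Minitab output above, is the variance inflation factor (VIF), which equals the corresponding diagonal entry of the inverse of the predictor correlation matrix. The sketch below rebuilds that matrix from the rounded correlations in the table and inverts it in plain Python; the helper name `invert` and the data layout are our own, and a VIF above roughly 5 to 10 is a rule of thumb rather than a formal test.

```python
# VIFs from the predictor correlation matrix (rounded values from the table).
names = ["Age", "Educ", "Married", "Head", "Tenure", "Manager", "Sales"]
lower = [
    [],
    [0.100],
    [-0.209, -0.151],
    [0.027, -0.156, -0.449],
    [0.459, 0.174, -0.057, -0.046],
    [0.097, 0.160, 0.073, -0.200, -0.113],
    [0.137, 0.124, -0.148, -0.013, 0.097, -0.156],
]

n = len(names)
R = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i][j] = R[j][i] = r  # fill both triangles symmetrically


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (hypothetical helper)."""
    size = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(size):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


R_inv = invert(R)
vif = {name: R_inv[i][i] for i, name in enumerate(names)}
for name, v in vif.items():
    print(f"{name:8s} VIF = {v:.2f}")
```

Because the inputs are rounded to three decimals, the resulting VIFs are approximate; Minitab can report exact values when the regression is run on the raw data.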
(h) Develop your own model which you think is the most appropriate predictor of weeks of unemployment. Explain the results and why you think this is the best model.
Regression Analysis: Weeks versus Age, Head, Manager, Sales
The regression equation is
Weeks = - 0.07 + 1.73 Age - 15.1 Head - 28.7 Manager - 17.4 Sales
Predictor      Coef  SE Coef      T      P
Constant     -0.069    9.843  -0.01  0.994
Age          1.7252   0.2651   6.51  0.000
Head        -15.086    5.121  -2.95  0.005
Manager     -28.672    8.117  -3.53  0.001
Sales       -17.421    6.236  -2.79  0.008
S = 16.5069 R-Sq = 55.4% R-Sq(adj) = 51.4%
Analysis of Variance
Source           DF        SS       MS      F      P
Regression        4   15216.0   3804.0  13.96  0.000
Residual Error   45   12261.5    272.5
Total            49   27477.5
Source    DF   Seq SS
Age        1   9161.4
Head       1   1339.8
Manager    1   2588.1
Sales      1   2126.7
Unusual Observations
Obs   Age  Weeks    Fit  SE Fit  Residual  St Resid
 24  23.0   7.00  39.61    5.29    -32.61    -2.09R
 39  62.0  80.00  89.47    9.36     -9.47    -0.70 X
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large leverage.
Correlations: Age, Head, Manager, Sales
             Age    Head Manager
Head       0.027
           0.854
Manager    0.097  -0.200
           0.504   0.164
Sales      0.137  -0.013  -0.156
           0.343   0.926   0.279
Cell Contents: Pearson correlation
P-Value
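One way to justify dropping Educ, Married, and Tenure together is a partial F test that compares the residual sums of squares of the full model in part (a) and the four-variable model in part (h). The sketch below is a hedged illustration of that calculation: the variable names are ours, and the 5% critical value of about 2.83 for F(3, 42) is an assumed table look-up.

```python
# Partial F test: do Educ, Married and Tenure add anything jointly?
sse_full = 11227.1      # Residual Error SS, seven-variable model
df_error_full = 42
sse_reduced = 12261.5   # Residual Error SS, four-variable model
df_error_reduced = 45

n_dropped = df_error_reduced - df_error_full   # three predictors removed
mse_full = sse_full / df_error_full

# Increase in unexplained variation per dropped predictor, relative to MSE.
partial_f = ((sse_reduced - sse_full) / n_dropped) / mse_full

# Assumption: approximate upper 5% point of F(3, 42) from an F table.
f_critical_approx = 2.83

print(f"Partial F = {partial_f:.2f} on ({n_dropped}, {df_error_full}) df")
print("Dropped variables jointly significant:", partial_f > f_critical_approx)
```

A partial F well below the critical value is consistent with the reduced model losing little explanatory power, which matches the small drop in R-Sq(adj) from 52.3% to 51.4%.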
(i) Using your best model, what would be the estimated length of unemployment for a person with the following characteristics: Age = 40; Education level = 16 years; Married; Head of household; Tenure = 18 years; Not a manager; Not in a sales occupation.
Predicted Values for New Observations
New Obs    Fit  SE Fit          95% CI          95% PI
      1  53.85    3.50  (46.81, 60.89)  (19.87, 87.84)
Values of Predictors for New Observations
New Obs   Age  Head   Manager     Sales
      1  40.0  1.00  0.000000  0.000000
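The fitted value reported by Minitab can be reproduced by substituting the predictor values into the four-variable equation from part (h). A minimal sketch, assuming the indicator coding Y = 1 and N = 0 implied by the Head, Manager, and Sales values listed above; education, marital status, and tenure do not enter because they are not in this model.

```python
# Point prediction from the part (h) model for the person in part (i).
intercept = -0.069
b_age = 1.7252
b_head = -15.086
b_manager = -28.672
b_sales = -17.421

age = 40
head = 1      # head of household (Y coded as 1)
manager = 0   # not a manager
sales = 0     # not in a sales occupation

predicted_weeks = (intercept + b_age * age + b_head * head
                   + b_manager * manager + b_sales * sales)

print(f"Predicted weeks of unemployment: {predicted_weeks:.2f}")
```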
(j) Develop and interpret a 95% confidence interval and prediction interval for the individual described in part (i) above.
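Both intervals can be rebuilt from three numbers in the Minitab output for the reduced model: the fit (53.85), its standard error (3.50), and S (16.5069). The sketch below shows the arithmetic; the critical value 2.0141 for 45 degrees of freedom is an assumed t-table look-up, and because Fit and SE Fit are rounded in the output, the results match Minitab only to within rounding.

```python
import math

# Inputs taken from the reduced-model output.
fit = 53.85        # predicted weeks for the new observation
se_fit = 3.50      # standard error of the fitted mean
s = 16.5069        # residual standard deviation, 45 df

# Assumption: two-sided 5% critical value of t with 45 degrees of freedom.
t_critical = 2.0141

# Confidence interval: uncertainty in the MEAN weeks for such workers.
ci_half = t_critical * se_fit
ci_low, ci_high = fit - ci_half, fit + ci_half

# Prediction interval: adds the scatter of an INDIVIDUAL around that mean.
se_pred = math.sqrt(s ** 2 + se_fit ** 2)
pi_half = t_critical * se_pred
pi_low, pi_high = fit - pi_half, fit + pi_half

print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")
print(f"95% PI: ({pi_low:.2f}, {pi_high:.2f})")
```

The prediction interval is much wider because it must cover one person's outcome, not just the average for everyone with these characteristics.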
(k) How might the results of your model be used to help both state and federal government deal with the length of unemployment problem?