
Q4 Take home exam

a.

a) Compute the OLS regression of the number of crimes on population and
population density for all 51 observations. Test (using both the
Goldfeld-Quandt test and the test used by Micro-Fit) whether the null
hypothesis that the residuals of the estimated equation are homoscedastic
can be accepted. Why might the two tests give different results?

 Dependent variable is CRIM93
 51 observations used for estimation from    1 to   51
*******************************************************************************
 Regressor              Coefficient       Standard Error         T-Ratio[Prob]
 CONSTANT                 -37411.8            15369.2            -2.4342[.019]
 POP93                     62.3154             2.0370            30.5915[.000]
*******************************************************************************
 R-Squared                     .95025   R-Bar-Squared                   .94923
 S.E. of Regression           81486.7   F-stat.    F(  1,  49)  935.8409[.000]
 Mean of Dependent Variable  277567.9   S.D. of Dependent Variable    361646.9
 Residual Sum of Squares     3.25E+11   Equation Log-likelihood      -648.0637
 Akaike Info. Criterion     -650.0637   Schwarz Bayesian Criterion   -651.9955
* A:Serial Correlation*CHSQ(   1)=   3.2907[.070]*F(   1,  48)=   3.3108[.075]*
* B:Functional Form   *CHSQ(   1)=   5.6735[.017]*F(   1,  48)=   6.0081[.018]*
* C:Normality         *CHSQ(   2)= 139.6596[.000]*       Not applicable       *
* D:Heteroscedasticity*CHSQ(   1)=   2.7690[.096]*F(   1,  49)=   2.8131[.100]*

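Microfit test:
H0: errors have the same variance (homoscedasticity)
H1: errors have an increasing variance (heteroscedasticity)
For the chi-squared (CHSQ) version the p-value is 0.096.
For the F version it is 0.100.
Both exceed 0.05, so at the 5% level H0 cannot be rejected: there is no
significant evidence of heteroscedasticity.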
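
The reasoning behind the chi-squared figure can be sketched in code. The sketch assumes that the statistic in row D is the LM statistic n*R-squared from an auxiliary regression of the squared residuals on the squared fitted values; the output above does not say how row D is computed, so this is an assumption. The helper names (ols, lm_het_test, chi2_1_pvalue) and the simulated data are hypothetical and only illustrate the mechanics.

```python
import math
import random

def ols(x, y):
    """One-regressor OLS with intercept: returns (intercept, slope, residuals, R-squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    tss = sum((yi - my) ** 2 for yi in y)
    r2 = 1.0 - sum(e * e for e in resid) / tss
    return intercept, slope, resid, r2

def chi2_1_pvalue(stat):
    """Upper-tail p-value of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

def lm_het_test(x, y):
    """Assumed form of the test: regress e^2 on fitted^2, LM = n * R^2 ~ chi-squared(1)."""
    a, b, resid, _ = ols(x, y)
    fitted_sq = [(a + b * xi) ** 2 for xi in x]
    resid_sq = [e * e for e in resid]
    _, _, _, r2_aux = ols(fitted_sq, resid_sq)
    lm = max(len(x) * r2_aux, 0.0)  # guard against tiny negative rounding error
    return lm, chi2_1_pvalue(lm)

# The p-value reported in row D can be recovered from the reported statistic.
print(round(chi2_1_pvalue(2.7690), 3))

# Hypothetical simulated data whose error spread grows with x (not the crime data).
random.seed(0)
x = [float(i) for i in range(1, 52)]
y = [2.0 + 3.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]
lm, p = lm_het_test(x, y)
print(lm, p)
```

A p-value below 0.05 would reject homoscedasticity at the 5% level; the reported 0.096 does not.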

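Goldfeld-Quandt test:
H0: errors have the same variance (homoscedasticity)
H1: errors have an increasing variance (heteroscedasticity)
I order the observations by population and omit the c=11 central
observations, leaving 20 observations at each end.
For the first 20 observations: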

 Regressor              Coefficient       Standard Error         T-Ratio[Prob]
 CONSTANT                 970.1958             9217.7             .10525[.917]
 POP93                     46.2374             6.8773             6.7232[.000]
 R-Squared                    0.71519

For the last 20 observations:

 Regressor              Coefficient       Standard Error         T-Ratio[Prob]
 CONSTANT                -103894.2            49308.7            -2.1070[.049]
 POP93                     66.8360             4.2202            15.8370[.000]
*******************************************************************************
 R-Squared                    0.93304

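lambda = RSS2/RSS1 = (1-R2(2))/(1-R2(1)) = (1-0.93304)/(1-0.71519) = 0.23510
(Replacing the RSS ratio by the ratio of 1-R2 is exact only if both
subsamples have the same total sum of squares.)
df = (51-11-4)/2 = 18 for both numerator and denominator
Fcrit = F(18,18) at the 5% level = 2.2
Since lambda = 0.235 is below Fcrit = 2.2, H0 cannot be rejected.
Thus the errors are likely homoscedastic.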
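
The arithmetic above, together with a version of the test based directly on the residual sums of squares, can be sketched as follows. The function names (ols_rss, goldfeld_quandt) and the simulated data are hypothetical; only the R-squared values, n=51 and c=11 come from the text. Using the RSS ratio avoids relying on the two subsamples having the same total sum of squares.

```python
import random

def ols_rss(x, y):
    """Residual sum of squares from a one-regressor OLS fit with intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def goldfeld_quandt(x, y, c):
    """Sort by x, drop the c central observations, return (RSS_high / RSS_low, df)."""
    pairs = sorted(zip(x, y))
    m = (len(pairs) - c) // 2           # observations in each outer group
    low, high = pairs[:m], pairs[-m:]
    rss_low = ols_rss([p[0] for p in low], [p[1] for p in low])
    rss_high = ols_rss([p[0] for p in high], [p[1] for p in high])
    df = m - 2                          # two estimated parameters per subsample
    return rss_high / rss_low, df

# Reproducing the figures in the text from the two reported R-squared values.
lam_text = (1 - 0.93304) / (1 - 0.71519)
df_text = (51 - 11 - 4) // 2
F_CRIT = 2.2                            # 5% critical value quoted in the text
print(round(lam_text, 5), df_text, lam_text > F_CRIT)

# Hypothetical simulated data (not the crime data), error spread rising with x.
random.seed(1)
x = [float(i) for i in range(1, 52)]
y = [2.0 + 3.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]
lam_sim, df_sim = goldfeld_quandt(x, y, c=11)
print(lam_sim, df_sim)
```

H0 is rejected only when the ratio exceeds the critical value, so the 0.2351 computed in the text does not reject.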

The second test is much more restrictive. Its result depends on the c value
used, etc.
The first test uses much more complicated techniques to spot
heteroscedasticity, not just assuming that the error term variance depends
on the square of

ii

b) Plot scatter graphs of the squared residuals from the estimated
equation against population and population squared. Do these plots
provide additional help to enable you to decide whether heteroscedasticity
is present in your estimated equation?

RES2          POP93   POP2
766719095.1   470     220900
595208004.6   576     331776
4820165130    579     335241
1118492260    598     357604
245872698.4   637     405769
779652136.5   698     487204
195252979.3   716     512656
639508012.1   841     707281
403465632.3   1000    1000000
124543776.3   1100    1210000
464.3909849   1124    1263376
1439589654    1166    1359556
561947.5822   1240    1537600
1346859740    1382    1909924
10918731.32   1613    2601769
1441629340    1616    2611456
889683797.1   1818    3305124
357718488.3   1860    3459600
8700467.328   2426    5885476
30898941.04   2535    6426225
109527336     2640    6969600
893060053.9   2821    7958041
542088929.5   3035    9211225
50427931.87   3233    10452289
302103622.8   3278    10745284
151343624.5   3564    12702096
649534469.6   3630    13176900
5674345602    3794    14394436
7185974738    3945    15563025
366251383.2   4181    17480761
4072383680    4290    18404100
2123390606    4524    20466576
975773446.1   4958    24581764
2472580407    5044    25441936
171531825.7   5094    25948836
487781730.7   5235    27405225
515791803.6   5259    27657081
4017875529    5706    32558436
1855981642    6018    36216324
9905582453    6473    41899729
1207709802    6902    47637604
8280285.049   6952    48330304
5627096725    7859    61763881
1313168202    9460    89491600
2426505575    11061   122345721
1175526129    11686   136562596
1015955886    12030   144720900
1078000806    13726   188403076
5595257545    18022   324792484
7417663339    18153   329531409
1161983188    31217   974501089

There are some values that are way out. Heteroscedasticity is not present;
there are just some very unusual areas with a high (or low) crime rate.
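
As a numeric companion to the scatter plots, one can compare the average squared residual across population groups. This is a rough sketch: the split into three equal groups and the helper name group_means are assumptions made for illustration, not part of the exam answer. The values are the RES2 column listed above, which is already ordered by POP93.

```python
# Squared residuals (RES2) copied from the table, in ascending order of POP93.
res2 = [
    766719095.1, 595208004.6, 4820165130, 1118492260, 245872698.4,
    779652136.5, 195252979.3, 639508012.1, 403465632.3, 124543776.3,
    464.3909849, 1439589654, 561947.5822, 1346859740, 10918731.32,
    1441629340, 889683797.1, 357718488.3, 8700467.328, 30898941.04,
    109527336, 893060053.9, 542088929.5, 50427931.87, 302103622.8,
    151343624.5, 649534469.6, 5674345602, 7185974738, 366251383.2,
    4072383680, 2123390606, 975773446.1, 2472580407, 171531825.7,
    487781730.7, 515791803.6, 4017875529, 1855981642, 9905582453,
    1207709802, 8280285.049, 5627096725, 1313168202, 2426505575,
    1175526129, 1015955886, 1078000806, 5595257545, 7417663339,
    1161983188,
]

def group_means(values, groups=3):
    """Mean of each consecutive block when the values are split into equal groups."""
    size = len(values) // groups
    return [sum(values[i * size:(i + 1) * size]) / size for i in range(groups)]

# 51 observations split into low, middle and high population groups of 17 each.
means = group_means(res2)
for label, m in zip(("low pop", "middle pop", "high pop"), means):
    print(f"{label:>10}: mean RES2 = {m:.3e}")
```

If the group means rise steadily with population, that points towards heteroscedasticity; if only a few isolated values stand out, the outlier reading is more plausible.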

iii

One should exclude the outliers, i.e. counties with unusually high or low
population or crime. The result is an OLS model that only applies to
"normal" areas.
Alternatively, there are some more complicated estimation techniques that
take heteroscedasticity into account (e.g. GARCH, although that is designed
for time-series data). However, the resulting estimator won't be BLUE.
Finally, one could use population density instead of population to predict
crime, since population density might be a better explanatory variable.
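
One simple technique of this kind is weighted least squares (WLS). It is not the method named in the answer above and is shown only as an illustration. The sketch assumes the error variance is proportional to population squared, which the text does not establish; the function names and the toy data are hypothetical.

```python
def ols(x, y):
    """One-regressor OLS with intercept; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def wls_variance_prop_to_x_squared(x, y):
    """WLS for y = a + b*x + u when Var(u) is assumed proportional to x**2.

    Dividing through by x gives y/x = a*(1/x) + b + u/x, whose error has
    constant variance, so plain OLS on the transformed data can be used.
    """
    inv_x = [1.0 / xi for xi in x]
    y_over_x = [yi / xi for xi, yi in zip(x, y)]
    b, a = ols(inv_x, y_over_x)  # transformed intercept is b, transformed slope is a
    return a, b

# Hypothetical noise-free data generated from y = 3 + 2*x (not the crime data).
pop = [1.0, 2.0, 4.0, 5.0, 10.0]
crime = [3.0 + 2.0 * p for p in pop]
a_hat, b_hat = wls_variance_prop_to_x_squared(pop, crime)
print(a_hat, b_hat)
```

With real data the weights would have to be justified, for example by the pattern seen in the squared residuals.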

iv

Regressing the number of crimes (CRIM93) on population density (POPDEN = pop/area):

 Dependent variable is CRIM93
 51 observations used for estimation from    1 to   51
*******************************************************************************
 Regressor              Coefficient       Standard Error         T-Ratio[Prob]
 CONSTANT                 299992.9            53053.6             5.6545[.000]
 POPDEN                  -189.9849           143.7568            -1.3216[.192]
*******************************************************************************
 R-Squared                    .034417   R-Bar-Squared                  .014711
 S.E. of Regression          358976.9   F-stat.    F(  1,  49)    1.7465[.192]
 Mean of Dependent Variable  277567.9   S.D. of Dependent Variable    361646.9
 Residual Sum of Squares     6.31E+12   Equation Log-likelihood      -723.6874
 Akaike Info. Criterion     -725.6874   Schwarz Bayesian Criterion   -727.6192
* D:Heteroscedasticity*CHSQ(   1)=   .59156[.442]*F(   1,  49)=   .57503[.452]*

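Using density gets rid of the heteroscedasticity (the test p-value rises to
.442); however, POPDEN is not significant (t = -1.32, p = .192) and the
R-squared falls to .034. But this model is restricted, because it forces pop
and area to enter only as a ratio. Instead we can use the 2 variables,
pop and area, separately: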

 Dependent variable is CRIM93
 51 observations used for estimation from    1 to   51
*******************************************************************************
 Regressor              Coefficient       Standard Error         T-Ratio[Prob]
 CONSTANT                 -48292.5            17359.5            -2.7819[.008]
 POP93                     62.0446             2.0326            30.5253[.000]
 AREA                      .068207            .051916             1.3138[.195]
*******************************************************************************
 R-Squared                     .95197   R-Bar-Squared                   .94997
* D:Heteroscedasticity*CHSQ(   1)=   2.2226[.136]*F(   1,  49)=   2.2327[.142]*

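As seen, area is not significant (p = .195), and the evidence of
heteroscedasticity has increased relative to the density model (p = .136
against .442), although it is still not significant at the 5% level.
This is the best I can do, I am afraid.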
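
The significance statements for POPDEN and AREA can be checked from the reported coefficients and standard errors. This is a small sketch: the critical value of about 2.01 (two-sided 5% level with roughly 48 to 49 degrees of freedom) is taken from standard t tables and is an assumption here, and the helper name t_ratio is hypothetical.

```python
T_CRIT_5PCT = 2.01  # approximate two-sided 5% critical value, about 48-49 df (assumed)

# (coefficient, standard error) pairs copied from the Microfit output above.
reported = {
    "POPDEN, density model": (-189.9849, 143.7568),
    "POP93, pop and area model": (62.0446, 2.0326),
    "AREA, pop and area model": (0.068207, 0.051916),
}

def t_ratio(coef, se):
    """t-ratio for the null hypothesis that the coefficient is zero."""
    return coef / se

significant = {}
for name, (coef, se) in reported.items():
    t = t_ratio(coef, se)
    significant[name] = abs(t) > T_CRIT_5PCT
    print(f"{name}: t = {t:.4f}, significant at 5%: {significant[name]}")
```

Small differences from the printed T-Ratio column come from rounding in the reported coefficients and standard errors.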
