Statistics case problems with programming implementation: Theoretical and Programming Questions (stata)
2017-07-23

Project 2 (due Thursday 2-22-2018)


1. Theoretical


For this exercise, you will take five data points and calculate the coefficients of the bivariate regression model by hand. This is not something you would typically do, but by doing so, I hope the relevant concepts will become more concrete. You will complete this exercise using 5 data points ($n = 5$), which are combinations of values of a $Y$ variable and an $X$ variable as indicated:


i    Y    X
1    2    0
2    4    0
3    3    1
4    5    1
5    7    1

So, for instance, $Y_1 = 2$ and $X_1 = 0$. If it helps, I had in mind the idea that $Y$ indicates the number of college applications sent and $X$ is an indicator (dummy) variable for female. There are a few pieces to this problem, outlined below:


a) Use the formulas for the bivariate regression coefficients to calculate the coefficients $\alpha$ and $\beta$ in the regression model:

$Y_i = \alpha + \beta X_i + e_i$


The basic formulas are:

$\beta = \frac{\mathrm{Cov}(X_i, Y_i)}{\mathrm{Var}(X_i)}$


and:

$\alpha = E[Y_i] - \beta E[X_i]$




where you can find how to calculate the covariance, variance, and expectation in any resource about statistics.
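
For reference, the sample versions of these quantities can be written out as below. This is a sketch under one common convention (dividing by $n - 1$); the bar notation for sample means and the hats are assumptions, not notation fixed by the assignment. Dividing by $n$ instead rescales the covariance and the variance by the same factor, so the ratio that defines the slope is unaffected.

```latex
\begin{align*}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i \\
\widehat{\mathrm{Var}}(X) &= \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 \\
\widehat{\mathrm{Cov}}(X, Y) &= \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) \\
\beta &= \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
\alpha = \bar{Y} - \beta \bar{X}
\end{align*}
```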


b) Find the difference in the mean value of $Y$ for individuals with $X = 1$ instead of $X = 0$. This should be the same as your coefficient $\beta$.
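
A short derivation shows why the slope on a dummy regressor equals a difference in group means. The symbols $p$ (the share of observations with $X = 1$) and $\bar{Y}_1$, $\bar{Y}_0$ (the group means of $Y$) are introduced here for illustration and are not part of the assignment's notation; moments are taken with a $1/n$ divisor, which does not affect the ratio.

```latex
\begin{align*}
\bar{X} &= p,
\qquad
\bar{Y} = p\,\bar{Y}_1 + (1-p)\,\bar{Y}_0,
\qquad
\frac{1}{n}\sum_{i=1}^{n} X_i Y_i = p\,\bar{Y}_1 \\
\mathrm{Cov}(X, Y) &= p\,\bar{Y}_1 - p\left(p\,\bar{Y}_1 + (1-p)\,\bar{Y}_0\right)
 = p(1-p)\left(\bar{Y}_1 - \bar{Y}_0\right) \\
\mathrm{Var}(X) &= p - p^2 = p(1-p)
 \quad \text{(because } X_i^2 = X_i \text{ for a 0/1 variable)} \\
\beta &= \frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(X)} = \bar{Y}_1 - \bar{Y}_0,
\qquad
\alpha = \bar{Y} - \beta\,p = \bar{Y}_0
\end{align*}
```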


c) Copy the table above and extend it by including columns for the fitted value of $Y_i$, the resulting residual for $Y_i$, and the squared residual for $Y_i$, as well as three additional columns described below.


d) Calculate the residual and the squared residual for each observation based on a coefficient value either one greater than or one less than the value you calculated. (If you found a coefficient of 8, then calculate the fitted values, residuals, and squared residuals based on 7 or 9.) Verify that the sum of squared residuals from the correct coefficient is less than the sum of squared residuals from this incorrect one.
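
Once the hand calculation is done, it can be checked with a few lines of R. The sketch below is only a cross-check, not a substitute for showing the work. The helper `ssr` and all variable names are hypothetical, and because it shifts the slope and the intercept in turn, it covers whichever coefficient part d) intends.

```r
# Data points from the table in the exercise
Y <- c(2, 4, 3, 5, 7)
X <- c(0, 0, 1, 1, 1)

# Slope and intercept from the covariance/variance formulas
beta_hat  <- cov(X, Y) / var(X)
alpha_hat <- mean(Y) - beta_hat * mean(X)

# Difference in the group means of Y (part b)
mean_diff <- mean(Y[X == 1]) - mean(Y[X == 0])

# Hypothetical helper: sum of squared residuals for candidate coefficients
ssr <- function(alpha, beta) {
  fitted_values <- alpha + beta * X
  sum((Y - fitted_values)^2)
}

ssr_correct      <- ssr(alpha_hat, beta_hat)
ssr_slope_up     <- ssr(alpha_hat, beta_hat + 1)
ssr_slope_down   <- ssr(alpha_hat, beta_hat - 1)
ssr_intercept_up <- ssr(alpha_hat + 1, beta_hat)

# Each comparison should be TRUE if the formulas were applied correctly
c(ssr_correct < ssr_slope_up,
  ssr_correct < ssr_slope_down,
  ssr_correct < ssr_intercept_up)
```

Shifting either coefficient away from its least-squares value raises the sum of squared residuals, which is the property part d) asks you to verify.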


2. and 3. Programming Questions



2) You will be doing an exercise that builds on the 1-8 and 2-15 labs. Specifically, you need to analyze a dataset containing information about U.S. States. You will analyze summary statistics, create graphs, break the observations into groups, and run t-tests.


Program the following steps in R. Then, copy your commands into a document, comment on them (using #) and submit the document as the answer to question 2. Please email my TA the code (document with commands) at so xxxxx@ohio.edu with the phrase “ECON 4850 HW 2” in the subject heading.


Dataset notes: This is a dataset about the 50 U.S. states in the 1970’s. A description of the dataset was in project 1.

Instructions:



A. Place or find the folder you have been using for this class on the computer. Create a new subfolder called “HW 2”.

B. Change the project to the folder you just created.


C. The dataset we are going to use, state.x77, should be on your computer. But make your own copy of the dataset using the following command:

myState2 <- as.data.frame(state.x77)

Normally we wouldn’t need the as.data.frame part, but this dataset is weird: it is stored as a matrix rather than a data frame.


D. Make a dummy variable called “BigState” that indicates if a state has more than 2,500,000 population. (Remember that Population is measured in 1,000’s.)


E. Create a new dataframe “myState3” that contains both myState2 and the new BigState variable.

F. Look at the summary statistics for the new dataframe.


G. Use the cov command to find the covariance between Income and BigState.

H. Use the var command to find the variance of BigState.

I. Run a bivariate regression with Income determined by BigState:


Specification 1: $\mathit{Income}_i = \alpha + \beta_1 \mathit{BigState}_i + e_i$

J. Verify that the coefficient from this regression is the same as the ratio of the covariance between Income and BigState to the variance of BigState.

K. Create a variable called Density that contains the ratio of Population to Area.

L. Add the Density variable to the myState3 dataframe.


M. Run the following regression specification:

Specification 2: $\mathit{Income}_i = \alpha + \beta_1 \mathit{Density}_i + \beta_2 \mathit{Illiteracy}_i + e_i$



N. Run a final regression:

Specification 3: = + 1 + 2 + 3 +

O. Submit a printed version of your code with comments along with the answers to questions 1 and 3. Email your code to the TA at soccc@ohio.edu with “ECON 4850 HW 2” in the subject line.
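
Steps C through L above might be sketched in R roughly as follows. This is one possible approach, not the required solution: the intermediate names `cov_ib`, `var_b`, `spec1`, `slope`, and `slopes_match` are hypothetical, and `cbind` is only one of several ways to attach a new column. Specifications 2 and 3 are left out of the sketch.

```r
# Step C: copy the built-in matrix into a data frame
myState2 <- as.data.frame(state.x77)

# Step D: dummy equal to 1 when population exceeds 2,500,000
# (Population is recorded in thousands, so the cutoff is 2500)
BigState <- as.numeric(myState2$Population > 2500)

# Step E: new data frame holding myState2 plus the dummy
myState3 <- cbind(myState2, BigState)

# Step F: summary statistics
summary(myState3)

# Steps G and H: covariance and variance
cov_ib <- cov(myState3$Income, myState3$BigState)
var_b  <- var(myState3$BigState)

# Step I: bivariate regression of Income on BigState
spec1 <- lm(Income ~ BigState, data = myState3)
summary(spec1)

# Step J: compare the slope with the covariance/variance ratio
slope <- unname(coef(spec1)["BigState"])
slopes_match <- isTRUE(all.equal(slope, cov_ib / var_b))
slopes_match

# Steps K and L: population (in thousands) per unit of area
Density <- myState3$Population / myState3$Area
myState3 <- cbind(myState3, Density)
```

`all.equal` is used instead of `==` because the two numbers are computed by different routes and can differ by floating-point rounding.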


3. Interpret your results: use your R results from question 2 to answer these questions.



a. Answer the following questions about your data:


i. How many observations do you have?


ii. How many variables are in the dataset?


iii. Find the mean and standard deviation for BigState and Income.


b. Specification 1:


i. Assuming the regression in specification 1 is correctly specified, what is the interpretation of the coefficient on BigState? Give a reason why you might have guessed this coefficient would be positive. Give a reason why you might have guessed this coefficient would be negative.

ii. Was the result statistically significant? How do we know this?


iii. Verify (by showing your work) that the coefficient on BigState can be calculated as shown on the bottom of page 86 of the “Mastering Metrics” text (or the formula for $\beta$ shown in Problem 1).

c. Specification 2:


i. Assuming the regression in specification 2 is correctly specified, what is the interpretation of the coefficient on Density?

ii. Was the effect of Density statistically significant? How do we know this?


iii. Assuming the regression in specification 2 is correctly specified, what is the interpretation of the coefficient on Illiteracy?


d. Specification 3:


i. Assuming the regression with independent variables in specification 3 is correctly specified, what is the interpretation of the coefficient on Density? Explain why the coefficient might have changed (relative to specification 2).

ii. Was the effect of Density statistically significant? How do we know this?
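
The numbers these questions ask about can be read off standard R output. The sketch below rebuilds a small version of the data frame so that it runs on its own; the names `coef_table` and `bigstate_row` are hypothetical, and the column count it reports will differ from yours if you have already added Density.

```r
# Rebuild the data frame with the BigState dummy (Population is in thousands)
myState3 <- as.data.frame(state.x77)
myState3$BigState <- as.numeric(myState3$Population > 2500)

# Number of observations (rows) and variables (columns)
dim(myState3)

# Mean and standard deviation for BigState and Income
sapply(myState3[, c("BigState", "Income")], mean)
sapply(myState3[, c("BigState", "Income")], sd)

# Coefficient table for specification 1: estimate, standard error,
# t statistic, and p-value for each coefficient
spec1 <- lm(Income ~ BigState, data = myState3)
coef_table <- summary(spec1)$coefficients
bigstate_row <- coef_table["BigState", ]
bigstate_row
```

The `Pr(>|t|)` entry is the p-value for the null hypothesis that the coefficient is zero; comparing it with a chosen significance level (0.05 is a common choice) is one way to answer the "How do we know this?" questions.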
