The coronavirus disease (COVID-19) has spread rapidly around the world following its initial outbreak in the city of Wuhan, the capital of Hubei province in China. By the beginning of 2021, COVID-19 had affected almost all countries and territories across the world, with the global death toll exceeding two million. Although not all of the factors that contributed to the rapid spread of the virus are precisely known yet, it is believed that socioeconomic activities requiring interpersonal interactions, certain long-term health conditions, and lifestyle may have contributed to the unprecedented spread of the disease. To capture the effects of such factors on the number of people infected, as an econometrician, you decide to choose variables representing the level of economic development, population characteristics and the geographical locations of various countries of the world as of 1 February 2021. The dataset [MAE256 T1 2021 Assignment Data] for the assignment is provided on the MAE256 unit site on CloudDeakin and contains information on the continent of each country (Continent), total number of infected people (Cases), Gross Domestic Product per capita (GDP), population density (POP), percentage of population aged more than 70 years (Pop70), and the prevalence of diabetes (Diabetes). The dataset for this assignment has been obtained from: https://ourworldindata.org/coronavirusdata.
NOTE: You need to use the dataset provided by the Unit Team on CloudDeakin for the assignment. Please include all Excel output tables for summary statistics and regressions, and all figures in your submission.
Variable definitions
Country: The name of each country in the dataset
Continent: The continent of each country in the dataset
Cases: Total number of infected people
GDP: Gross Domestic Product per person (in AUD)
POP: Population density (number of people per square kilometre of land area)
Pop70: Percentage of population who are aged over 70
Diabetes: Percentage of people aged 20-79 who have type 1 or type 2 diabetes
Solution: Let us have a closer look at the descriptive statistics of the variables Cases and GDP.
Statistic                    Cases            GDP
Mean                         582932.5057      23485.84714
Standard Error               175785.8242      1901.699237
Median                       65817.5          15075.20898
Mode                         1                #N/A
Standard Deviation           2318774.276      25085.13579
Sample Variance              5.37671E+12      629264037.7
Kurtosis                     91.2485313       4.849364972
Skewness                     8.846743578      1.929834591
Range                        26321119         149069.6923
Minimum                      1                847.7435897
Maximum                      26321120         149917.4359
Sum                          101430256        4086537.403
Count                        174              174
Largest(1)                   26321120         149917.4359
Smallest(1)                  1                847.7435897
Confidence Level(95.0%)      346961.0213      3753.519445
The average number of infected individuals is about 582,932, estimated with a standard error of 175,786. The skewness (8.85) and kurtosis (91.25) are far above the value of zero that Excel reports for a normal distribution (Excel's kurtosis is excess kurtosis), so the distribution of Cases is strongly right-skewed with a sharp peak and a long right tail. The range of the data is also very large, from 1 to 26,321,120 cases. Because the raw variable is far from Gaussian, a transformation such as the natural logarithm helps reduce the skewness and kurtosis, bringing the variable closer to a normal distribution and making the later statistical analysis more reliable.
GDP per person has an average value of about 23,486 AUD, estimated with a standard error of 1,902 (the standard deviation is 25,085). The data are moderately right-skewed (skewness 1.93, excess kurtosis 4.85), far less so than Cases, so the distribution is closer to normal without being exactly normal. A transformation can still help in providing better insights for statistical analysis.
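The standard errors quoted above can be cross-checked against the rest of the summary table, because the standard error of a mean is the standard deviation divided by the square root of the count. The following is a minimal sketch that uses only figures copied from the Excel output; the variable names are illustrative.

```python
import math

# Figures copied from the descriptive statistics table
n = 174
sd_cases = 2318774.276
sd_gdp = 25085.13579

# Standard error of the mean = standard deviation / sqrt(n)
se_cases = sd_cases / math.sqrt(n)
se_gdp = sd_gdp / math.sqrt(n)

print(f"SE(Cases) = {se_cases:.1f}")
print(f"SE(GDP)   = {se_gdp:.1f}")
```

Both results should agree with the "Standard Error" row of the table, which is a quick way to catch a misread digit.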
(ii) Estimate the following simple regression model of Cases on GDP:
Cases = b0 + b1GDP + u
Write down the estimated sample regression function and interpret both estimated coefficients.
Solution:
Regression Statistics
Multiple R             0.281878954
R Square               0.079455745
Adjusted R Square      0.073675398
Standard Error         2294367.428
Observations           174

ANOVA
              df      SS             MS          F          Significance F
Regression    1       7.86055E+13    7.86E+13    14.9323    0.000157691
Residual      173     9.10693E+14    5.26E+12
Total         174     9.89299E+14

              Coefficients    Standard Error    t Stat      P-value     Lower 95%      Upper 95%
Intercept     0               #N/A              #N/A        #N/A        #N/A           #N/A
GDP           19.58937467     5.06940753        3.864234    0.000157    9.583523391    29.59523
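Model: Cases = b0 + b1 GDP + u
Estimated sample regression function: Cases = 0 + 19.59 GDP
R Square is 0.079, so GDP explains only about 8% of the variation in Cases. The regression is significant overall, as F(1,173) = 14.93 with p-value = 0.00016 < 0.05, so we reject the hypothesis of no relationship at the 5% level of significance. The slope indicates that each additional 1 AUD of GDP per person is associated with about 19.59 more infected cases on average; this is a change in the number of cases, not a percentage change. The intercept is reported as 0 with #N/A statistics, which indicates that the regression was run with the constant constrained to zero, so the fitted line predicts zero cases at zero GDP by construction.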
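The headline numbers in this output are tied together by simple identities: the t statistic is the coefficient divided by its standard error, the F statistic of a one-regressor model equals the square of that t statistic, and R Square is the regression sum of squares over the total sum of squares. A short sketch using only values from the table (variable names are illustrative):

```python
# Values copied from the Excel regression output for Cases on GDP
coef_gdp = 19.58937467
se_gdp = 5.06940753
ss_regression = 7.86055e13
ss_total = 9.89299e14

t_stat = coef_gdp / se_gdp              # coefficient / standard error
f_stat = t_stat ** 2                    # with one regressor, F = t^2
r_squared = ss_regression / ss_total    # share of total SS explained

# Level-level model: the slope is a change in the NUMBER of cases
extra_cases_per_1000_aud = coef_gdp * 1000

print(f"t = {t_stat:.4f}, F = {f_stat:.4f}, R^2 = {r_squared:.4f}")
print(f"Extra cases per 1000 AUD of GDP per person: {extra_cases_per_1000_aud:.0f}")
```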
log(Cases) = b0 + b1 log(GDP) + u
Report your regression results in a sample regression function. Interpret the estimated coefficient of log(GDP). Provide an explanation on the sign of the slope coefficient.
Solution:
Regression Statistics
Multiple R             0.970323
R Square               0.941528
Adjusted R Square      0.935747
Standard Error         2.675318
Observations           174

ANOVA
              df      SS          MS          F           Significance F
Regression    1       19937.84    19937.84    2785.656    3.551E-108
Residual      173     1238.217    7.157324
Total         174     21176.06

              Coefficients    Standard Error    t Stat      P-value     Lower 95%      Upper 95%
Intercept     0               #N/A              #N/A        #N/A        #N/A           #N/A
ln(GDP)       1.122555        0.021269          52.77932    1.4E-108    1.080575459    1.164535
This is a regression in which both GDP and Cases have been transformed using the natural logarithm. The reported R Square is 0.94. Note that the output shows the intercept fixed at zero, and the total sum of squares (21176.06 with 174 degrees of freedom) is not taken around the mean, so this R Square is not directly comparable with that of a model estimated with an intercept. The regression is significant with F(1,173) = 2786 and p-value < 0.05.
Model: log(Cases) = b0 + b1 log(GDP) + u
Estimated sample regression function: log(Cases) = 0 + 1.123 log(GDP)
The intercept is reported as zero. The slope, 1.123, is an elasticity and is positive: a 1% increase in GDP per person is associated with an increase of about 1.12% in the number of infection cases.
Because both variables are in logarithms, the coefficient is already a percentage-to-percentage effect, so no exponentiation of the coefficient is needed for the economic interpretation.
The positive sign is consistent with the premise in the introduction that socioeconomic activities requiring interpersonal interactions contribute to the spread of the disease, and such activities tend to be more intensive in richer economies.
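In a log-log model the slope is an elasticity, so the percentage change in Cases for a given percentage change in GDP follows directly from the coefficient. The sketch below assumes the coefficient 1.122555 from the Excel output; the function name is hypothetical and chosen for illustration only.

```python
# Slope of ln(GDP) from the Excel output (an elasticity)
b1 = 1.122555

def pct_change_in_cases(pct_change_in_gdp, elasticity=b1):
    """Exact % change in Cases implied by a log-log model."""
    ratio = 1 + pct_change_in_gdp / 100
    return (ratio ** elasticity - 1) * 100

for pct in (1, 10):
    print(f"{pct}% more GDP -> {pct_change_in_cases(pct):.2f}% more cases")
```

For small changes the exact figure is close to the coefficient itself (about 1.12% for a 1% change); for larger changes the two drift apart slightly.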
log(Cases) = b0 + b1 log(GDP) + b2 log(POP) + u
Report your results in a sample regression function. Based on your estimates, how would you interpret the effect of POP on the number of cases? What can you conclude when you compare the goodness of fit of this regression model and that of the regression model in part (iii)?
Solution:
This is another kind of log-log relationship.
Regression Statistics
Multiple R             0.970465073
R Square               0.941802457
Adjusted R Square      0.935650146
Standard Error         2.67676773
Observations           174

ANOVA
              df      SS          MS          F           Significance F
Regression    2       19943.67    9971.833    1391.726    1.5682E-106
Residual      172     1232.395    7.165085
Total         174     21176.06

              Coefficients    Standard Error    t Stat      P-value     Lower 95%       Upper 95%
Intercept     0               #N/A              #N/A        #N/A        #N/A            #N/A
ln(GDP)       1.065870695     0.066385          16.05583    4.88E-36    0.934835971     1.196905
ln(POP)       0.126368468     0.140185          0.901444    0.368613    -0.150335123    0.403072
The variables used are all in natural logarithms. The reported R Square is high (0.942), so by this measure the transformed variables produce a good fit for the linear model. log(GDP) is a significant variable in estimating the cases of infection (p-value < 0.05), but log(POP) is not: its p-value is 0.369 > 0.05. The regression as a whole is significant at the 5% level of significance, as F(2,172) = 1392 with p-value < 0.05.
Model:
log(Cases) = b0 + b1 log(GDP) + b2 log(POP) + u
Estimated sample regression function: log(Cases) = 0 + 1.066 log(GDP) + 0.126 log(POP)
Since both Cases and POP are in logarithms, the coefficient of log(POP) is an elasticity: holding GDP fixed, a 1% increase in population density is associated with an increase of about 0.126% in infected cases. This effect is not statistically different from zero at the 5% level.
In comparison with the previous model in part (iii), there is no substantial difference in R Square (0.9418 against 0.9415), and adjusted R Square is marginally lower (0.93565 against 0.93575). Hence, in our case, adding log(POP) makes no significant contribution, and in terms of goodness of fit the previous model already achieves the same reported R Square of about 0.94.
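A quick way to see why log(POP) is not significant is to rebuild its 95% confidence interval from the coefficient, standard error and upper bound in the table. The sketch below backs out the critical t value that Excel used, and assumes the interval is symmetric around the coefficient (which implies that the lower bound is negative, even where the extracted output shows it without a sign).

```python
# Values copied from the ln(POP) row of the Excel output
coef_pop = 0.126368468
se_pop = 0.140185
upper_95 = 0.403072

# Critical value implied by the reported upper bound
t_crit = (upper_95 - coef_pop) / se_pop

# A symmetric interval gives the lower bound
lower_95 = coef_pop - t_crit * se_pop

# If zero lies inside the interval, the coefficient is not
# significant at the 5% level (two-sided)
contains_zero = lower_95 < 0 < upper_95

print(f"implied t critical value: {t_crit:.3f}")
print(f"95% CI: ({lower_95:.6f}, {upper_95:.6f}), contains zero: {contains_zero}")
```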
Solution:
              Coefficients    Standard Error    t Stat      P-value     Lower 95%       Upper 95%
Intercept     0               #N/A              #N/A        #N/A        #N/A            #N/A
ln(GDP)       1.065870695     0.066385          16.05583    4.88E-36    0.934835971     1.196905
ln(POP)       0.126368468     0.140185          0.901444    0.368613    -0.150335123    0.403072
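The estimated coefficient of log(GDP) is 1.066, which is above 1. However, the p-value reported in the table (approximately 0) tests the null hypothesis that the coefficient equals zero, so it cannot be used to decide whether the coefficient exceeds 1. For H0: b1 = 1 against H1: b1 > 1, the test statistic is t = (1.0659 - 1)/0.0664 = 0.99, which is below the one-sided 5% critical value of about 1.65 with 172 degrees of freedom. The 95% confidence interval (0.935, 1.197) also contains 1. Hence we do not reject the null hypothesis at the 5% level of significance and cannot conclude that the coefficient of log(GDP) is greater than 1.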
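Testing a coefficient against a value other than zero needs its own t statistic, because the p-value printed by Excel refers only to the null hypothesis of a zero coefficient. The following is a minimal sketch of a one-sided test of H0: b1 = 1 against H1: b1 > 1; the critical value 1.654 is an assumed approximation for 172 degrees of freedom and is not taken from the Excel output.

```python
# ln(GDP) row of the Excel output
coef = 1.065870695
se = 0.066385
hypothesised_value = 1.0

# t statistic for H0: b1 = 1
t_stat = (coef - hypothesised_value) / se

# Assumed approximate one-sided 5% critical value, 172 df
t_crit_one_sided_5pct = 1.654

reject_h0 = t_stat > t_crit_one_sided_5pct
print(f"t = {t_stat:.4f}, reject H0 at 5%: {reject_h0}")
```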
log(Cases)= b0 + b1 log(GDP) +b2 Pop70 +b3 Diabetes + u
Interpret the coefficient of Pop70. Test whether Pop70 and Diabetes are jointly significant at 5% level of significance.
Solution:
Regression Statistics
Multiple R             0.972645
R Square               0.946037
Adjusted R Square      0.939558
Standard Error         2.585062
Observations           174

ANOVA
              df      SS          MS          F           Significance F
Regression    3       20033.35    6677.782    999.2875    1.0722E-107
Residual      171     1142.715    6.682544
Total         174     21176.06

              Coefficients    Standard Error    t Stat      P-value     Lower 95%       Upper 95%
Intercept     0               #N/A              #N/A        #N/A        #N/A            #N/A
ln(GDP)       1.232289        0.06669           18.47797    1.31E-42    1.100647673     1.363929
Pop70         0.053938        0.054051          0.997908    0.319734    -0.052754826    0.16063
Diabetes      -0.17274        0.054876          -3.14777    0.001942    -0.281057747    -0.06442
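The coefficient of Pop70 is 0.054. Since the dependent variable is in logarithms and Pop70 is measured in percentage points, a one percentage point increase in the share of the population aged over 70 is associated with about 5.4% more cases, holding GDP and Diabetes fixed. However, the coefficient is insignificant in the model (p-value = 0.32 > 0.05), so there is no statistically significant contribution of Pop70 to the number of infections.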
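When the dependent variable is in logs and the regressor is in levels, 100 times the coefficient is only an approximation to the percentage effect; the exact figure uses the exponential. A short sketch with the Pop70 coefficient from the output above (variable names are illustrative):

```python
import math

# Pop70 coefficient from the log-level regression
b_pop70 = 0.053938

# Usual approximation: 100 * b
approx_pct = 100 * b_pop70

# Exact effect of a one-unit (one percentage point) increase
exact_pct = 100 * (math.exp(b_pop70) - 1)

print(f"approximate: {approx_pct:.2f}%  exact: {exact_pct:.2f}%")
```

For a coefficient this small the two figures are close; the gap grows quickly for larger coefficients.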
                  Coefficients    Standard Error    t Stat         P-value     Lower 95%    Upper 95%
Intercept         2.474843133     1.910687616       1.295263084    0.196996    -1.29705     6.246732
ln(GDP)           0.910243025     0.252205332       3.609134735    0.000404    0.412364     1.408122
Pop70             0.146710908     0.144840864       1.01291102     0.312551    -0.13922     0.432641
Diabetes          -0.128867729    0.08948488        -1.440106183   0.151687    -0.30552     0.047784
pop70_diabetes    -0.00610965     0.018036742       -0.33873358    0.735231    -0.04172     0.029497
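The p-value of 0.735 in the table above belongs to the interaction term pop70_diabetes. It tests whether the effect of Pop70 depends on the level of Diabetes, not whether the two variables are jointly significant. Joint significance (H0: b2 = b3 = 0) is tested with an F-test that compares the regression in this part with the restricted regression of log(Cases) on log(GDP) alone. Using the residual sums of squares from the two Excel outputs, F = [(1238.217 - 1142.715)/2] / (1142.715/171) = 7.15, which exceeds the 5% critical value of F(2,171), about 3.05. Hence Pop70 and Diabetes are jointly significant at the 5% level of significance, even though Pop70 is individually insignificant.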
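A joint test of several coefficients is usually carried out with an F statistic built from the residual sums of squares of a restricted and an unrestricted model. The sketch below assumes the restricted model is the regression of log(Cases) on log(GDP) alone (residual SS 1238.217) and the unrestricted model adds Pop70 and Diabetes (residual SS 1142.715, 171 residual degrees of freedom), both as reported in the Excel outputs. The function name is hypothetical and the critical value 3.05 is an assumed approximation for F(2,171) at the 5% level.

```python
def joint_f(ssr_restricted, ssr_unrestricted, q, df_unrestricted):
    """F statistic for q exclusion restrictions on nested models."""
    numerator = (ssr_restricted - ssr_unrestricted) / q
    denominator = ssr_unrestricted / df_unrestricted
    return numerator / denominator

# Residual sums of squares copied from the two ANOVA tables
f_stat = joint_f(1238.217, 1142.715, q=2, df_unrestricted=171)

# Assumed approximate 5% critical value for F(2, 171)
f_crit_5pct = 3.05

jointly_significant = f_stat > f_crit_5pct
print(f"F = {f_stat:.3f}, jointly significant at 5%: {jointly_significant}")
```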
log(Cases)= b0 + b1 log(GDP)+ b2 log(POP)+b3 Oceania + u
Report your regression results in a sample regression function. Interpret the meaning of the coefficient for Oceania.
Solution:
Regression Statistics
Multiple R             0.978114
R Square               0.956706
Adjusted R Square      0.950352
Standard Error         2.31546
Observations           174

ANOVA
              df      SS          MS          F           Significance F
Regression    3       20259.27    6753.09     1259.587    7.9E-116
Residual      171     916.7915    5.361354
Total         174     21176.06

              Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept     0               #N/A              #N/A        #N/A        #N/A         #N/A
ln(GDP)       1.146681        0.058383          19.64081    1.07E-45    1.031438     1.261925
ln(POP)       0.012706        0.122164          0.10401     0.917283    -0.22844     0.25385
Oceania       -6.46518        0.84265           -7.67244    1.23E-12    -8.12852     -4.80184
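Estimated sample regression function: log(Cases) = 0 + 1.147 log(GDP) + 0.013 log(POP) - 6.465 Oceania
By the reported R Square of 0.957 (computed by Excel with the intercept fixed at zero), the model is a good linear fit. log(GDP) and Oceania are significant, as their p-values are below 0.05, so these two variables contribute to predicting the number of infections; log(POP) is not significant (p-value = 0.92).
The coefficient of Oceania is -6.47. Oceania is a dummy variable (1 if the country is in Oceania, 0 otherwise), so its coefficient is not an elasticity: it measures the difference in log(Cases) between a country in Oceania and a country elsewhere with the same GDP and population density. Since exp(-6.47) is about 0.0015, a country in Oceania is predicted to have only about 0.15% of the cases of a comparable country elsewhere, that is, roughly 99.8% fewer cases.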
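For a dummy variable in a regression with a logged dependent variable, the percentage difference between the two groups is 100 * (exp(b) - 1). The sketch below assumes the Oceania coefficient is negative, -6.46518, as the text states; the minus sign may not be visible in the extracted Excel output.

```python
import math

# Oceania dummy coefficient (sign assumed negative, per the text)
b_oceania = -6.46518

# Ratio of predicted Cases: Oceania vs. comparable country elsewhere
ratio = math.exp(b_oceania)

# Percentage difference in predicted Cases
pct_difference = 100 * (ratio - 1)

print(f"ratio = {ratio:.6f}")
print(f"difference = {pct_difference:.2f}%")
```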
Solution:
Regression Statistics
Multiple R             0.62787987
R Square               0.394233131
Adjusted R Square      0.383543128
Standard Error         2.30696728
Observations           174

ANOVA
              df      SS          MS          F           Significance F
Regression    3       588.8157    196.2719    36.87867    2.0737E-18
Residual      170     904.7567    5.322098
Total         173     1493.572

              Coefficients     Standard Error    t Stat      P-value     Lower 95%     Upper 95%    Lower 99.0%    Upper 99.0%
Intercept     2.210725098      1.470129          1.503763    0.134498    -0.6913336    5.112784     -1.61905       6.040496
ln(GDP)       0.945645462      0.145795          6.486126    9.23E-10    0.65784349    1.233447     0.565841       1.32545
ln(POP)       -0.050070733     0.128676          -0.38912    0.697673    -0.3040798    0.203938     -0.38528       0.285138
Oceania       -6.635032667     0.847123          -7.83243    4.94E-13    -8.3072683    -4.9628      -8.84184       -4.42823
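This regression includes an intercept, and it explains about 39% of the variation in log(Cases) (R Square = 0.394, adjusted R Square = 0.384). log(GDP) and Oceania are significant at the 1% level, as their p-values are below 0.01 and their 99% confidence intervals exclude zero; log(POP) is not significant (p-value = 0.70). Hence these two variables contribute to predicting the number of infections.
The coefficient of Oceania is -6.64. As Oceania is a dummy variable, this is not an elasticity but the difference in log(Cases) between a country in Oceania and a comparable country elsewhere. Since exp(-6.64) = 0.001307, a country in Oceania is predicted to have about 0.13% of the cases of a country elsewhere with the same GDP and population density, that is, roughly 99.9% fewer cases.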
log(Cases)= b0 + b1 log(GDP)+ b2 log(POP)+b3 Europe+ u
Test whether Europe has a significant effect at the 1% level of significance. What do you infer about the explanatory power of the model in part (ix) compared to the model that you estimated in part (vii)?
Solution:
Regression Statistics
Multiple R             0.971197
R Square               0.943225
Adjusted R Square      0.936713
Standard Error         2.65158
Observations           174

ANOVA
              df      SS          MS          F           Significance F
Regression    3       19973.78    6657.927    946.9552    8.1E-106
Residual      171     1202.28     7.030879
Total         174     21176.06

              Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%    Lower 99.0%    Upper 99.0%
Intercept     0               #N/A              #N/A        #N/A        #N/A         #N/A         #N/A           #N/A
ln(GDP)       1.02528         0.068623          14.94074    7.79E-33    0.889822     1.160737     0.846524       1.204035
ln(POP)       0.156855        0.139645          1.123242    0.262909    -0.11879     0.432504     -0.2069        0.520613
Europe        1.036708        0.500926          2.069582    0.039995    0.047913     2.025502     -0.26815       2.341562
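The Europe variable is insignificant at the 1% level, as its p-value = 0.040 > 0.01; equivalently, its 99% confidence interval (-0.268, 2.342) contains zero. Hence at the 1% level of significance we cannot conclude that being located in Europe affects the number of infections, although the variable would be significant at the 5% level.
In comparison with the model in part (vii), replacing Oceania with Europe slightly lowers the explanatory power: R Square falls from 0.957 to 0.943, adjusted R Square falls from 0.950 to 0.937, and the standard error of the regression rises from 2.32 to 2.65. Being located in Oceania has a large and highly significant association with the number of cases, whereas being located in Europe does not have a significant association at the 1% level in this model.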
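The conclusion about Europe depends on the chosen significance level, and the comparison of the two models rests on their reported R Square values. A small sketch of both checks, using only the p-value and R Square figures from the Excel outputs (the dictionary keys are illustrative labels):

```python
# p-value of the Europe dummy from the Excel output
p_europe = 0.039995

# Reject H0 (no effect) whenever the p-value is below alpha
decisions = {alpha: p_europe < alpha for alpha in (0.01, 0.05, 0.10)}
for alpha, reject in decisions.items():
    print(f"alpha = {alpha:.2f}: significant = {reject}")

# Reported R Square of the two competing models
r_squared = {"(vii) Oceania": 0.956706, "(ix) Europe": 0.943225}
better_fit = max(r_squared, key=r_squared.get)
print(f"higher R Square: {better_fit}")
```

Both models have the same number of regressors and the same dependent variable, so comparing R Square directly is reasonable here.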