2 Oct

So far I have learned about simple linear regression, multiple linear regression, p-values, t-tests and cross-validation, including the k-fold method. Now I am learning how to apply these techniques to the project.

The provided data contains 2018 figures for diabetes, obesity and physical inactivity by county across the country. Each county is identified by a unique FIPS code. There are 3142 entries for diabetes, 363 for obesity and 1370 for inactivity, and 354 entries are common to all three sets. We copied these common entries into a single spreadsheet.
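To make the "common entries" step concrete, here is a minimal sketch of matching the three sets on their FIPS codes. We did this step in a spreadsheet, so the code is only an illustration: the FIPS codes, the values and the function name `merge_on_fips` are all made up for the example.

```python
# Hypothetical toy rows keyed by FIPS code (not the real data).
# The real sets have 3142, 363 and 1370 entries respectively.
diabetes = {"01001": 9.9, "01003": 8.5, "01005": 12.1}
obesity = {"01001": 33.0, "01005": 36.2}
inactivity = {"01001": 25.1, "01003": 20.3, "01005": 27.8}


def merge_on_fips(diabetes, obesity, inactivity):
    """Keep only the FIPS codes present in all three sets."""
    common = sorted(set(diabetes) & set(obesity) & set(inactivity))
    # One row per common FIPS code: (fips, diabetes, obesity, inactivity)
    return [(f, diabetes[f], obesity[f], inactivity[f]) for f in common]


merged = merge_on_fips(diabetes, obesity, inactivity)
for row in merged:
    print(row)
```

On the real data, the same intersection should leave the 354 common entries mentioned above.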

Simple linear regression needs one dependent variable and one independent variable. With diabetes as the dependent variable, this gives us two cases: one with obesity as the independent variable and one with inactivity as the independent variable. For multiple linear regression, we keep diabetes as the dependent variable and use both inactivity and obesity as independent variables.
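The three model setups can be sketched with ordinary least squares written out by hand. This is only a rough illustration under assumptions: the numbers are invented toy values (not the project data), the function names `simple_fit` and `two_predictor_fit` are hypothetical, and in practice a statistics library would probably be used instead.

```python
def mean(values):
    return sum(values) / len(values)


def simple_fit(x, y):
    """Least-squares fit of y = b0 + b1 * x. Returns (b0, b1)."""
    xb, yb = mean(x), mean(y)
    sxx = sum((a - xb) ** 2 for a in x)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    slope = sxy / sxx
    return yb - slope * xb, slope


def two_predictor_fit(x1, x2, y):
    """Least-squares fit of y = b0 + b1 * x1 + b2 * x2. Returns (b0, b1, b2)."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solve the 2x2 normal equations for the centered predictors.
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return my - b1 * m1 - b2 * m2, b1, b2


# Invented toy values, built so that diabetes = 1 + 0.2*obesity + 0.1*inactivity.
obesity = [30.0, 32.0, 35.0, 33.0, 36.0]
inactivity = [20.0, 25.0, 22.0, 28.0, 26.0]
diabetes = [1 + 0.2 * o + 0.1 * i for o, i in zip(obesity, inactivity)]

case_obesity = simple_fit(obesity, diabetes)        # diabetes ~ obesity
case_inactivity = simple_fit(inactivity, diabetes)  # diabetes ~ inactivity
case_both = two_predictor_fit(obesity, inactivity, diabetes)
print(case_obesity, case_inactivity, case_both)
```

Because the toy diabetes values are generated from a known formula, the two-predictor fit should recover coefficients close to 1, 0.2 and 0.1, which is a quick sanity check on the code.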

In the next update, I will share the plots and models for simple linear regression and multiple linear regression.
