Topics learnt in today’s class: (Simple Linear Regression)
1: Skewness – In simple linear regression, skewness describes the asymmetry of the distribution of the residuals. Marked skewness can affect the reliability of the regression model and how its findings should be interpreted, so it is important to check the residuals for skewness and, if necessary, take remedial action (for example, transforming the dependent variable) to verify the accuracy of the analysis.
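The check above can be sketched in a few lines. This is a minimal example on synthetic data (not the class's diabetes dataset): fit a simple linear regression, take the residuals, and compute their skewness with `scipy.stats.skew`; a value near 0 suggests a symmetric residual distribution.

```python
import numpy as np
from scipy.stats import skew

# Synthetic data for illustration only (hypothetical, not the class dataset)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2.0 * x + 1.0 + rng.normal(0, 1, 200)  # linear trend + symmetric noise

# Fit simple linear regression y = b0 + b1*x by least squares
b1, b0 = np.polyfit(x, y, 1)
residuals = y - (b0 + b1 * x)

# Skewness near 0 suggests the residuals are roughly symmetric
res_skew = skew(residuals)
print(round(res_skew, 3))
```

Because the noise here is symmetric by construction, the reported skewness should be close to zero; strongly positive or negative values on real data would flag asymmetric residuals.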
2: Kurtosis – In simple linear regression, kurtosis describes the shape of the residuals' distribution, indicating whether its tails are heavier or lighter than those of a normal distribution. It is crucial to consider kurtosis when analysing regression data, since severe kurtosis can affect the validity of the regression results and may necessitate corrective action to ensure the accuracy of the analysis. For a normal distribution the kurtosis is 3, but in the diabetes dataset the kurtosis value is 4, so the residuals are not exactly normally distributed.
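The "kurtosis is 3 for a normal distribution" convention can be checked directly. A small sketch on synthetic samples (again, not the diabetes data): `scipy.stats.kurtosis` with `fisher=False` returns Pearson kurtosis, which is about 3 for normal data and larger for heavy-tailed data such as a Student's t distribution.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(1)
normal_sample = rng.normal(0, 1, 5000)
heavy_tailed = rng.standard_t(df=5, size=5000)  # heavier tails than normal

# fisher=False gives Pearson kurtosis: ~3 for a normal distribution
k_normal = kurtosis(normal_sample, fisher=False)
k_heavy = kurtosis(heavy_tailed, fisher=False)
print(round(k_normal, 2), round(k_heavy, 2))
```

Note that the default `fisher=True` subtracts 3 ("excess kurtosis", 0 for a normal distribution), so it matters which convention a report like "kurtosis = 4" is using.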
3: Heteroscedasticity – Heteroscedasticity occurs when the residuals (the differences between the observed values of the dependent variable and the values predicted by the regression model) do not have constant variance. In other words, the spread of the residuals changes as you move along the values of the independent variable. Heteroscedasticity can lead to incorrect inferences about the statistical significance of the regression coefficients. In particular, standard errors may be under- or over-estimated, which affects the accuracy of the parameter estimates and can result in inaccurate assessments of the significance of predictors. When heteroscedasticity is present, least squares estimates may no longer be the most efficient estimators of the regression coefficients, and this inefficiency can reduce the statistical power of the analysis.
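One informal way to detect a changing spread is to check whether the size of the residuals is correlated with the independent variable. This is a minimal sketch on synthetic data where the noise deliberately grows with x (a hypothetical setup, and a simpler stand-in for formal tests such as Breusch-Pagan):

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, 300)
# Heteroscedastic noise: the spread grows with x (constructed for illustration)
y = 3.0 * x + rng.normal(0, 0.5 * x)

# Fit simple linear regression and compute residuals
b1, b0 = np.polyfit(x, y, 1)
residuals = y - (b0 + b1 * x)

# If |residuals| correlates with x, the residual spread is not constant
r, p_value = pearsonr(x, np.abs(residuals))
print(round(r, 3), p_value < 0.05)
```

A significantly positive correlation here signals heteroscedasticity; common remedies include transforming the dependent variable, weighted least squares, or heteroscedasticity-robust standard errors.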