Regression: Introduction, Lines, Equation, Coefficient
Regression
Definition:
- Regression is a statistical method used to examine the relationship between a dependent variable (Y) and one or more independent variables (X).
- It is widely used in finance, investing, and other fields to predict outcomes and understand variable relationships.
Types of Regression:
- Linear Regression:
- Uses one independent variable to predict the dependent variable.
- Formula: Y = a + bX + u
- Multiple Linear Regression:
- Uses two or more independent variables to predict the dependent variable.
- Formula: Y = a + b1X1 + b2X2 + ... + bkXk + u (see the fitting sketch in code after the Uses list below)
Components:
- Y: Dependent variable.
- X: Independent variable(s).
- a: Intercept.
- b: Slope.
- u: Residual (error term).
Uses:
- Valuing assets.
- Predicting sales.
- Pricing models like CAPM.
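The two formulas above can be fitted by ordinary least squares. Below is a minimal sketch, assuming NumPy is available; the toy data and the "true" coefficient values (2.0, 1.5, etc.) are made-up illustrations, not values from these notes.

```python
# Sketch: fitting Y = a + bX + u and Y = a + b1*X1 + b2*X2 + u by least squares.
# All data here is simulated purely for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Simple linear regression: Y = a + bX + u
X = rng.uniform(0, 10, size=50)
Y = 2.0 + 1.5 * X + rng.normal(0, 1, size=50)      # true a = 2.0, b = 1.5

A = np.column_stack([np.ones_like(X), X])          # design matrix [1, X]
(a_hat, b_hat), *_ = np.linalg.lstsq(A, Y, rcond=None)
print(f"simple:   a = {a_hat:.2f}, b = {b_hat:.2f}")

# Multiple linear regression: Y = a + b1*X1 + b2*X2 + u
X1 = rng.uniform(0, 10, size=50)
X2 = rng.uniform(0, 5, size=50)
Ym = 1.0 + 0.8 * X1 - 2.0 * X2 + rng.normal(0, 1, size=50)

Am = np.column_stack([np.ones_like(X1), X1, X2])
coeffs, *_ = np.linalg.lstsq(Am, Ym, rcond=None)
print("multiple: a, b1, b2 =", np.round(coeffs, 2))
```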
Assumptions in Regression:
- Independence:
- Residuals are serially independent.
- Residuals are not correlated with independent variables.
- Linearity:
- The relationship between the dependent and independent variables is linear.
- Mean of Residuals:
- The mean of residuals is zero.
- Homogeneity of Variance:
- Constant variance of residuals at all levels of independent variables.
- Errors in Variables:
- Independent variables are measured without error.
- Model Specification:
- All relevant variables are included, and no irrelevant variables are included.
- Normality:
- Residuals are normally distributed (important for significance tests).
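As a rough illustration of how some of the assumptions above are checked in practice, here is a hedged sketch using statsmodels and SciPy (both assumed to be installed); the simulated data is purely illustrative, and the tests shown are common choices rather than the only ones.

```python
# Sketch: routine checks on the residuals of a fitted OLS model.
# Simulated data; test choices are illustrative, not prescriptive.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import het_breuschpagan
from scipy import stats

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=100)
Y = 3.0 + 0.7 * X + rng.normal(0, 1, size=100)

model = sm.OLS(Y, sm.add_constant(X)).fit()
resid = model.resid

# Mean of residuals should be numerically zero.
print("mean of residuals:", round(resid.mean(), 6))

# Independence: a Durbin-Watson statistic near 2 suggests no serial correlation.
print("Durbin-Watson:", round(durbin_watson(resid), 2))

# Normality: Shapiro-Wilk test on the residuals.
w, p = stats.shapiro(resid)
print("Shapiro-Wilk p-value:", round(p, 3))

# Homogeneity of variance: Breusch-Pagan test against the regressors.
lm_stat, lm_p, f_stat, f_p = het_breuschpagan(resid, model.model.exog)
print("Breusch-Pagan p-value:", round(lm_p, 3))
```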
⭐Regression Line
Definition:
- A regression line is the line that best fits the data points on a graph, minimizing the squared deviations of the predictions from the actual data points.
Types:
- Regression Line of Y on X:
- Predicts values of Y from given values of X.
- Regression Line of X on Y:
- Predicts values of X from given values of Y.
Characteristics:
- The correlation between the variables is reflected in the distance between the two regression lines: the closer the lines, the stronger the correlation.
- When the regression lines coincide (a single line), there is perfect positive or negative correlation.
- When the variables are uncorrelated (zero correlation), the two regression lines are at right angles to each other.
Key Points:
- The two regression lines intersect at the point whose coordinates are the mean of X and the mean of Y.
- Reading the intersection point off the respective axes therefore gives the mean values of X and Y.
Additional Information:
- Regression analysis helps in making informed decisions based on data-driven insights.
- Understanding the assumptions is crucial for accurate interpretation and reliable predictions.
⭐Lines of Regression
Definition:
Regression lines are lines that best fit the data points, minimizing the squared deviations of predictions from the actual values. They represent the relationship between two variables, X and Y.
Types:
- Regression Line of Y on X:
- Predicts values of Y based on given values of X.
- Regression Line of X on Y:
- Predicts values of X based on given values of Y.
Regression Equations:
- Each regression line has an algebraic expression called a regression equation.
- Example:
- Regression line of Y on X: Y = a + bX
- Regression line of X on Y: X = c + dY
Correlation:
- The distance between the two regression lines indicates the degree of correlation.
- High Correlation: Regression lines are close.
- Low Correlation: Regression lines are far apart.
- Perfect Correlation: Lines coincide (one line).
- Zero Correlation: Lines are at right angles.
Intersection Point:
- The regression lines intersect at the point representing the average values of X and Y.
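A small numerical sketch of the two regression lines (the data values are made up for illustration): it computes the Y on X and X on Y slopes, checks that their product equals r², and verifies that both lines pass through the point of means.

```python
# Sketch: two regression lines from toy data (values are illustrative).
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

x = X - X.mean()
y = Y - Y.mean()

b_yx = (x * y).sum() / (x ** 2).sum()   # slope of the line of Y on X
b_xy = (x * y).sum() / (y ** 2).sum()   # slope of the line of X on Y
r = np.corrcoef(X, Y)[0, 1]

print("b_yx * b_xy =", round(b_yx * b_xy, 4), " r^2 =", round(r ** 2, 4))

# Both lines pass through (mean of X, mean of Y).
a = Y.mean() - b_yx * X.mean()          # intercept of Y = a + b_yx * X
c = X.mean() - b_xy * Y.mean()          # intercept of X = c + b_xy * Y
print("Y on X at mean X:", round(a + b_yx * X.mean(), 4), "vs mean Y:", round(Y.mean(), 4))
print("X on Y at mean Y:", round(c + b_xy * Y.mean(), 4), "vs mean X:", round(X.mean(), 4))
```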
Coefficient of Regression
Definition: The regression coefficient, denoted by b, measures the change in the dependent variable (Y) for a unit change in the independent variable (X). It represents the slope of the regression line.
Types:
- Regression Coefficient of X on Y (b_xy):
- Measures the change in X for a unit change in Y.
- Formula (from actual means): b_xy = Σxy / Σy², where x = X − X̄ and y = Y − Ȳ.
- Formula (from assumed means): b_xy = (N·Σdxdy − Σdx·Σdy) / (N·Σdy² − (Σdy)²), where dx = X − A and dy = Y − B for assumed means A and B.
- Regression Coefficient of Y on X (b_yx):
- Measures the change in Y for a unit change in X.
- Formula (from actual means): b_yx = Σxy / Σx².
- Formula (from assumed means): b_yx = (N·Σdxdy − Σdx·Σdy) / (N·Σdx² − (Σdx)²) (see the numerical check after the Key Points below).
Key Points:
- Slope Coefficient: The regression coefficient is also known as the slope coefficient, indicating the slope of the regression line.
- Interpretation: It indicates how much the dependent variable changes for a one-unit change in the independent variable.
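The numerical check referenced above, as a minimal sketch: it computes b_yx and b_xy both from actual means and from assumed means and shows the two routes agree. The data values and the assumed means A = 3 and B = 4 are arbitrary illustrative choices.

```python
# Sketch: regression coefficients from actual means vs. assumed means.
# Data and the assumed means A, B are illustrative.
import numpy as np

X = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
Y = np.array([3.0, 5.0, 4.0, 8.0, 9.0])
N = len(X)

# From actual means: deviations x = X - mean(X), y = Y - mean(Y).
x = X - X.mean()
y = Y - Y.mean()
b_yx_actual = (x * y).sum() / (x ** 2).sum()
b_xy_actual = (x * y).sum() / (y ** 2).sum()

# From assumed means A and B: deviations dx = X - A, dy = Y - B.
A_mean, B_mean = 3.0, 4.0
dx = X - A_mean
dy = Y - B_mean
b_yx_assumed = (N * (dx * dy).sum() - dx.sum() * dy.sum()) / (N * (dx ** 2).sum() - dx.sum() ** 2)
b_xy_assumed = (N * (dx * dy).sum() - dx.sum() * dy.sum()) / (N * (dy ** 2).sum() - dy.sum() ** 2)

print("b_yx:", round(b_yx_actual, 4), "==", round(b_yx_assumed, 4))
print("b_xy:", round(b_xy_actual, 4), "==", round(b_xy_assumed, 4))
```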
Additional Information
Practical Uses:
- Regression analysis is widely used in finance, economics, and other fields to make predictions and understand relationships between variables.
- It can help predict future values, identify trends, and make informed decisions based on historical data.
Assumptions:
- Ensure that the assumptions of regression (linearity, independence, homoscedasticity, etc.) are met for accurate and reliable results.
⭐Properties of Regression Coefficients
Definition: The constant ‘b’ in the regression equation Y = a + bX is called the Regression Coefficient or Slope Coefficient. It indicates the change in the value of Y for a unit change in X.
Properties:
- Geometric Mean Relationship:
- The correlation coefficient (r) is the geometric mean of the two regression coefficients: r = ±√(b_yx × b_xy), taking the sign common to both coefficients (verified numerically in the sketch at the end of this section).
- Range Limitation:
- Since r² = b_yx × b_xy cannot exceed 1, if one regression coefficient is greater than 1 in absolute value, the other must be less than 1 in absolute value.
- Sign Consistency:
- Both regression coefficients have the same sign, either positive or negative. It is not possible for one to be positive and the other to be negative.
- Sign of Correlation:
- The sign of the correlation coefficient (r) is the same as that of the regression coefficients. If the regression coefficients are positive, r is positive, and if negative, r is negative.
- Average Value:
- The arithmetic mean of the two regression coefficients is greater than or equal to the correlation coefficient (a consequence of the AM-GM inequality when both coefficients are positive).
- Independence from Origin:
- Regression coefficients are independent of a change of origin: adding or subtracting a constant to X or Y does not affect them.
- Dependence on Scale:
- Regression coefficients depend on a change of scale: if X is multiplied by a constant k_x and Y by a constant k_y, then b_yx is multiplied by k_y/k_x and b_xy by k_x/k_y.
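The sketch below is a hedged numerical check of these properties on made-up data: the geometric-mean relation, sign consistency, arithmetic mean ≥ r, invariance under a change of origin, and change under rescaling. The helper function and all data values are illustrative assumptions.

```python
# Sketch verifying the listed properties on toy data (values are illustrative).
import numpy as np

def reg_coeffs(X, Y):
    """Return (b_yx, b_xy) computed from deviations about the actual means."""
    x, y = X - X.mean(), Y - Y.mean()
    return (x * y).sum() / (x ** 2).sum(), (x * y).sum() / (y ** 2).sum()

X = np.array([1.0, 3.0, 4.0, 6.0, 8.0])
Y = np.array([2.0, 3.5, 5.0, 7.0, 9.5])

b_yx, b_xy = reg_coeffs(X, Y)
r = np.corrcoef(X, Y)[0, 1]

# Geometric mean of the coefficients equals |r|; sign is common to both.
print("geometric mean:", round(np.sqrt(b_yx * b_xy), 4), "vs r =", round(r, 4))
print("same sign:", np.sign(b_yx) == np.sign(b_xy))
print("arithmetic mean >= r:", (b_yx + b_xy) / 2 >= r)

# Change of origin: shifting X and Y leaves the coefficients unchanged.
print("unchanged after shift:", np.allclose(reg_coeffs(X - 5, Y + 10), (b_yx, b_xy)))

# Change of scale: rescaling X and Y changes the coefficients.
b_yx_s, b_xy_s = reg_coeffs(2 * X, 10 * Y)
print("after rescale: b_yx =", round(b_yx_s, 4), ", b_xy =", round(b_xy_s, 4))
```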