Karl Pearson’s coefficient of correlation
Measurement of Correlation: Karl Pearson’s Method and Spearman Rank Correlation
Karl Pearson’s Method (Pearson Correlation):
- Definition: Pearson correlation coefficient (r) quantifies the degree and direction of a linear relationship between two variables, X and Y.
- Range and Interpretation: r always lies between −1 and +1.
  - r = +1: Perfect positive linear correlation
  - r = −1: Perfect negative linear correlation
  - r = 0: No linear correlation (a nonlinear relationship may still exist)
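- Properties:
  - Independent of change of origin and scale: replacing X with aX + b and Y with cY + d (a, c > 0) leaves r unchanged; if a and c have opposite signs, only the sign of r flips.
  - Geometric mean of the two regression coefficients: $r = \pm\sqrt{b_{yx}\, b_{xy}}$, taking the sign that both coefficients share.
  - Captures only linear association; a strong nonlinear relationship can still give r close to 0.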
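- Formula: $r = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$, where $\bar{x}$ and $\bar{y}$ are the sample means of X and Y and n is the number of paired observations.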
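- Assumptions:
  - Linearity of the relationship.
  - Approximately normal distribution of the variables (needed mainly for significance tests on r).
  - Independence of the observations: each (X, Y) pair is sampled independently of the others.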
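The definition and properties above can be checked numerically. The following is a minimal sketch in plain Python, assuming the standard product-moment form of r (sum of products of deviations divided by the square root of the product of the sums of squared deviations). The function names `pearson_r` and `regression_slopes` and the sample data are hypothetical, chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]  # deviations of X from its mean
    dy = [y - mean_y for y in ys]  # deviations of Y from its mean
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def regression_slopes(xs, ys):
    """Return (b_yx, b_xy): slope of Y on X and slope of X on Y."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / syy


# Made-up illustrative data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)  # sqrt(0.6), about 0.7746

# Change of origin and scale (positive scale factors) leaves r unchanged.
r_shifted = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])

# r is the geometric mean of the two regression coefficients,
# carrying the sign that both coefficients share.
b_yx, b_xy = regression_slopes(xs, ys)
r_from_slopes = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

print(r, r_shifted, r_from_slopes)
```

All three printed values should agree up to floating-point rounding, which illustrates both the origin-and-scale property and the geometric-mean property on this small sample.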
Spearman Rank Correlation:
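- Definition: Non-parametric measure of the strength and direction of a monotonic relationship between two variables, computed from the ranks of the observations; suitable for ordinal data.
- Formula: $\rho = 1 - \dfrac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$, where ρ is the Spearman rank correlation, $d_i$ is the difference between the two ranks of the i-th observation, and n is the number of observations. This form is exact when there are no tied ranks.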
- Interpretation:
  - ρ = +1: Perfect monotonic positive relationship.
  - ρ = −1: Perfect monotonic negative relationship.
  - ρ = 0: No monotonic association between the ranks.
- Assumptions:
  - Data measured on at least an ordinal scale, so that observations can be ranked.
  - Monotonic relationship between variables.
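As a worked illustration of the rank-difference calculation, here is a minimal sketch in plain Python. It assumes the usual no-ties formula ρ = 1 − 6Σd²/(n(n² − 1)); the helper names `simple_ranks` and `spearman_rho` and the data are hypothetical, and tied values are rejected rather than handled.

```python
def simple_ranks(values):
    """Rank values from 1 (smallest) to n. Assumes there are no ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied values need average ranks; not handled in this sketch")
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation via the rank-difference formula (no ties)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    rank_x = simple_ranks(xs)
    rank_y = simple_ranks(ys)
    # d_i is the difference between the two ranks of observation i.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Made-up illustrative data: ranks of ys are 1, 3, 2, 5, 4.
xs = [10, 20, 30, 40, 50]
ys = [1, 3, 2, 5, 4]

# Rank differences are 0, -1, 1, -1, 1, so the sum of d^2 is 4
# and rho = 1 - 6*4 / (5*24) = 0.8.
rho = spearman_rho(xs, ys)
print(rho)
```

Reversing one of the sequences flips every rank ordering, so `spearman_rho(xs, sorted(ys, reverse=True))` gives −1 for this data.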
Key Differences:
- Pearson measures linear relationships, while Spearman assesses monotonic relationships.
- Pearson requires interval or ratio data; Spearman can handle ordinal data.
- Pearson is sensitive to outliers; Spearman is less affected.
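These differences show up clearly on small examples. The sketch below, in plain Python, computes Spearman's coefficient as the Pearson correlation of the ranks (using average ranks for ties), which is a common equivalent formulation rather than anything specific to this text. The function names and both data sets are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5, 6]

# Monotonic but nonlinear: y = x^3.
cubes = [x ** 3 for x in xs]
print(pearson(xs, cubes), spearman(xs, cubes))  # Pearson roughly 0.94, Spearman 1

# Perfectly linear except for one extreme final value.
with_outlier = [2, 4, 6, 8, 10, 100]
print(pearson(xs, with_outlier), spearman(xs, with_outlier))  # Pearson roughly 0.71, Spearman 1
```

In both cases the ordering of the points is preserved, so the rank-based coefficient stays at 1 while Pearson's r drops because the points no longer lie on a straight line.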
Applications:
- Pearson: Used extensively in fields requiring analysis of linear associations, like economics and social sciences.
- Spearman: Appropriate when data are ranked or when assumptions of normality are violated.
Understanding these methods helps in choosing the right tool to analyze relationships between variables based on data characteristics and research objectives.