# Computing Coefficient of Correlation: Calculation Process

_December 10, 2023 by JoyAnswer.org, Category: Mathematics_

How do you calculate the coefficient of correlation? This article walks through the process of computing Pearson's correlation coefficient between two variables, step by step.

## How do you calculate coefficient of correlation?

The coefficient of correlation, often denoted as $r$, is a statistical measure that quantifies the strength and direction of a linear relationship between two variables. There are different methods to calculate the coefficient of correlation, but one of the most common is Pearson's correlation coefficient. Here's how you can calculate it:

### Pearson's Correlation Coefficient ($r$):

#### Step 1: Understand the Formula:

The formula for Pearson's correlation coefficient is:

$r = \frac{\sum{(X_i - \bar{X})(Y_i - \bar{Y})}}{\sqrt{\sum{(X_i - \bar{X})^2} \cdot \sum{(Y_i - \bar{Y})^2}}}$

Where:

- $X_i$ and $Y_i$ are the individual data points.
- $\bar{X}$ and $\bar{Y}$ are the means of the X and Y datasets, respectively.

#### Step 2: Calculate the Means:

Calculate the means $\bar{X}$ and $\bar{Y}$ of the X and Y datasets.

$\bar{X} = \frac{\sum{X_i}}{n}, \quad \bar{Y} = \frac{\sum{Y_i}}{n}$

Where $n$ is the number of data points.
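As a minimal sketch of this step in plain Python, the means can be computed as shown here. The sample values are the ones used in the NumPy example later in this article; the names `mean_x` and `mean_y` are illustrative choices, not part of any library.

```python
# Sample data taken from the NumPy example later in this article.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 3]

n = len(x)           # number of paired data points
mean_x = sum(x) / n  # X-bar
mean_y = sum(y) / n  # Y-bar

print(mean_x, mean_y)  # 3.0 3.6
```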

#### Step 3: Calculate the Numerator:

$\sum{(X_i - \bar{X})(Y_i - \bar{Y})}$

Subtract the mean of X from each data point in X, and the mean of Y from each data point in Y. Multiply these differences for each corresponding pair and sum them up.
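The numerator can be sketched in a few lines of plain Python. This is an illustration under the assumption that `x` and `y` are equal-length lists of numbers; the data is taken from the NumPy example later in this article, and the variable names are hypothetical.

```python
# Sample data taken from the NumPy example later in this article.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 3]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Multiply each pair of deviations from the mean, then add the products up.
numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

print(numerator)  # approximately 2.0 for this data
```

A positive numerator means the two variables tend to sit on the same side of their means together; a negative one means they tend to sit on opposite sides.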

#### Step 4: Calculate the Denominator:

$\sqrt{\sum{(X_i - \bar{X})^2} \cdot \sum{(Y_i - \bar{Y})^2}}$

Calculate the square of the differences between each data point and the mean for both X and Y. Sum these squared differences for each dataset and take the square root of the product of these sums.
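A possible plain-Python version of the denominator is sketched below, again using the sample data from the NumPy example later in this article. The names `sum_sq_x` and `sum_sq_y` are illustrative only.

```python
import math

# Sample data taken from the NumPy example later in this article.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 3]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Sum of squared deviations from the mean, computed separately for X and Y.
sum_sq_x = sum((xi - mean_x) ** 2 for xi in x)
sum_sq_y = sum((yi - mean_y) ** 2 for yi in y)

# Square root of the product of the two sums.
denominator = math.sqrt(sum_sq_x * sum_sq_y)

print(denominator)  # approximately 7.211 (the square root of 52)
```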

#### Step 5: Calculate the Coefficient of Correlation:

$r = \frac{\text{Numerator}}{\text{Denominator}}$

Divide the numerator by the denominator to get the coefficient of correlation ($r$).
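Steps 2 through 5 can be combined into one small function. The sketch below is one way to do it in plain Python; `pearson_r` is a hypothetical helper name, and the input checks are an added assumption about how invalid data should be handled, not something the formula itself prescribes.

```python
import math


def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of numbers (illustrative helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two data points are required")

    # Step 2: means
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Step 3: numerator (sum of products of deviations)
    numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Step 4: denominator (square root of the product of squared-deviation sums)
    sum_sq_x = sum((xi - mean_x) ** 2 for xi in x)
    sum_sq_y = sum((yi - mean_y) ** 2 for yi in y)
    denominator = math.sqrt(sum_sq_x * sum_sq_y)

    # r is undefined when either variable never varies.
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")

    # Step 5: divide
    return numerator / denominator


r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 3])
print(round(r, 4))  # 0.2774
```

The zero-denominator check matters in practice: if every value of X (or Y) is identical, all deviations are zero and the division cannot be carried out.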

### Interpretation of $r$:

- $r = 1$: Perfect positive correlation
- $r = -1$: Perfect negative correlation
- $r = 0$: No linear correlation (a non-linear relationship may still exist)
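To make the two extreme cases concrete, the sketch below computes $r$ for data lying exactly on a rising line and exactly on a falling line. The helper `pearson_r` and the two example lines ($y = 2x + 1$ and $y = 10 - 3x$) are illustrative assumptions, not taken from the article.

```python
import math


def pearson_r(x, y):
    """Compact illustrative implementation of Pearson's r."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    denominator = math.sqrt(
        sum((xi - mean_x) ** 2 for xi in x) * sum((yi - mean_y) ** 2 for yi in y)
    )
    return numerator / denominator


x = [1, 2, 3, 4, 5]
y_rising = [2 * xi + 1 for xi in x]    # points on an increasing straight line
y_falling = [10 - 3 * xi for xi in x]  # points on a decreasing straight line

r_rising = pearson_r(x, y_rising)
r_falling = pearson_r(x, y_falling)

print(round(r_rising, 6), round(r_falling, 6))  # 1.0 -1.0
```

Note that the slope of the line does not affect $r$; only the direction and how tightly the points follow a straight line do.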

### Note:

- Pearson's correlation coefficient assumes a linear relationship. It may not accurately represent non-linear relationships.
- This method is sensitive to outliers.
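Both caveats can be illustrated with small made-up datasets. In the sketch below, the helper `pearson_r` and all data values are hypothetical and chosen only to make the effects visible: a perfect quadratic relationship yields $r = 0$, and a single extreme value flips the sign of an otherwise perfect positive correlation.

```python
import math


def pearson_r(x, y):
    """Compact illustrative implementation of Pearson's r."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    denominator = math.sqrt(
        sum((xi - mean_x) ** 2 for xi in x) * sum((yi - mean_y) ** 2 for yi in y)
    )
    return numerator / denominator


# Non-linear case: y is fully determined by x (y = x^2), yet r is zero.
x_sym = [-2, -1, 0, 1, 2]
y_curve = [xi ** 2 for xi in x_sym]
r_curve = pearson_r(x_sym, y_curve)
print(r_curve)  # 0.0

# Outlier case: start from a perfect line, then corrupt one value.
x = [1, 2, 3, 4, 5]
y_clean = [1, 2, 3, 4, 5]
y_outlier = [1, 2, 3, 4, -20]  # last point replaced by an extreme value
r_clean = pearson_r(x, y_clean)
r_outlier = pearson_r(x, y_outlier)
print(round(r_clean, 4), round(r_outlier, 4))  # 1.0 -0.6247
```

Plotting the data before relying on $r$ is a common way to catch both situations.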

Computing the coefficient of correlation can be done using statistical software, spreadsheets, or programming languages like Python, R, or MATLAB, where built-in functions or libraries are available for such calculations.

## What is the formula for computing the coefficient of correlation?

The formula for computing the Pearson correlation coefficient between two arrays `x` and `y` is:

```
correlation = covariance / (std_dev_x * std_dev_y)
```

where:

`covariance` is the sum of the products of the centered data points, divided by the number of data points:

```
covariance = np.sum(centered_x * centered_y) / len(x)
```

`centered_x` is the array `x` with the mean subtracted from each element:

```
centered_x = x - np.mean(x)
```

`centered_y` is the array `y` with the mean subtracted from each element:

```
centered_y = y - np.mean(y)
```

`std_dev_x` is the standard deviation of the array `x`:

```
std_dev_x = np.std(x)
```

`std_dev_y` is the standard deviation of the array `y`:

```
std_dev_y = np.std(y)
```
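Putting the definitions above together gives a short runnable script. This is a sketch that assumes NumPy is installed; the sample arrays are the same ones used in the `np.corrcoef` example in this article, and the final line checks the manual result against NumPy's built-in function.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 3])

# Center each array by subtracting its mean.
centered_x = x - np.mean(x)
centered_y = y - np.mean(y)

# Population covariance: divide by n, matching np.std's default (ddof=0).
covariance = np.sum(centered_x * centered_y) / len(x)

std_dev_x = np.std(x)
std_dev_y = np.std(y)

correlation = covariance / (std_dev_x * std_dev_y)

# The manual result should agree with NumPy's built-in function.
print(np.isclose(correlation, np.corrcoef(x, y)[0, 1]))  # True
```

The covariance and the standard deviations must use the same divisor. Here both divide by `n`, so the factors cancel and the result matches the sum-based formula. If the covariance were divided by `n - 1` instead, `np.std` would need `ddof=1` to stay consistent.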

Here is an example of how to calculate the correlation coefficient between two NumPy arrays:

```
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 3])

# np.corrcoef returns the 2x2 correlation matrix; entry [0, 1] is r between x and y.
correlation = np.corrcoef(x, y)[0, 1]
print(correlation)
```

This code outputs the following:

```
0.2773500981126146
```

The correlation coefficient is a measure of the linear relationship between two variables. It is a number between -1 and 1, where:

- -1 indicates a perfect negative correlation.
- 0 indicates no linear correlation.
- 1 indicates a perfect positive correlation.