Suppose we want to investigate whether there is a relationship between the marriage rate and the divorce rate in different regions. We can use the Pearson correlation coefficient to test for a linear relationship between these two variables. Here is how we set up the correlation test and perform the analysis.
Step 1: State the Hypotheses
The null hypothesis $H_0$ and the alternative hypothesis $H_1$ can be stated as follows:
$$H_0: \rho = 0 \qquad H_1: \rho \neq 0$$
where $\rho$ is the population correlation coefficient.
Step 2: Collect Data
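Suppose we have the following sample data for the marriage rate and divorce rate (per 1,000 people) in 10 different regions (the same values are used in the code at the end of this post):

Region   Marriage rate   Divorce rate
1        8.2             4.3
2        7.8             4.1
3        9.0             4.8
4        6.9             3.7
5        7.4             3.9
6        8.6             4.5
7        7.1             3.8
8        8.0             4.2
9        7.3             3.9
10       8.4             4.4
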
Step 3: Calculate the Correlation Coefficient
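The Pearson correlation coefficient $r$ is calculated using the formula:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where $x_i$ represents the marriage rate and $y_i$ represents the divorce rate in region $i$, and $\bar{x}$ and $\bar{y}$ are the corresponding sample means.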
After calculating with the ten data pairs used in the code at the end of this post, we find that $r \approx 0.9948$.
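The calculation in this step can be sketched directly from the definition of the coefficient, without a statistics library. This is only an illustration: the helper name `pearson_r` is hypothetical, and the data are the ten pairs from the code section of this post.

```python
import math

# The ten data pairs used in the code section of this post
marriage_rate = [8.2, 7.8, 9.0, 6.9, 7.4, 8.6, 7.1, 8.0, 7.3, 8.4]
divorce_rate = [4.3, 4.1, 4.8, 3.7, 3.9, 4.5, 3.8, 4.2, 3.9, 4.4]


def pearson_r(x, y):
    """Pearson r from the deviation-product definition (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(marriage_rate, divorce_rate)
print(f"r = {r:.4f}")  # r = 0.9948
```

Writing the three sums out separately makes it easy to check each against a hand calculation before combining them.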
Step 4: Determine the Degrees of Freedom
The degrees of freedom for the correlation test are $df = n - 2 = 10 - 2 = 8$.
Step 5: Compare the Test Statistic to the Critical Value
Using a t-table or calculator, find the critical t-value for a two-tailed test with $\alpha = 0.05$ and $df = 8$. The critical t-value is approximately $\pm 2.306$.
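The table lookup can also be done numerically. The following is a minimal sketch, assuming SciPy is installed and taking $\alpha = 0.05$ as the significance level (an assumption, since any other level can be substituted):

```python
from scipy import stats

alpha = 0.05   # assumed significance level
n = 10         # number of regions in the sample
df = n - 2     # degrees of freedom for the correlation test

# Two-tailed test: put alpha/2 in each tail of the t distribution
t_critical = stats.t.ppf(1 - alpha / 2, df)
print(f"critical t = +/-{t_critical:.3f}")
```

For a one-tailed test the call would use `1 - alpha` instead of `1 - alpha / 2`.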
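We convert the correlation coefficient to a t-statistic. With $r \approx 0.9948$ and $n = 10$:
$$t = r \sqrt{\frac{n - 2}{1 - r^2}} \approx 27.6$$
Since $t \approx 27.6$ is greater than the critical t-value, we reject the null hypothesis.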
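The conversion and the decision rule can be written as a short sketch. The helper name `t_from_r` is hypothetical. The value of `r` is the rounded correlation for the ten data pairs in the code section of this post, and the critical value is the t-table entry for a two-tailed test with $df = 8$ at an assumed $\alpha = 0.05$.

```python
import math


def t_from_r(r, n):
    """Convert a sample correlation r from n pairs to a t-statistic (hypothetical helper)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r = 0.9948          # rounded sample correlation for the ten data pairs
n = 10              # number of regions
t_critical = 2.306  # t-table value: two-tailed, df = 8, alpha = 0.05 (assumed)

t_stat = t_from_r(r, n)
# Two-tailed rule: reject H0 when |t| exceeds the critical value
reject_null = abs(t_stat) > t_critical
print(f"t = {t_stat:.2f}, reject H0: {reject_null}")
```

Because $1 - r^2$ is very small when $r$ is close to 1, the t-statistic is sensitive to how $r$ is rounded, so it is safer to carry several decimal places of $r$ into this step.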
Conclusion
At the $\alpha = 0.05$ significance level, we reject the null hypothesis of no linear relationship. We have enough evidence to conclude that there is a statistically significant linear relationship between the marriage rate and the divorce rate across the regions.
Code
Python
import numpy as np
from scipy import stats
# Data
marriage_rate = np.array([8.2, 7.8, 9.0, 6.9, 7.4, 8.6, 7.1, 8.0, 7.3, 8.4])
divorce_rate = np.array([4.3, 4.1, 4.8, 3.7, 3.9, 4.5, 3.8, 4.2, 3.9, 4.4])
# Calculate Pearson correlation coefficient and p-value
correlation_coefficient, p_value = stats.pearsonr(marriage_rate, divorce_rate)
# Calculate degrees of freedom
n = len(marriage_rate)
degrees_of_freedom = n - 2
# Calculate t-statistic
t_statistic = correlation_coefficient * np.sqrt(degrees_of_freedom / (1 - correlation_coefficient**2))
# Print results
print(f"Pearson correlation coefficient: {correlation_coefficient}")
print(f"p-value: {p_value}")
print(f"t-statistic: {t_statistic}")
print(f"Degrees of freedom: {degrees_of_freedom}")
R
# Data
marriage_rate <- c(8.2, 7.8, 9.0, 6.9, 7.4, 8.6, 7.1, 8.0, 7.3, 8.4)
divorce_rate <- c(4.3, 4.1, 4.8, 3.7, 3.9, 4.5, 3.8, 4.2, 3.9, 4.4)
# Calculate Pearson correlation coefficient and p-value
correlation_test <- cor.test(marriage_rate, divorce_rate)
# Calculate degrees of freedom
n <- length(marriage_rate)
degrees_of_freedom <- n - 2
# Calculate t-statistic
t_statistic <- correlation_test$estimate * sqrt(degrees_of_freedom / (1 - correlation_test$estimate^2))
# Print results
cat("Pearson correlation coefficient:", correlation_test$estimate, "\n")
cat("p-value:", correlation_test$p.value, "\n")
cat("t-statistic:", t_statistic, "\n")
cat("Degrees of freedom:", degrees_of_freedom, "\n")