Yes. It is free with no signup, no rate limit, and no paywall on advanced features.

Does it store my inputs?

No. Every calculation runs in your browser using JavaScript. The numbers and text you type never leave your device; we do not log or transmit form values.

Where do the formulas come from?

Formulas come from the canonical reference for the domain (the relevant tax authority, standards body, or peer-reviewed source). See /methodology for the full sourcing process and per-page citations.

Pearson Correlation Calculator (r)

Measure the linear relationship between two paired numeric variables.

Paste your X and Y data below. Values can be separated by spaces, commas, tabs, or new lines. Pairs match by index.
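The input rules above can be sketched as follows. This is an illustrative Python sketch, not the page's own JavaScript; the function names `parse_values` and `pair_by_index` are hypothetical, and raising an error on unequal lengths is an assumption about how mismatched input should be handled.

```python
import re

def parse_values(text):
    """Split on any run of commas and/or whitespace (spaces, tabs, new lines)."""
    tokens = [t for t in re.split(r"[,\s]+", text.strip()) if t]
    return [float(t) for t in tokens]

def pair_by_index(x_text, y_text):
    """Match the i-th X value with the i-th Y value."""
    xs, ys = parse_values(x_text), parse_values(y_text)
    if len(xs) != len(ys):
        # Assumption: unequal lengths are rejected rather than truncated.
        raise ValueError(f"X has {len(xs)} values but Y has {len(ys)}")
    return list(zip(xs, ys))

# Mixed separators in X, plain spaces in Y.
pairs = pair_by_index("2, 4\t6\n8 10", "58 65 72 78 87")
print(pairs)
```

Rejecting unequal lengths keeps a stray extra value from silently shifting every later pair.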

Outputs: Pearson r, r squared, sample size (n), mean X, mean Y, strength.

How is this calculated?

Formula: r = sum((xi - mean_x)(yi - mean_y)) / sqrt(sum((xi - mean_x)^2) * sum((yi - mean_y)^2)). The result ranges from -1 (perfect negative) to +1 (perfect positive). r squared is the share of Y variance explained by X under a linear fit. Source: Pearson 1895, standard statistics textbooks.

About

The Pearson correlation coefficient r is a standardised measure of the linear relationship between two paired numeric variables. It ranges from -1 (perfect inverse) to +1 (perfect direct), with 0 indicating no linear relationship. The tool computes r, r squared, and basic descriptive statistics from any two equal-length arrays.

How it works

r is the covariance of two variables divided by the product of their standard deviations. Geometrically, it is the cosine of the angle between the mean-centred X and Y vectors. Algebraically:

r = cov(X, Y) / (sigma_X * sigma_Y)

  = sum_i [(x_i - mean(X)) * (y_i - mean(Y))]
    -----------------------------------------------------
    sqrt( sum_i (x_i - mean(X))^2 * sum_i (y_i - mean(Y))^2 )

r squared = proportion of Y variance explained by linear X

Both X and Y must be on interval or ratio scales (numeric, with equal spacing between units). The formula is symmetric: r(X,Y) = r(Y,X). It assumes the relationship is linear; a perfect quadratic curve that is symmetric about mean(X) produces r = 0.
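A minimal Python sketch of the formula above, written from the sums-of-deviations form. The name `pearson_r` and the choice to raise an error for constant input are assumptions made for illustration; the page's own implementation may differ.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from sums of mean-centred products and squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))  # cross-product sum
    sxx = sum(a * a for a in dx)              # sum of squares for X
    syy = sum(b * b for b in dy)              # sum of squares for Y
    if sxx == 0 or syy == 0:
        # Assumption: a constant variable makes r undefined, so we refuse.
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Symmetry: swapping the arguments gives the same value.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 3.0]
r_xy = pearson_r(xs, ys)
r_yx = pearson_r(ys, xs)

# A perfect quadratic, symmetric about mean(X), has no linear component.
qx = [-2.0, -1.0, 0.0, 1.0, 2.0]
qy = [x * x for x in qx]
r_quad = pearson_r(qx, qy)

print(r_xy, r_yx, r_quad)
```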

Worked example

A teacher records hours of weekly study and final exam scores for 5 students: study X = [2, 4, 6, 8, 10]; scores Y = [58, 65, 72, 78, 87].

Means: mean(X) = 6, mean(Y) = 72.
Deviations: X - mean = [-4, -2, 0, 2, 4]; Y - mean = [-14, -7, 0, 6, 15].
Cross-product sum: 56 + 14 + 0 + 12 + 60 = 142.
Sum of squares X: 16 + 4 + 0 + 4 + 16 = 40. Sum of squares Y: 196 + 49 + 0 + 36 + 225 = 506.
Apply formula: r = 142 / sqrt(40 x 506) = 142 / sqrt(20,240) = 142 / 142.27 = 0.9981.
r squared: 0.9962, so 99.6 percent of score variance is linearly explained by study hours.

Result: r = 0.998, a near-perfect positive linear relationship. With n = 5, this gives t = 28.2 on 3 degrees of freedom, which is statistically significant at p < 0.001 (two-tailed). But correlation alone does not prove that studying causes higher scores, only that the two move together.
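The arithmetic in this example can be checked with a short Python sketch. The significance check uses the usual t statistic for a correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), compared against the two-tailed 0.001 critical value for 3 degrees of freedom from standard t tables (12.924); the variable names are illustrative.

```python
import math

x = [2, 4, 6, 8, 10]        # weekly study hours
y = [58, 65, 72, 78, 87]    # exam scores
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # cross-product sum
sxx = sum((a - mean_x) ** 2 for a in x)                       # sum of squares X
syy = sum((b - mean_y) ** 2 for b in y)                       # sum of squares Y

r = sxy / math.sqrt(sxx * syy)
r_squared = r * r

# t statistic for H0: no linear correlation, with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)
T_CRIT_001_DF3 = 12.924  # two-tailed 0.001 critical value, df = 3

print(f"means: {mean_x}, {mean_y}")
print(f"sxy={sxy}, sxx={sxx}, syy={syy}")
print(f"r={r:.4f}, r^2={r_squared:.4f}, t={t_stat:.1f}")
print("significant at 0.001:", t_stat > T_CRIT_001_DF3)
```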

Reference table

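Common interpretation thresholds for the absolute value |r|. The small (0.10), medium (0.30), and large (0.50) cut-offs follow Cohen (1988); the remaining bands are common extensions of that scheme:

|r|	Verbal label	r squared (variance explained)	Field example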
0.00-0.09	Negligible	< 1 percent	Coin flips vs weather
0.10-0.29	Small / weak	1 to 8 percent	Personality trait predicting outcomes
0.30-0.49	Medium / moderate	9 to 24 percent	Education and income
0.50-0.69	Large / strong	25 to 48 percent	Height of parents vs children
0.70-0.89	Very strong	49 to 79 percent	SAT verbal vs SAT math
0.90-0.99	Near perfect	81 to 98 percent	Twin IQs (monozygotic)
1.00	Perfect linear	100 percent	F = C x 9/5 + 32
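The table can be turned into a lookup with a few lines of Python. This is a sketch under two assumptions: each band's lower bound is treated as an inclusive cut-off (so values such as 0.095 fall in the band below), and the function name `strength_label` is hypothetical rather than taken from the tool.

```python
# (lower bound of |r|, label), checked from strongest to weakest.
BANDS = [
    (0.90, "Near perfect"),
    (0.70, "Very strong"),
    (0.50, "Large / strong"),
    (0.30, "Medium / moderate"),
    (0.10, "Small / weak"),
]

def strength_label(r):
    """Map a correlation to the verbal label from the reference table."""
    a = abs(r)  # direction is ignored; only magnitude sets the label
    if a > 1.0:
        raise ValueError("r must lie between -1 and +1")
    if a == 1.0:
        return "Perfect linear"
    for lower, label in BANDS:
        if a >= lower:
            return label
    return "Negligible"

for value in (0.998, -0.35, 0.05, -1.0):
    print(value, strength_label(value))
```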

Common pitfalls

Confusing correlation with causation. Two variables can correlate strongly because both depend on a third (the "lurking variable"). Ice-cream sales and drownings correlate via summer temperature.
Missing non-linear patterns. r near 0 only rules out a linear pattern. A perfect U-shape (y = x^2 centred on zero) gives r = 0.
Outliers. One extreme point can drag r from 0 to 0.9 or vice versa. Plot the scatter first; consider Spearman or robust alternatives if outliers are real.
Aggregating to ecological correlations. Group-level r is often much larger than individual-level r (the ecological fallacy, a close relative of Simpson's paradox). Always check the unit of analysis.
Statistical significance vs effect size. With n = 10,000, r = 0.02 is "statistically significant" but explains 0.04 percent of variance. Report both r and the confidence interval.
Truncated range. Selecting only top scorers compresses Y's variance and shrinks r toward 0 (range restriction).
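Two of these pitfalls are easy to reproduce numerically. The Python sketch below uses made-up data to show a single outlier lifting r from 0 to above 0.9, then computes an approximate 95 percent confidence interval for r = 0.02 with n = 10,000. The interval uses the Fisher z-transformation, a common approximation that is assumed here rather than taken from the page; the function names are illustrative.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Outliers: four points on a square have r = 0; one far point changes that.
square_x = [1, 1, 3, 3]
square_y = [1, 3, 1, 3]
r_clean = pearson_r(square_x, square_y)
r_outlier = pearson_r(square_x + [20], square_y + [20])

def fisher_ci(r, n, z_crit=1.96):
    """Approximate CI for r via the Fisher z-transformation (95% by default)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Significance vs effect size: the interval excludes 0, yet r^2 is tiny.
lo, hi = fisher_ci(0.02, 10_000)
variance_explained_pct = 0.02 ** 2 * 100

print(f"r without outlier: {r_clean:.3f}, with outlier: {r_outlier:.3f}")
print(f"95% CI for r=0.02, n=10000: ({lo:.4f}, {hi:.4f})")
print(f"variance explained: {variance_explained_pct:.2f} percent")
```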

Related tools and glossary

Chi-Square Calculator, Standard Deviation, Linear Regression, Mean / Median / Mode

Frequently asked questions

What does the Pearson r value mean?

Pearson r ranges from -1 to +1 and measures the strength and direction of a linear relationship between two variables. r = +1 is a perfect positive linear fit, 0 is no linear relationship, and -1 is a perfect negative fit. Cohen's 1988 conventions classify |r| as small (0.10), medium (0.30), and large (0.50), but the practical interpretation depends on the field.

What is the difference between r and r squared?

r is the correlation coefficient. r squared (the coefficient of determination) is the share of variance in Y explained by a linear fit on X. r = 0.7 implies r squared = 0.49, so 49 percent of Y variance is linearly explained by X, leaving 51 percent unexplained or noise.
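The link between r squared and "variance explained" can be checked directly: for a simple least-squares line, 1 - SSE/SST equals r squared. The Python sketch below reuses the study-hours data from the worked example; the variable names are illustrative.

```python
import math

x = [2, 4, 6, 8, 10]
y = [58, 65, 72, 78, 87]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)  # total sum of squares (SST)

r = sxy / math.sqrt(sxx * syy)

# Ordinary least-squares line y = intercept + slope * x.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual sum of squares (SSE): the part of Y the line does not explain.
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
explained_share = 1 - sse / syy

print(f"r^2 = {r * r:.6f}, 1 - SSE/SST = {explained_share:.6f}")
```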

Does a high r mean X causes Y?

No. Correlation is symmetric (r(X,Y) = r(Y,X)) and reflects only co-variation. Causal claims require a research design that rules out confounders, reverse causation, and selection (e.g. randomized experiment, instrumental variable, or natural experiment). Ice-cream sales and drowning rates correlate strongly through summer temperature.

When should I use Spearman or Kendall instead of Pearson?

Use Spearman's rho or Kendall's tau when the relationship is monotonic but not linear, when data are ordinal, or when outliers dominate Pearson r. The usual significance tests for Pearson r assume bivariate normality and constant variance; heavy-tailed financial data violate both, and rank-based correlations give more stable estimates there.
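Spearman's rho can be computed as Pearson r applied to ranks, which makes the difference easy to see. The Python sketch below uses a made-up monotonic but non-linear data set (y = x^3); the helper names are hypothetical, and ties receive average ranks.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho as Pearson r on the ranked data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic, but curved

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.4f}, Spearman rho = {rho:.4f}")
```

Because every increase in x comes with an increase in y, the ranks agree perfectly and rho is 1, while Pearson r falls short of 1 because the points do not lie on a straight line.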

Sources

Pearson K. (1895), Note on regression and inheritance in the case of two parents, Proceedings of the Royal Society of London, 58, 240-242.
Cohen J. (1988), Statistical Power Analysis for the Behavioral Sciences, 2nd ed., Lawrence Erlbaum (effect-size thresholds).
Wasserman L. (2004), All of Statistics, Springer (chapter 14: linear regression and correlation).
NIST/SEMATECH e-Handbook of Statistical Methods, section 7.2.6 (correlation interpretation).

Last updated 2026-05-28.