Descriptive Statistics: Central Tendency and Dispersion

Published: May 9, 2026

Article by Ryan O'Connell, CFA, FRM

Descriptive statistics are the foundation of quantitative analysis in finance. Whether you’re evaluating a stock’s historical returns, comparing two mutual funds, or assessing portfolio risk, you need these tools to summarize raw data into actionable insights. This guide covers everything you need to know — measures of central tendency, dispersion metrics, when to use each, and where they can mislead you.

What Are Descriptive Statistics?

Descriptive statistics are numerical measures that summarize and describe the main features of a dataset. In finance, they transform thousands of daily or monthly returns into a handful of numbers that capture a distribution’s center (central tendency) and spread (dispersion).

Key Concept

Descriptive statistics answer “what happened?” — they summarize historical data. This is distinct from inferential statistics, which use sample data to make predictions or test hypotheses about a larger population.

Financial analysts rely on descriptive statistics for several core tasks: comparing the average returns of different asset classes, measuring how much a portfolio’s returns vary from period to period, benchmarking a fund’s performance against an index, and identifying outliers that may signal data errors or unusual market events.

The two main categories are measures of central tendency (mean, median, mode, geometric mean) and measures of dispersion (range, mean absolute deviation, variance, standard deviation, coefficient of variation). Understanding when to use each is essential for accurate financial analysis.

Measures of Central Tendency

Central tendency measures describe where the “center” of a distribution lies. Different measures are appropriate for different situations.

Arithmetic Mean

The arithmetic mean is the sum of all values divided by the number of values — the familiar average. It’s the most commonly used measure of central tendency in finance, particularly for single-period expected returns.

Arithmetic Mean

x̄ = Σx_i / n

Sum of all observations divided by the number of observations

The arithmetic mean is easy to calculate and uses all available data. However, it’s sensitive to extreme values — a single outlier can pull the mean significantly away from where most observations cluster.

Median

The median is the middle value when observations are sorted from lowest to highest. For an even number of observations, it’s the average of the two middle values. The median is more robust to outliers than the arithmetic mean.

In finance, the median is particularly useful for skewed distributions. For example, when analyzing household wealth or hedge fund returns, where a few extreme values can distort the arithmetic mean, the median often provides a more representative picture of the typical observation.
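To see this outlier sensitivity in numbers, here is a minimal Python sketch using the standard-library statistics module. The return figures are invented for illustration and do not come from any real fund.

```python
import statistics

# Hypothetical monthly returns in percent (illustrative values only)
typical = [1.0, 2.0, 2.5, 3.0, 1.5]
with_outlier = typical + [40.0]  # the same series plus one extreme month

mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(typical)
median_after = statistics.median(with_outlier)

# The mean moves a lot; the median barely changes
print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before:.2f} -> {median_after:.2f}")
```

A single extreme observation moves the mean by several percentage points, while the median shifts only slightly.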

Mode

The mode is the most frequently occurring value in a dataset. It’s most useful for discrete data, such as credit ratings, where you might want to know the most common rating in a bond portfolio. For continuous return data, the mode is rarely used because values seldom repeat exactly.
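As a rough illustration, the sketch below finds the mode of a hypothetical list of bond ratings and shows why the mode says little about continuous returns. Both lists are made-up sample data.

```python
import statistics

# Hypothetical credit ratings in a bond portfolio (discrete data)
ratings = ["AA", "A", "BBB", "A", "BBB", "A", "BB"]
most_common_rating = statistics.mode(ratings)

# Hypothetical decimal returns (continuous data): no value repeats,
# so every observation ties for "most frequent"
returns = [0.0412, -0.0198, 0.0603, 0.0101]
modes_of_returns = statistics.multimode(returns)

print(most_common_rating)
print(modes_of_returns)
```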

Geometric Mean

The geometric mean measures the average compound growth rate over multiple periods. It’s essential for multi-period investment returns because it accounts for the compounding effect that the arithmetic mean ignores.

Geometric Mean Return

G = [(1 + R₁) × (1 + R₂) × … × (1 + Rₙ)]^(1/n) – 1

The nth root of the product of gross returns, minus one

A key property: the geometric mean is always less than or equal to the arithmetic mean, with equality only when all values are identical. The greater the variability in returns, the larger this gap becomes.
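The formula and the inequality above can be checked with a short Python sketch. The function name geometric_mean_return and the three-period return series are assumptions made for this example.

```python
import math

def geometric_mean_return(returns):
    """Average compound return per period, given decimal returns (0.05 = 5%)."""
    if not returns:
        raise ValueError("need at least one return")
    gross = [1.0 + r for r in returns]
    if any(g < 0 for g in gross):
        raise ValueError("gross returns must be non-negative")
    # nth root of the product of gross returns, minus one
    return math.prod(gross) ** (1.0 / len(gross)) - 1.0

# Hypothetical three-period return series
returns = [0.10, -0.05, 0.20]
arithmetic = sum(returns) / len(returns)
geometric = geometric_mean_return(returns)
print(f"arithmetic: {arithmetic:.4f}, geometric: {geometric:.4f}")
```

With identical returns in every period the two means coincide, and a single period of total loss (a return of -1.0) drives the geometric mean to -100%.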

Measure	Best Used For	Limitation
Arithmetic Mean	Single-period expected returns, cross-sectional comparisons	Sensitive to outliers; overstates multi-period growth
Median	Skewed distributions, presence of outliers	Ignores distribution shape; less efficient statistically
Mode	Discrete data (ratings, categories)	Often undefined for continuous data
Geometric Mean	Multi-period compound returns	Collapses to -100% if any single period is a total loss

Measures of Dispersion

While central tendency tells you where returns cluster, dispersion measures tell you how spread out they are. In finance, dispersion is directly related to risk — wider dispersion means greater uncertainty.

Range

The range is the simplest dispersion measure: the difference between the maximum and minimum values. While easy to calculate, it uses only two data points and is extremely sensitive to outliers.

Mean Absolute Deviation (MAD)

The mean absolute deviation is the average of the absolute differences between each observation and the mean. It provides a more intuitive measure of “typical” deviation than variance because it’s in the same units as the original data.

Mean Absolute Deviation

MAD = Σ|x_i – x̄| / n

Average of the absolute deviations from the mean

Note: The abbreviation MAD is also used for median absolute deviation (deviations from the median), which is more resistant to outliers. This article uses the mean-based version, which is more common in introductory finance applications.
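The following sketch computes the range and both versions of MAD on a small made-up dataset with one outlier. The function names are illustrative, not part of any library.

```python
import statistics

def mean_absolute_deviation(values):
    """Average absolute distance from the mean (the version used in this article)."""
    m = statistics.fmean(values)
    return statistics.fmean([abs(x - m) for x in values])

def median_absolute_deviation(values):
    """Median absolute distance from the median (the outlier-resistant variant)."""
    med = statistics.median(values)
    return statistics.median([abs(x - med) for x in values])

# Hypothetical returns in percent, with one extreme value
data = [1.0, 2.0, 3.0, 4.0, 40.0]

value_range = max(data) - min(data)
mean_mad = mean_absolute_deviation(data)
median_mad = median_absolute_deviation(data)
print(value_range, mean_mad, median_mad)
```

On this data the outlier inflates both the range and the mean-based MAD, while the median-based version stays small.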

Variance

Variance is the average of squared deviations from the mean. Squaring gives more weight to larger deviations and ensures all deviations are positive. The trade-off is that variance is expressed in squared units, making direct interpretation difficult.

Sample Variance

s² = Σ(x_i – x̄)² / (n – 1)

Sum of squared deviations divided by (n – 1) for sample data

Standard Deviation

Standard deviation is the square root of variance, bringing the measure back to the original units (percentage points for returns). It’s the most widely used dispersion measure in finance. For a detailed treatment of standard deviation as a risk metric — including annualization and portfolio applications — see our guide on standard deviation in finance.
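As a sanity check on the two formulas, this sketch computes sample variance and standard deviation by hand and compares them with Python's statistics module. The four return values are hypothetical.

```python
import math
import statistics

# Hypothetical periodic returns in percent
returns_pct = [1.0, 3.0, 5.0, 7.0]

n = len(returns_pct)
mean = sum(returns_pct) / n
squared_devs = [(x - mean) ** 2 for x in returns_pct]

# Sample variance divides by n - 1; units are percentage points squared
sample_variance = sum(squared_devs) / (n - 1)
# Standard deviation returns to the original units (percentage points)
sample_std = math.sqrt(sample_variance)

# statistics.variance and statistics.stdev also use the n - 1 divisor
print(sample_variance, statistics.variance(returns_pct))
print(sample_std, statistics.stdev(returns_pct))
```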

Coefficient of Variation

The coefficient of variation (CV) is the ratio of standard deviation to the mean, expressed as a percentage or decimal. It measures relative dispersion, allowing comparison across datasets with different scales or units.

Coefficient of Variation

CV = s / x̄

Standard deviation divided by the mean

CV Caution

The coefficient of variation is unreliable when the mean is close to zero or negative, and when comparing returns measured over different time horizons. Use it only when the mean is meaningfully positive and the data are measured consistently.
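One way to build the caution into code is to refuse to compute CV unless the mean is positive. The sketch below does that. The function name, the min_mean threshold, and both return series are assumptions made for illustration.

```python
import statistics

def coefficient_of_variation(returns, min_mean=0.0):
    """Sample standard deviation divided by the mean.

    Raises ValueError when the mean is not above min_mean, because the
    ratio is not meaningful for zero or negative means.
    """
    mean = statistics.fmean(returns)
    if mean <= min_mean:
        raise ValueError("CV is not meaningful for a non-positive mean")
    return statistics.stdev(returns) / mean

# Hypothetical monthly returns in percent for two funds
bond_fund = [0.4, 0.6, 0.5, 0.3, 0.7]
equity_fund = [4.0, -2.0, 6.0, 1.0, 3.0]

cv_bond = coefficient_of_variation(bond_fund)
cv_equity = coefficient_of_variation(equity_fund)
print(f"bond CV: {cv_bond:.2f}, equity CV: {cv_equity:.2f}")
```

In practice a threshold somewhat above zero may be more appropriate, since a tiny positive mean still produces an unstable ratio.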

Sample vs Population Statistics

In statistics, a population includes every possible observation, while a sample is a subset drawn from the population. In finance, you almost always work with samples — you have historical returns for some period, not every possible return the asset could generate.

The distinction affects how you calculate variance and standard deviation:

Population Statistics

Divide by N (total observations)
Symbols: μ (mean), σ (standard deviation)
Used when you have the entire dataset
Rare in finance — requires complete information

Sample Statistics

Variance/std dev divide by n – 1 (Bessel’s correction)
Symbols: x̄ (mean), s (standard deviation)
Used when estimating from historical data
Standard practice in financial analysis

Bessel’s correction (dividing by n – 1 instead of n) adjusts for the fact that deviations measured from the sample mean, rather than from the unknown population mean, tend to understate the true population variance. Using n – 1 provides an unbiased estimate of the population variance.
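The unbiasedness claim can be verified exactly on a toy case. This sketch enumerates every sample of size two drawn with replacement from a three-value population and averages both estimators. The tiny population is an assumption chosen so the enumeration stays small.

```python
import itertools
import statistics

population = [1, 2, 3]
population_variance = statistics.pvariance(population)  # divides by N

# All 9 equally likely ordered samples of size 2, drawn with replacement
samples = list(itertools.product(population, repeat=2))

# Average of the n - 1 estimator across every possible sample
avg_with_correction = statistics.fmean(
    [statistics.variance(s) for s in samples]
)
# Average of the divide-by-n estimator across every possible sample
avg_without_correction = statistics.fmean(
    [statistics.pvariance(s) for s in samples]
)

print(population_variance, avg_with_correction, avg_without_correction)
```

The n – 1 estimator averages out to the population variance, while the divide-by-n estimator averages to half of it for samples of size two.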

Quantiles and Percentiles

Quantiles divide an ordered dataset into equal parts. The most common are:

Quartiles: Q1 (25th percentile), Q2 (median, 50th percentile), Q3 (75th percentile)
Deciles: Divide data into 10 equal parts
Percentiles: Divide data into 100 equal parts

In finance, percentiles are used to benchmark performance (a fund in the 90th percentile outperformed 90% of peers), analyze return distribution tails (the 5th percentile for downside risk), and construct quantile-based portfolios for factor investing.

Pro Tip

Different software packages use different interpolation methods for percentiles, which can produce slightly different results, especially with small samples. Excel’s PERCENTILE.INC and PERCENTILE.EXC functions use different conventions, and Python’s numpy defaults to linear interpolation (which matches PERCENTILE.INC) while offering several alternative methods. When precision matters, verify which interpolation your tool uses.
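The effect is easy to reproduce with Python's standard-library statistics.quantiles, which supports an "inclusive" and an "exclusive" method. To my understanding these mirror the two Excel conventions, but treat that mapping as an assumption and check your own tools. The eight data points are arbitrary.

```python
import statistics

# A small, arbitrary sample; small samples exaggerate the differences
data = [1, 2, 3, 4, 5, 6, 7, 8]

# n=4 asks for the three quartile cut points (Q1, Q2, Q3)
q_exclusive = statistics.quantiles(data, n=4, method="exclusive")
q_inclusive = statistics.quantiles(data, n=4, method="inclusive")

print("exclusive:", q_exclusive)
print("inclusive:", q_inclusive)
```

The median agrees under both methods, but Q1 and Q3 differ, which is the kind of discrepancy that can appear when comparing output across tools.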

Interpreting Descriptive Statistics

Each descriptive statistic has strengths and blind spots. Use them together for a complete picture:

Metric	What It Tells You	Watch Out For
Arithmetic Mean	Average return level	Outliers pull it away from the typical value
Geometric Mean	Compound growth rate	Collapses to -100% if any single period is a total loss
Median	Typical value, resistant to outliers	Ignores distribution shape and tails
Standard Deviation	Typical spread around the mean	Treats gains and losses symmetrically
Coefficient of Variation	Risk per unit of return	Meaningless if mean is near zero or negative
Percentiles	Position within a distribution	Requires sufficient sample size for reliability

Pro Tip

Always compare statistics calculated over the same time frequency. A monthly standard deviation cannot be directly compared to an annual figure — you must annualize first. See standard deviation in finance for annualization methods.
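One widely used convention, which assumes returns are independent from period to period, scales a periodic standard deviation by the square root of the number of periods per year. The sketch below applies that rule. It is an assumption added for illustration and may not match every method in the linked guide.

```python
import math

def annualize_std(periodic_std, periods_per_year=12):
    """Square-root-of-time scaling; assumes independent periodic returns."""
    return periodic_std * math.sqrt(periods_per_year)

# Hypothetical monthly standard deviation in percent
monthly_std = 5.0
annualized_std = annualize_std(monthly_std)
print(f"{monthly_std:.1f}% monthly is about {annualized_std:.1f}% annualized")
```

Under this rule a 5% monthly figure corresponds to roughly 17.3% per year, so it represents more volatility than a 15% annual figure, not less.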

Descriptive Statistics Example

Let’s calculate key descriptive statistics for a stock’s 12 monthly returns.

Complete Calculation Example

A stock produced the following monthly returns:

Month	Return
Jan	+4%
Feb	-2%
Mar	+6%
Apr	+1%
May	-3%
Jun	+5%
Jul	+2%
Aug	-1%
Sep	+7%
Oct	+3%
Nov	-4%
Dec	+2%

Step 1: Arithmetic Mean

Sum = 4 + (-2) + 6 + 1 + (-3) + 5 + 2 + (-1) + 7 + 3 + (-4) + 2 = 20%

Arithmetic Mean = 20% / 12 = 1.67%

Step 2: Median

Sorted: -4, -3, -2, -1, +1, +2, +2, +3, +4, +5, +6, +7

Middle values (6th and 7th): +2 and +2

Median = (2 + 2) / 2 = 2.00%

Step 3: Geometric Mean

Product of gross returns = 1.04 × 0.98 × 1.06 × 1.01 × 0.97 × 1.05 × 1.02 × 0.99 × 1.07 × 1.03 × 0.96 × 1.02 = 1.2111

Geometric Mean = 1.2111^(1/12) – 1 = 1.61%

Step 4: Sample Variance

Sum of squared deviations from mean (1.67%) = 140.67

Sample Variance = 140.67 / 11 = 12.79 (percentage points squared)

Step 5: Sample Standard Deviation

s = √12.79 = 3.58%

Step 6: Coefficient of Variation

CV = 3.58% / 1.67% = 2.14 (using the rounded figures; unrounded inputs give about 2.15)

Interpretation: The stock averaged 1.67% per month with a geometric mean of 1.61% (compound growth rate). The difference of 0.06 percentage points reflects the drag from return volatility. A CV of about 2.14 means the monthly standard deviation is a little more than twice the average monthly return.
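The six steps can be cross-checked with Python's standard library. This sketch uses the twelve monthly returns from the table above. The variable names are illustrative.

```python
import math
import statistics

# Monthly returns from the example, in percent (Jan through Dec)
returns_pct = [4, -2, 6, 1, -3, 5, 2, -1, 7, 3, -4, 2]

mean = statistics.mean(returns_pct)                   # Step 1
median = statistics.median(returns_pct)               # Step 2
growth = math.prod(1 + r / 100 for r in returns_pct)  # product of gross returns
geometric = (growth ** (1 / len(returns_pct)) - 1) * 100  # Step 3
variance = statistics.variance(returns_pct)           # Step 4 (n - 1 divisor)
std_dev = statistics.stdev(returns_pct)               # Step 5
cv = std_dev / mean                                   # Step 6

print(f"mean={mean:.2f} median={median:.2f} geometric={geometric:.2f}")
print(f"variance={variance:.2f} std_dev={std_dev:.2f} cv={cv:.2f}")
```

Carrying full precision through gives a CV of roughly 2.15. The 2.14 in Step 6 comes from dividing the already rounded 3.58% by the rounded 1.67%.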

Arithmetic Mean vs Geometric Mean

Understanding when to use arithmetic versus geometric mean is one of the most important distinctions in investment analysis.

Arithmetic Mean

Simple average of returns
Best for single-period expected returns
Overstates multi-period compound growth
Used in CAPM and forward-looking projections
Always ≥ geometric mean

Geometric Mean

Compound average growth rate
Accurate for multi-period realized returns
Accounts for the effect of volatility
Used for historical performance reporting
Always ≤ arithmetic mean

Why the Difference Matters

Consider an investment with two annual returns: +50% in Year 1, -50% in Year 2.

Arithmetic mean = (50% + (-50%)) / 2 = 0%
Geometric mean = √(1.50 × 0.50) – 1 = √0.75 – 1 = -13.4%

If you invested $100:

After Year 1: $100 × 1.50 = $150
After Year 2: $150 × 0.50 = $75

You lost 25% of your investment, yet the arithmetic mean suggests you broke even. The geometric mean (-13.4% per year) correctly reflects the compound decline.

The gap between arithmetic and geometric mean widens as return volatility increases. This phenomenon — called volatility drag — explains why high-volatility investments often underperform their “average” returns over time.
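The +50% / -50% example, and the way the gap grows with the size of the swings, can be reproduced in a few lines. The helper name and the set of swing sizes are assumptions for this sketch.

```python
import math

def geometric_mean(returns):
    """Compound average return per period from decimal returns."""
    return math.prod(1.0 + r for r in returns) ** (1.0 / len(returns)) - 1.0

# The two-year example from the text
returns = [0.50, -0.50]
arithmetic = sum(returns) / len(returns)
geometric = geometric_mean(returns)
ending_value = 100.0 * math.prod(1.0 + r for r in returns)
print(f"arithmetic={arithmetic:.1%} geometric={geometric:.1%} end=${ending_value:.0f}")

# Gap between the two means for progressively larger up/down swings
gaps = {}
for swing in (0.10, 0.30, 0.50):
    pair = [swing, -swing]
    gaps[swing] = sum(pair) / len(pair) - geometric_mean(pair)
    print(f"+/-{swing:.0%}: gap = {gaps[swing]:.2%}")
```

Each pair has an arithmetic mean of zero, yet the compound result worsens as the swings get larger.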

How to Calculate Descriptive Statistics

Follow these steps to calculate key descriptive statistics for any return series:

1. Collect returns: Gather periodic returns (daily, monthly, annual) as percentages
2. Calculate arithmetic mean: Sum all returns and divide by the count
3. Sort and find median: Arrange returns in order; find the middle value(s)
4. Compute gross returns: Add 1 to each return (e.g., 5% becomes 1.05)
5. Calculate geometric mean: Take the nth root of the product of gross returns, then subtract 1
6. Calculate deviations: Subtract the arithmetic mean from each return
7. Square and sum deviations: Square each deviation and add them together
8. Compute sample variance: Divide the sum of squared deviations by (n – 1)
9. Compute standard deviation: Take the square root of variance
10. Calculate CV: Divide standard deviation by the arithmetic mean
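The ten steps above can be sketched as a single from-scratch function. The name describe_returns, the dictionary keys, and the four sample returns are assumptions for this example, and the CV is reported as NaN when the mean is not positive.

```python
import math

def describe_returns(returns_pct):
    """Follow the ten steps for a list of returns given in percent."""
    n = len(returns_pct)
    if n < 2:
        raise ValueError("need at least two returns for a sample variance")
    mean = sum(returns_pct) / n                            # step 2
    ordered = sorted(returns_pct)                          # step 3
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    gross = [1 + r / 100 for r in returns_pct]             # step 4
    if any(g < 0 for g in gross):
        raise ValueError("gross returns must be non-negative")
    geometric = (math.prod(gross) ** (1 / n) - 1) * 100    # step 5
    deviations = [r - mean for r in returns_pct]           # step 6
    sum_squares = sum(d * d for d in deviations)           # step 7
    variance = sum_squares / (n - 1)                       # step 8
    std_dev = math.sqrt(variance)                          # step 9
    cv = std_dev / mean if mean > 0 else float("nan")      # step 10
    return {"mean": mean, "median": median, "geometric": geometric,
            "variance": variance, "std_dev": std_dev, "cv": cv}

# Step 1: hypothetical quarterly returns in percent
stats = describe_returns([2.0, -1.0, 3.0, 4.0])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```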

For portfolio-level applications of these statistics, see our guide on portfolio diversification.

Common Mistakes

Avoid these frequent errors when working with descriptive statistics in finance:

1. Using arithmetic mean for multi-period returns — The arithmetic mean overstates compound growth. Use the geometric mean when measuring how an investment actually performed over time.

2. Using population formulas for sample data — Dividing by n instead of (n – 1) underestimates variance. In finance, you’re almost always working with samples of historical returns, not complete populations.

3. Ignoring outliers — A single extreme return can significantly distort the arithmetic mean. Always check the median and look for outliers before drawing conclusions.

4. Comparing statistics across different frequencies — A monthly standard deviation of 5% is not comparable to an annual standard deviation of 15%. Convert to the same time frame before comparing.

5. Misusing coefficient of variation — CV is meaningless when the mean is near zero or negative. It’s also unreliable when comparing returns measured over different horizons.

6. Over-interpreting summary statistics from skewed data — Two distributions can have identical means and standard deviations but very different shapes. Summary statistics don’t capture skewness and kurtosis, which matter for understanding tail risk.
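Mistake 6 can be made concrete with two small constructed datasets that share a mean and a sample standard deviation but differ in shape. The values and the helper third_central_moment are assumptions for this sketch. The average cubed deviation is used here as a simple, unstandardized indicator of skew.

```python
import statistics

# Two constructed datasets: same mean (0) and same sample variance (5)
symmetric = [-3, -1, 0, 1, 3]
skewed = [-1, -1, -1, -1, 4]

def third_central_moment(values):
    """Average cubed deviation from the mean; zero for symmetric data."""
    m = statistics.fmean(values)
    return statistics.fmean([(x - m) ** 3 for x in values])

mean_sym, mean_skew = statistics.mean(symmetric), statistics.mean(skewed)
std_sym, std_skew = statistics.stdev(symmetric), statistics.stdev(skewed)
skew_sym = third_central_moment(symmetric)
skew_skew = third_central_moment(skewed)

print(mean_sym, mean_skew)    # identical means
print(std_sym, std_skew)      # identical standard deviations
print(skew_sym, skew_skew)    # very different asymmetry
```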

Limitations of Descriptive Statistics

Descriptive statistics are powerful tools, but they have important limitations:

Key Limitation

Descriptive statistics reduce a full distribution to a few numbers. Two datasets can have identical means, medians, and standard deviations but look completely different. Always visualize your data when possible — a histogram reveals patterns that summary statistics hide.

1. Backward-looking — Descriptive statistics summarize what happened, not what will happen. Past returns don’t guarantee future performance. Market regimes change, and historical statistics may not reflect current conditions.

2. No distributional shape information — Mean and standard deviation assume (or work best with) symmetric distributions. Financial returns often exhibit skewness (asymmetry) and fat tails (extreme events more frequent than normal distributions suggest). For these characteristics, see skewness and kurtosis in returns.

3. Time-ordering is lost — Descriptive statistics treat all observations equally regardless of sequence. They can’t capture momentum, mean reversion, or autocorrelation — patterns where the order of returns matters.

4. Symmetric treatment of gains and losses — Standard deviation penalizes large gains the same as large losses. Most investors care more about downside risk. For asymmetric risk measures, consider downside deviation or Value at Risk.

Bottom Line

Descriptive statistics are the essential starting point for financial analysis, but they should never be your only tool. Combine them with distributional analysis (probability distributions), visualization, and forward-looking risk measures for a complete picture.

Frequently Asked Questions

What is the difference between descriptive and inferential statistics?

Descriptive statistics summarize and describe data you already have, for example by calculating means, medians, and standard deviations for a specific dataset. Inferential statistics use sample data to draw conclusions about a larger population, such as testing whether a fund manager’s outperformance is statistically significant or estimating the probability that future returns fall within a certain range. In portfolio analysis, you typically start with descriptive statistics to understand historical performance, then use inferential techniques to make predictions or test hypotheses.

Why do we use geometric mean instead of arithmetic mean for investment returns?

The geometric mean accounts for compounding, which is how investment returns actually accumulate. If you earn +20% one year and -10% the next, the arithmetic mean suggests 5% average growth, but your actual compound return is lower (about 3.9% per year) because the loss applies to a larger base. The geometric mean captures this reality and is always less than or equal to the arithmetic mean. Use geometric mean for historical performance measurement and arithmetic mean for single-period expected returns or CAPM calculations.

What is coefficient of variation and when should I use it?

The coefficient of variation (CV) is the ratio of standard deviation to the mean, so it measures risk per unit of return. CV is useful when comparing investments with different expected returns or measured in different units. For example, it can be used to compare the relative riskiness of a high-return emerging market fund and a low-return bond fund. However, CV is unreliable when the mean is near zero or negative, or when comparing returns measured over different time periods. Use it cautiously and only when means are meaningfully positive.

How do I know if my data has outliers affecting the mean?

Compare the mean to the median. If they differ substantially, outliers are likely pulling the mean in one direction. You can also look at the interquartile range (IQR = Q3 – Q1) and flag values more than 1.5 × IQR below Q1 or above Q3 as potential outliers. In finance, extreme returns during market crashes or rallies are common outliers. Decide whether they represent data errors (remove them) or genuine market events (keep them but acknowledge their impact on your statistics).
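Here is a minimal sketch of the 1.5 × IQR rule using Python's statistics.quantiles. The data are this article's twelve monthly returns plus one invented crash month of -25%, and the "inclusive" quartile method is an assumption. Other methods can shift the fences slightly.

```python
import statistics

# The example's monthly returns (percent) plus one hypothetical crash month
returns_pct = [4, -2, 6, 1, -3, 5, 2, -1, 7, 3, -4, 2, -25]

q1, q2, q3 = statistics.quantiles(returns_pct, n=4, method="inclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Flag anything outside the fences as a potential outlier
outliers = [r for r in returns_pct if r < lower_fence or r > upper_fence]

mean = statistics.fmean(returns_pct)
median = statistics.median(returns_pct)
print(f"mean={mean:.2f} median={median:.2f}")
print(f"fences=({lower_fence}, {upper_fence}) outliers={outliers}")
```

The crash month is the only value flagged, and it alone is enough to pull the mean below zero while the median stays at 2%.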

What is Bessel’s correction and why does it matter?

Bessel’s correction means dividing by (n – 1) instead of n when calculating sample variance. It corrects for the tendency of samples to underestimate population variance. When you calculate the mean from a sample, you use information from that same sample, which introduces a subtle bias: the sum of deviations from the sample mean is always zero, which uses up one degree of freedom. Dividing by (n – 1) adjusts for this, giving an unbiased estimate of population variance. In finance, where you’re almost always working with samples of historical returns, this correction is standard practice.

Can descriptive statistics tell me about risk?

Dispersion measures like standard deviation and variance quantify how much returns vary. Wider variation means more uncertainty and, in finance, more risk. However, descriptive statistics have limitations as risk measures: they treat upside and downside symmetrically (most investors care more about losses), they’re based on historical data that may not predict the future, and they don’t capture tail risk or black swan events well. For a complete risk assessment, supplement descriptive statistics with volatility analysis, downside risk measures, and stress testing.

What are quartiles and how are they used in finance?

Quartiles divide sorted data into four equal parts: Q1 (25th percentile), Q2 (median, 50th percentile), and Q3 (75th percentile). In finance, quartiles are used for performance benchmarking (a fund in the top quartile outperformed at least 75% of peers), constructing quantile portfolios for factor investing, and measuring return distribution tails. The interquartile range (Q3 – Q1) is a robust measure of spread that isn’t affected by extreme outliers. Note that different software packages may calculate quartiles slightly differently due to varying interpolation methods.

Disclaimer

This article is for educational and informational purposes only and does not constitute investment advice. Statistical measures cited are for illustrative purposes and may differ based on calculation methods, data sources, and time periods. Always conduct your own research and consult a qualified financial advisor before making investment decisions.
