Benchmark Selection: 7 Properties of a Valid Benchmark
Choosing the wrong benchmark is one of the most consequential — and most common — errors in portfolio management. A flawed benchmark invalidates every performance metric built on top of it: attribution analysis, manager hiring and firing decisions, and incentive compensation all depend on the benchmark being appropriate. This guide covers the seven properties every valid benchmark must satisfy, the types of benchmarks available, how to select the right one, and how to test benchmark quality over time.
What Is a Benchmark in Portfolio Management?
A benchmark is a reference portfolio that represents the investment opportunity cost — the return an investor could have earned by passively holding a representative set of securities instead of paying for active management. Benchmarks serve multiple critical roles: measuring performance, defining risk budgets, communicating investment mandates to clients, and providing the foundation for performance attribution.
A valid benchmark is more than just a popular index. It must be appropriate for the manager’s specific investment strategy and satisfy rigorous quality standards. Using the S&P 500 as the benchmark for every portfolio — regardless of style, geography, or asset class — is one of the most common mistakes in performance evaluation.
Without a valid benchmark, superior performance remains an elusive notion. A 12% annual return means nothing until you know what the appropriate reference point earned over the same period. The benchmark transforms raw returns into meaningful, actionable information.
7 Properties of a Valid Benchmark
The CFA Institute’s performance evaluation framework identifies seven properties that every valid benchmark must possess. A benchmark that fails any of these properties compromises its usefulness as an investment management tool.
1. Unambiguous — The identities and weights of the securities (or factor exposures) constituting the benchmark are clearly defined and publicly available. There should be no question about what the benchmark contains at any point in time.
2. Investable — It is possible to forgo active management and simply hold the benchmark. An absolute return target like “CPI + 5%” fails this property because there is no passive portfolio that reliably delivers that return.
3. Measurable — The benchmark’s return can be calculated on a reasonably frequent basis, ideally daily or monthly. A peer group median is measurable only after the fact, and even then it is subject to revision as accounts are added or removed.
4. Appropriate — The benchmark is consistent with the manager’s investment style and investable universe. The S&P 500 fails this property for a small-cap value manager or an international equity fund because it doesn’t represent the opportunity set those managers draw from.
5. Reflective of Current Investment Opinions — The manager has current knowledge of the securities in the benchmark and has had the opportunity to express a view on them. This ensures the benchmark represents the manager’s actual opportunity set, not a hypothetical one.
6. Specified in Advance — The benchmark is determined before the start of the evaluation period, not selected retroactively. Manager universes (peer groups) violate this property because their composition changes over time as accounts are added and removed.
7. Owned — The manager accepts the benchmark as a fair representation of their investment process and accepts accountability for deviations from it. The benchmark should be embedded in that process, not merely imposed by the sponsor.
“Owned” is the property most often violated in practice. When a manager does not accept the benchmark as a fair representation of their process, performance evaluation becomes adversarial rather than constructive — and the resulting attribution analysis loses its diagnostic value.
Types of Investment Benchmarks
There are seven primary types of benchmarks in use across the investment industry. Each has distinct strengths and limitations, and no single type is universally appropriate.
| Type | Description | Key Limitation |
|---|---|---|
| Absolute Return | A fixed return target (e.g., “CPI + 5%” or “8% annually”) | Not investable — no passive portfolio reliably delivers the target |
| Manager Universes | Peer group ranking (e.g., Morningstar category median) | Not investable, not specified in advance, survivorship bias |
| Broad Market Indexes | S&P 500, MSCI World, Bloomberg U.S. Aggregate Bond | May not match style-specific strategies; valid when mandate is broad beta |
| Style Indexes | Russell 1000 Growth, Russell 2000 Value, MSCI Value | Style definitions vary across providers (Russell vs. MSCI vs. S&P) |
| Factor-Model-Based | Benchmark constructed from systematic factor exposures (size, value, momentum) | Model-dependent and potentially non-investable; results change with factor specification |
| Returns-Based | Statistical regression to infer style exposures from past returns | Backward-looking and indirect; sensitive to estimation window |
| Custom Security-Based | Built from the manager’s typical holdings and investable universe | Expensive to construct and maintain; considered the gold standard |
For individual investors building a diversified portfolio, a blended total-return benchmark matched to your allocation weights is often the most practical choice — for example, 60% S&P 500 Total Return + 40% Bloomberg U.S. Aggregate Bond Total Return for a classic 60/40 portfolio. This approach is investable (via index ETFs), measurable, appropriate, and specified in advance.
How to Choose a Benchmark
Selecting an appropriate benchmark is a structured process, not a default to whatever index is most familiar. Follow these steps to ensure your benchmark meets the quality standards required for meaningful performance evaluation:
- Define the investment mandate — What is the manager’s investable universe, style orientation, and geographic scope? A U.S. small-cap value manager has a fundamentally different opportunity set than a global balanced fund.
- Identify candidate benchmarks — Consider broad market indexes, style indexes, or custom/blended alternatives based on the mandate definition.
- Test against the 7 properties — Does each candidate satisfy all seven properties? Eliminate any that fail critical properties like investable, appropriate, or specified in advance.
- Run quality tests — Validate the chosen benchmark quantitatively using the six heuristic tests described under “How to Test Benchmark Quality” below.
- Agree and document — Both the manager and sponsor must accept the benchmark. Embed it in the investment policy statement before the evaluation period begins.
Building a Custom or Blended Benchmark
When no single published index matches the investment mandate, construct a custom or blended benchmark using this process:
- Identify the prominent aspects of the manager’s investment process
- Select securities consistent with that process
- Devise a weighting scheme, including a neutral cash position
- Review the preliminary benchmark and make modifications
- Rebalance the benchmark on a predetermined schedule
A diversified individual investor with a target allocation of 60% U.S. equity, 25% international equity, and 15% bonds constructs a blended benchmark:
| Asset Class | Weight | Index |
|---|---|---|
| U.S. Equity | 60% | S&P 500 Total Return |
| International Equity | 25% | MSCI EAFE Total Return |
| U.S. Bonds | 15% | Bloomberg U.S. Aggregate Bond Total Return |
This blended benchmark is investable (via low-cost index ETFs), measurable, appropriate for the portfolio’s allocation, and specified in advance — satisfying the core properties of a valid benchmark in a practical, cost-effective way.
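The arithmetic behind a blended benchmark is a weighted average of the component index returns, compounded from period to period. The sketch below uses the 60/25/15 weights from the table; the monthly return figures are hypothetical placeholders rather than real index data, and the function names are illustrative. It assumes the blend is reset to its target weights at the start of every period.

```python
# Target weights from the table above.
WEIGHTS = {"us_equity": 0.60, "intl_equity": 0.25, "us_bonds": 0.15}

# Hypothetical monthly total returns for each index (illustrative only).
MONTHLY_RETURNS = [
    {"us_equity": 0.020, "intl_equity": 0.010, "us_bonds": 0.004},
    {"us_equity": -0.010, "intl_equity": 0.005, "us_bonds": 0.002},
    {"us_equity": 0.015, "intl_equity": -0.020, "us_bonds": -0.001},
]


def blended_return(weights, period_returns):
    """Weighted average of the component returns for a single period."""
    return sum(weights[name] * period_returns[name] for name in weights)


def cumulative_return(weights, returns_by_period):
    """Compound the blended return, assuming a reset to target weights each period."""
    growth = 1.0
    for period in returns_by_period:
        growth *= 1.0 + blended_return(weights, period)
    return growth - 1.0


monthly = [blended_return(WEIGHTS, r) for r in MONTHLY_RETURNS]
total = cumulative_return(WEIGHTS, MONTHLY_RETURNS)
print([round(m, 5) for m in monthly], round(total, 5))
```

Resetting to target weights every period is itself a rebalancing assumption. A benchmark rebalanced quarterly or annually would let the weights drift between rebalancing dates and would produce slightly different returns.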
For more on how indexes are constructed and how weighting methods work, see our guide to stock market indexes.
Custom Benchmarks vs. Broad Market Indexes
The choice between a custom benchmark and a broad market index depends on the investment mandate, the evaluation stakes, and the resources available for benchmark construction.
Custom Security-Based Benchmark
- Satisfies all 7 properties of a valid benchmark
- Tailored to the manager’s specific investable universe
- Gold standard for institutional performance evaluation
- Expensive to construct and maintain
- Best for: large mandates with high-stakes evaluation
Broad Market Index
- Cheap, widely available, and easy to understand
- Fully investable via low-cost ETFs and index funds
- Valid when the mandate is broad market exposure
- May not match style-specific or concentrated strategies
- Best for: core allocations and broad beta mandates
Broad market indexes are not inherently inferior — they are the correct benchmark when the mandate itself is broad beta exposure. A total stock market index fund should be benchmarked to a total market index, not a custom benchmark.
T. Rowe Price Blue Chip Growth Fund (TRBCX) uses the Russell 1000 Growth Index as its primary benchmark and reports the S&P 500 as a secondary comparator. Comparing the fund’s returns only to the S&P 500 — a blend index — would overstate the manager’s alpha in periods when growth stocks outperform value stocks. The style-appropriate Russell 1000 Growth benchmark provides a more accurate picture of the manager’s actual stock selection skill versus the style return.
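The style effect in this example can be made concrete by splitting a portfolio return into three pieces: the market return, the style return (style benchmark minus market), and the active return (portfolio minus style benchmark). The sketch below uses hypothetical annual returns, not the fund's actual results, and the function name `decompose` is illustrative.

```python
def decompose(portfolio, style_benchmark, market_index):
    """Split a portfolio return into market, style, and active components."""
    style = style_benchmark - market_index   # reward for the style itself
    active = portfolio - style_benchmark     # what the manager added beyond the style
    return {"market": market_index, "style": style, "active": active}


# Hypothetical annual returns in a year when growth beats the broad market.
parts = decompose(portfolio=0.140, style_benchmark=0.125, market_index=0.100)

# Excess return a reader would infer from comparing to the broad index alone.
naive_excess = parts["style"] + parts["active"]
print(parts, round(naive_excess, 4))
```

With these inputs, most of the apparent outperformance versus the broad index comes from the style component rather than from stock selection, which is the distortion a style-appropriate benchmark removes.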
CalPERS, the largest U.S. public pension fund, constructs custom benchmarks across its public equity, fixed income, and private asset programs. Each benchmark reflects the specific opportunity set and constraints of that allocation — because no single published index captures a private equity or real estate investment mandate.
How to Test Benchmark Quality
Even after selecting a benchmark, validate it over time using these six heuristic tests from the CFA performance evaluation framework. A benchmark that fails multiple tests should be reconsidered.
1. Systematic Biases — Does the manager consistently outperform or underperform the benchmark regardless of market conditions? Specifically, the regression beta of account returns versus benchmark returns should be close to 1.0 with high correlation. Persistent bias or a beta significantly different from 1.0 suggests the benchmark does not match the manager’s actual investment process.
2. Tracking Error — Is the active risk (standard deviation of active returns) reasonable? Very high tracking error indicates the benchmark does not capture the manager’s investment universe. Compare tracking error against multiple candidate benchmarks to find the best fit.
3. Risk Characteristics — Do the benchmark’s factor exposures (market capitalization, sector weights, style tilt) approximate those of the portfolio? Large mismatches in factor exposures signal an inappropriate benchmark.
4. Coverage — What proportion of portfolio holdings are also in the benchmark? Low coverage — where the manager frequently holds securities outside the benchmark — signals a fundamental mismatch between the benchmark and the investment process.
5. Turnover — Is the benchmark’s turnover low enough that it remains a feasible passive alternative? High benchmark turnover increases transaction costs for passive replication and may indicate the benchmark is too actively constructed to serve as a valid reference point.
6. Positive Active Positions — For a long-only manager, a well-constructed benchmark should produce largely positive active positions (overweights in securities the manager favors). A high share of negative active positions — where the benchmark contains many securities the manager does not hold — indicates the benchmark is poorly representative of the manager’s actual investment universe.
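The first test (systematic biases) can be checked with a simple regression of account returns on benchmark returns. The following is a minimal sketch using only the standard library; the return series are hypothetical, and `beta_and_correlation` is an illustrative name rather than a function from any library.

```python
from statistics import mean


def beta_and_correlation(account, benchmark):
    """Return the OLS slope of account on benchmark returns and their correlation."""
    a_bar, b_bar = mean(account), mean(benchmark)
    cov = sum((a - a_bar) * (b - b_bar) for a, b in zip(account, benchmark))
    var_a = sum((a - a_bar) ** 2 for a in account)
    var_b = sum((b - b_bar) ** 2 for b in benchmark)
    beta = cov / var_b
    correlation = cov / (var_a * var_b) ** 0.5
    return beta, correlation


# Hypothetical monthly returns for an account and its candidate benchmark.
account = [0.021, -0.013, 0.034, 0.002, -0.008, 0.017]
benchmark = [0.018, -0.010, 0.030, 0.004, -0.006, 0.015]

beta, corr = beta_and_correlation(account, benchmark)
avg_active = mean(a - b for a, b in zip(account, benchmark))
print(round(beta, 3), round(corr, 3), round(avg_active, 5))
```

A beta far from 1.0, a low correlation, or an average active return that keeps the same sign across very different market periods would each be a reason to revisit the benchmark. Six observations are far too few for a real test; the short series only keeps the example readable.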
A U.S. large-cap growth manager reports the following tracking error and coverage against two candidate benchmarks:
| Candidate Benchmark | Annualized Tracking Error | Coverage Ratio |
|---|---|---|
| S&P 500 | 8.0% | 62% |
| Russell 1000 Growth | 3.1% | 89% |
The Russell 1000 Growth Index shows lower tracking error and higher coverage, indicating it better captures the manager’s opportunity set. The S&P 500’s high tracking error and low coverage confirm it is not the appropriate benchmark for this growth-oriented strategy.
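The two statistics in the table can be computed along the following lines. This is a sketch under stated assumptions: returns are monthly, tracking error is annualized by the square root of 12 (which treats active returns as independent across months), and coverage is measured by portfolio weight rather than by number of holdings. The tickers and figures are hypothetical and do not reproduce the numbers in the table.

```python
from statistics import stdev


def tracking_error(portfolio, benchmark, periods_per_year=12):
    """Annualized sample standard deviation of active (portfolio minus benchmark) returns."""
    active = [p - b for p, b in zip(portfolio, benchmark)]
    return stdev(active) * periods_per_year ** 0.5


def coverage(portfolio_weights, benchmark_members):
    """Share of portfolio weight held in securities that are also in the benchmark."""
    return sum(w for name, w in portfolio_weights.items() if name in benchmark_members)


# Hypothetical monthly returns and holdings.
portfolio_returns = [0.020, 0.000, 0.015, -0.005]
benchmark_returns = [0.010, 0.010, 0.012, -0.002]
holdings = {"AAA": 0.5, "BBB": 0.3, "CCC": 0.2}
benchmark_members = {"AAA", "BBB", "ZZZ"}

te = tracking_error(portfolio_returns, benchmark_returns)
cov_ratio = coverage(holdings, benchmark_members)
print(round(te, 4), round(cov_ratio, 2))
```

Running both functions against each candidate benchmark gives a side-by-side comparison like the table above; the candidate with lower tracking error and higher coverage is usually the better fit.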
For more on how stock market indexes are constructed and maintained, see our dedicated guide.
Common Mistakes in Benchmark Selection
Even experienced investors and fund sponsors make these benchmark selection errors, each of which can lead to flawed performance conclusions:
1. Using the S&P 500 for Everything — A small-cap value fund or international equity strategy benchmarked to the S&P 500 conflates style and size return with manager alpha. The manager may appear skilled simply because their style outperformed large-cap blend.
2. Selecting Benchmarks Retroactively — Cherry-picking the benchmark that makes performance look best after the evaluation period violates the “specified in advance” property. This undermines the entire purpose of performance evaluation.
3. Using Peer Group Medians as Primary Benchmarks — Manager universes suffer from survivorship bias (poorly performing funds drop out), are not investable, and are not specified in advance. They can be useful as a secondary reference but fail as a primary benchmark. See our guide on active vs. passive investing for more context.
4. Using a Price Index Instead of a Total Return Index — Benchmarking to a price-return index (which excludes dividends) creates artificial alpha because the portfolio collects dividends that the benchmark does not reflect. Always use total return benchmarks that include reinvested dividends.
5. Benchmarking a Multi-Asset Portfolio to a Single Equity Index — A 60/40 portfolio measured against the S&P 500 alone distorts both risk and return attribution. Against an equity-only benchmark, the portfolio will usually look less volatile and, when equities rally, lower-returning, but both gaps reflect the 40% bond allocation rather than any manager decision. A blended benchmark matched to the allocation weights removes that distortion.
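The size of the distortion in mistake 4 is easy to see with a little arithmetic. The sketch below uses hypothetical one-year figures and, as a simplification, treats the index's total return as its price return plus its dividend return, ignoring the small compounding effect of reinvesting dividends during the year.

```python
def excess_return(portfolio_return, benchmark_return):
    """Simple arithmetic excess return over a benchmark."""
    return portfolio_return - benchmark_return


# Hypothetical one-year figures (illustrative only).
portfolio_total_return = 0.100   # portfolio return including dividends received
index_price_return = 0.080       # index return excluding dividends
index_dividend_return = 0.016    # assumed dividend contribution of the index

# Simplification: total return approximated as price return plus dividend return.
index_total_return = index_price_return + index_dividend_return

apparent_excess = excess_return(portfolio_total_return, index_price_return)
actual_excess = excess_return(portfolio_total_return, index_total_return)
print(round(apparent_excess, 4), round(actual_excess, 4))
```

The gap between the two results equals the index's dividend return, so the overstatement grows with the dividend yield of the benchmark and compounds over multi-year evaluation periods.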
Limitations of Benchmark-Based Evaluation
While benchmarks are essential for performance evaluation, they are inherently imperfect tools:
No benchmark perfectly captures a manager’s investment process. All benchmark-based evaluation involves a trade-off between precision and practicality. The goal is to select the most appropriate benchmark available — not to find a perfect one.
Benchmark Gaming — Managers may “hug” the benchmark to keep tracking error low and limit the risk of visibly underperforming, sacrificing genuine alpha generation in favor of career safety. This closet indexing charges active management fees for passive-like returns.
Style Drift — If a manager’s investment style evolves over time, a previously appropriate benchmark may become inappropriate. Periodic benchmark reviews are necessary to detect and address drift.
Alternative Assets — Private equity, real estate, and hedge funds lack truly investable benchmarks, making the seven-property framework difficult to apply fully. Custom benchmarks for alternatives are inherently imprecise.
Liability-Based Needs — For pension funds and insurance companies, the correct benchmark may be liability-relative (tied to the present value of future obligations) rather than asset-index-relative. Standard market indexes may not capture the risk that matters most to these investors.
Benchmark selection is not a one-time decision. Re-evaluate benchmark appropriateness at least annually as the manager’s style, market structure, and available indexes evolve. A benchmark that was appropriate three years ago may no longer be the right reference point today. Document any changes in the investment policy statement.
Disclaimer
This article is for educational and informational purposes only and does not constitute investment advice. Benchmark properties and quality tests described are based on the CFA Institute’s performance evaluation framework. Specific fund and index examples are illustrative and may not reflect current performance. Always conduct your own research and consult a qualified financial advisor before making investment decisions.