
The Methodology

How Direct Index Club estimates correlation and calculates fee savings — and the research behind it.

Read the full analysis on Substack
Deep dive: the compounding cost of fees, the research on portfolio replication, and how to build your first basket.

1. The Compounding Cost of Fees

Expense ratios are deducted daily from a fund's NAV — silently, invisibly. They don't appear on your brokerage statement as a line item. You just end up with less.

The compounding effect makes this painful over time. Every dollar paid in fees in year 1 is a dollar that can no longer compound. By year 30 (at 7% growth), that dollar would have become $7.61. You're not just losing the fee — you're losing everything it would have grown into.
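The $7.61 figure can be checked with a short compounding calculation. The sketch below is illustrative only: the function name `forgone_value` is hypothetical, and the 7% rate is the same assumption used in the text.

```python
def forgone_value(fee_paid: float, years: int, annual_return: float = 0.07) -> float:
    """Value a fee payment would have reached had it stayed invested."""
    return fee_paid * (1 + annual_return) ** years


# One dollar paid in fees, left to compound for 30 years at 7% instead.
print(f"${forgone_value(1.0, 30):.2f}")  # prints $7.61
```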

$100,000 invested at 7% annual return: value after N years

Expense ratio  Example         10 yr     20 yr     30 yr     Lost vs DIY (30 yr)
0.00%          DIY Basket      $196,715  $386,968  $761,226
0.03%          VOO             $196,164  $384,804  $754,849  -$6,377
0.09%          SPY             $195,067  $380,510  $742,249  -$18,976
0.10%          XLV / XLE       $194,884  $379,799  $740,169  -$21,056
0.20%          QQQ             $193,069  $372,756  $719,677  -$41,549
0.40%          ITA (Defence)   $189,484  $359,041  $680,325  -$80,901
0.60%          HACK (Cyber)    $185,959  $345,806  $643,056  -$118,169
0.75%          ARKK            $183,354  $336,185  $616,408  -$144,818

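Assumes 7% annual return before fees. Each value is $100,000 × (1.07 − expense_ratio)^n, so the fee is modelled as an annual reduction in the growth rate. "Lost vs DIY" is the 30-year shortfall against the 0.00% row.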
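The table values can be reproduced from the stated model. This is a minimal sketch under the same assumptions (7% gross return, fee subtracted once a year from the growth rate); the function names `future_value` and `lost_vs_diy` are hypothetical and not taken from the tool.

```python
def future_value(principal: float, expense_ratio: float, years: int,
                 gross_return: float = 0.07) -> float:
    """Grow principal at (1 + gross_return - expense_ratio) per year."""
    return principal * (1 + gross_return - expense_ratio) ** years


def lost_vs_diy(principal: float, expense_ratio: float, years: int,
                gross_return: float = 0.07) -> float:
    """Shortfall against a zero-fee basket over the same period."""
    no_fee = future_value(principal, 0.0, years, gross_return)
    with_fee = future_value(principal, expense_ratio, years, gross_return)
    return no_fee - with_fee


# Expense ratios as decimals: 0.03% is 0.0003, 0.75% is 0.0075.
for label, ratio in [("DIY Basket", 0.0), ("VOO", 0.0003), ("ARKK", 0.0075)]:
    value = future_value(100_000, ratio, 30)
    lost = lost_vs_diy(100_000, ratio, 30)
    print(f"{label:<12}{value:>12,.0f}{lost:>12,.0f}")
```

Because the fee is applied annually here, results will differ slightly from a fund that accrues its expense ratio daily.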

2. How Many Stocks Do You Actually Need?

The common assumption is that ETFs provide indispensable diversification you can't replicate. The research tells a more nuanced story.

Evans & Archer (1968)

The first major study to quantify diversification. Found that idiosyncratic (company-specific) risk drops sharply after just 10–15 randomly selected stocks. By 20 stocks, most unsystematic risk has been eliminated. The marginal benefit of each additional stock diminishes rapidly.

Fisher & Lorie (1970)

Studied the variability of returns across different portfolio sizes. Found that a randomly constructed portfolio of 32 stocks captures 95% of the variance reduction achievable with the full NYSE. This became the origin of the "30 stocks = diversified" rule of thumb.
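The diminishing-returns pattern these studies describe can be illustrated with a stylised textbook model. This is not the method used in either study: it assumes an equal-weighted portfolio whose stocks have independent company-specific returns with identical variance, in which case the company-specific variance left after holding n stocks is 1/n of a single stock's. The function name is hypothetical.

```python
def idiosyncratic_variance_remaining(n_stocks: int) -> float:
    """Fraction of single-stock idiosyncratic variance left in an
    equal-weighted portfolio of n independent, equal-variance stocks."""
    if n_stocks < 1:
        raise ValueError("n_stocks must be at least 1")
    return 1.0 / n_stocks


for n in (1, 5, 10, 15, 20, 32, 500):
    remaining = idiosyncratic_variance_remaining(n)
    print(f"{n:>4} stocks: {remaining:6.1%} of single-stock idiosyncratic variance remains")
```

Under these assumptions most of the reduction happens within the first 10 to 20 names, and going from 32 to 500 stocks removes only about three further percentage points. Real stocks are correlated and unequal in size, so actual figures will differ.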

Optimised Sampling (modern ETF practice)

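Here's the key insight: index funds don't always buy every constituent of the index they track. Providers such as Vanguard and iShares use optimised (representative) sampling for many funds, particularly those tracking very broad or hard-to-trade indexes, picking a subset that matches the index's key sector, size, and style characteristics. Their flagship S&P 500 funds generally hold all of the index's constituents, but the sampling approach shows that a carefully chosen subset can track an index closely.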

3. The Weight Concentration Effect

In market-cap weighted indexes, weight is heavily concentrated in the top holdings. For a cap-weighted ETF:

Share of fund weight   SPY     ITA
Top 10 holdings        ~35%    ~80%
Top 15 holdings        ~45%    ~90%+
Bottom half            ~10%    ~5%

This concentration means the bottom half of an ETF's holdings has minimal impact on returns. Owning the top 10–20 stocks, cap-weighted, gives you the lion's share of the ETF's return characteristics — at a fraction of the complexity.
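The concentration figures above are cumulative weights of the largest holdings. A minimal sketch of that calculation follows; the function name `top_n_weight` is hypothetical and the weights are made-up illustrative numbers, not real ETF data.

```python
def top_n_weight(weights: list[float], n: int) -> float:
    """Share of total portfolio weight held by the n largest positions."""
    ranked = sorted(weights, reverse=True)
    total = sum(ranked)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(ranked[:n]) / total


# Hypothetical fund: six large positions plus thirty 1% positions (sums to 100).
example_weights = [20.0, 15.0, 12.0, 10.0, 8.0, 5.0] + [1.0] * 30
share = top_n_weight(example_weights, 5)
print(f"Top 5 of {len(example_weights)} holdings: {share:.0%}")
```

Running the same function over an ETF's published holdings file would give the top-10 and top-15 shares for that fund.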

4. How We Estimate R²

R² (coefficient of determination) measures how closely your basket tracks the ETF. An R² of 0.97 means 97% of the ETF's return variance is explained by your basket.
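For reference, this is how R² could be measured from historical data if periodic returns for both the basket and the ETF were available. For a one-variable linear regression, R² equals the squared Pearson correlation of the two series. The sketch below is illustrative only and is not how the tool produces its number (the tool uses the heuristic that follows); the function name `r_squared` is hypothetical.

```python
def r_squared(basket_returns: list[float], etf_returns: list[float]) -> float:
    """Squared Pearson correlation between two equal-length return series."""
    n = len(basket_returns)
    if n != len(etf_returns) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_b = sum(basket_returns) / n
    mean_e = sum(etf_returns) / n
    # Sums of squared deviations and cross-deviations (the 1/n factors cancel).
    cov = sum((b - mean_b) * (e - mean_e)
              for b, e in zip(basket_returns, etf_returns))
    var_b = sum((b - mean_b) ** 2 for b in basket_returns)
    var_e = sum((e - mean_e) ** 2 for e in etf_returns)
    if var_b == 0 or var_e == 0:
        raise ValueError("R² is undefined for a series with zero variance")
    return cov ** 2 / (var_b * var_e)


# Made-up monthly returns, for illustration only.
basket = [0.021, -0.013, 0.034, 0.002, -0.027, 0.015]
etf = [0.018, -0.010, 0.030, 0.004, -0.024, 0.012]
print(f"Measured R²: {r_squared(basket, etf):.3f}")
```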

// R² estimation formula
coverage = selectedWeight / totalHoldingsWeight
R² = min(0.50 + coverage × 0.49, 0.99)

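This is a conservative heuristic, not a statistical fit: it uses only the share of the ETF's weight that your basket covers, not historical returns. The 0.50 baseline reflects the assumption that any portfolio drawn from the right sector will correlate significantly with the ETF. The coverage multiplier captures the incremental tracking improvement from including higher-weight holdings, so the estimate rises linearly from 0.50 at zero coverage to 0.99 at full coverage. We expect real-world R² values to be higher than these estimates in most cases; the tool is intentionally conservative.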
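The heuristic translates directly into a few lines of code. This is a sketch of the formula as stated above, not the tool's actual source; the function name `estimate_r2` and the input validation are assumptions added for illustration.

```python
def estimate_r2(selected_weight: float, total_holdings_weight: float) -> float:
    """Heuristic R² estimate: 0.50 baseline plus 0.49 per unit of coverage,
    capped at 0.99."""
    if total_holdings_weight <= 0:
        raise ValueError("total_holdings_weight must be positive")
    coverage = selected_weight / total_holdings_weight
    return min(0.50 + coverage * 0.49, 0.99)


# A basket covering 80% of the ETF's weight.
print(f"Estimated R²: {estimate_r2(80.0, 100.0):.3f}")  # prints Estimated R²: 0.892
```

The cap only binds if coverage exceeds 1, which should not happen when the selected weights are a subset of the ETF's holdings.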

5. Practical Considerations

🔄
Rebalancing

ETFs rebalance automatically. Your basket drifts. Review quarterly and trim/add positions to maintain target weights.
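A quarterly review boils down to comparing each position's current value with its target share of the basket. The sketch below shows one way to compute the trades; the function name `rebalance_trades` and the tickers AAA, BBB and CCC are hypothetical, and it ignores trading costs and taxes.

```python
def rebalance_trades(holdings_value: dict[str, float],
                     target_weights: dict[str, float]) -> dict[str, float]:
    """Dollar amount to buy (+) or sell (-) per ticker to restore targets."""
    if abs(sum(target_weights.values()) - 1.0) > 1e-9:
        raise ValueError("target weights must sum to 1")
    total = sum(holdings_value.values())
    return {
        ticker: weight * total - holdings_value.get(ticker, 0.0)
        for ticker, weight in target_weights.items()
    }


# Hypothetical basket that has drifted away from a 40/35/25 target.
current = {"AAA": 5_000.0, "BBB": 3_000.0, "CCC": 2_000.0}
targets = {"AAA": 0.40, "BBB": 0.35, "CCC": 0.25}
for ticker, amount in rebalance_trades(current, targets).items():
    action = "buy" if amount > 0 else "sell"
    print(f"{ticker}: {action} ${abs(amount):,.0f}")
```

In this example the basket would sell $1,000 of AAA and buy $500 each of BBB and CCC.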

💸
Tax efficiency

You control when you realise gains and losses — useful for tax-loss harvesting. More transactions to track, but more flexibility.

📏
Minimum size

With fractional shares, you can start small. But fee savings only become meaningful at around $25K and above: on $25,000, a 0.40% expense ratio costs about $100 a year, and smaller balances save proportionally less.

⚠️
Concentration risk

A 15-stock basket has more idiosyncratic risk than a 500-stock fund. You are accepting slightly more single-stock risk in exchange for paying no expense ratio.


For educational purposes only. Not financial advice. Past performance does not guarantee future results.