For those of us who live at the intersection of code and capital, building our own financial models is the ultimate intellectual challenge. This series is the technical deep-dive into my journey of transforming a simplistic Python-based DCF and Magic Formula tool into a nuanced, risk-aware valuation engine.
This isn't about high-level concepts; this is for those who like to look under the hood. We'll get into the code, the logic, and the architectural decisions that turned a flawed model into a powerful analytical platform. It all started when my testing harness began to surface some deeply illogical valuations for major companies like MSFT, BABA, and DAL.
Coming up in this technical series:
- Part 1: The Diagnosis & The Test Harness: Pinpointing Failure
- Part 2: Deconstructing Lifecycle: From Simple Thresholds to Multi-Factor Classification
- Part 3: Quantifying The Unquantifiable: A Risk-Adjusted WACC Framework
- Part 4: Engineering Cyclical Mean Reversion with Hard Constraints
- Part 5: The
GrowthRateSelector: Building a Pluggable, Intelligent Selection Engine
Part 1: The Diagnosis & The Test Harness: Pinpointing Failure
Before I could fix the model, I had to prove, conclusively, where it was failing. I built a simple test harness using Python's unittest framework to lock in the problematic behavior. The goal was to create failing tests that I could later turn green.
The initial test for a company like Microsoft looked something like this:
import unittest
class TestValuation(unittest.TestCase):
def test_microsoft_valuation(self):
# Simplified for illustration
market_cap = 3.42e12
dcf_valuation = self.model.run_dcf('MSFT')
# This assertion was failing spectacularly
self.assertTrue(
abs(dcf_valuation - market_cap) / market_cap < 0.30,
f"Valuation error is too high: {dcf_valuation}"
)
The root of the failure was in the model's core logic, which was a series of simple, brittle if/else statements. The lifecycle classification, for example, was a one-liner that caused all the trouble:
# The old, flawed logic
def determine_lifecycle_stage(maturity_score):
if maturity_score >= 80:
return 'declining' # MSFT, with a score of 95, fell right into this trap
# ... other stages
Similarly, the WACC and growth calculations were naive, context-free functions. With my failing tests in hand, I had a clear target. It was time to refactor.
Part 2: Deconstructing Lifecycle: From Simple Thresholds to Multi-Factor Classification
The first and most critical fix was to completely overhaul the lifecycle classification system. The "maturity score" was a decent starting point, but it lacked the nuance to distinguish between a company that is mature and declining, and one that is mature and dominant.
The Old Logic:
# OLD: A simple, one-dimensional check
if maturity_score >= 80:
stage = 'declining'
The New Approach: Multi-Factor Classification
I expanded the classification from a single score to a multi-factor decision tree, introducing two new, crucial categories: quality_mature and mature_stable.
The logic now looks more like this, considering Return on Capital (ROC) and revenue growth as key differentiators:
# NEW: Multi-factor logic for high-maturity companies
if maturity_score >= 80:
if roc > 0.20 and revenue_growth_3yr > 0.05:
stage = 'quality_mature' # ✅ MSFT now gets this!
elif revenue_growth_3yr > 0.02:
stage = 'mature_stable'
else:
stage = 'declining'
This was coupled with an update to the growth parameters dictionary, which now provides appropriate growth and terminal rate assumptions for these new categories.
LIFECYCLE_GROWTH_PARAMETERS = {
# ... other stages
'quality_mature': {
'stage1_growth': 0.12, 'stage2_growth': 0.08, 'terminal_growth': 0.03
},
'mature_stable': {
'stage1_growth': 0.08, 'stage2_growth': 0.05, 'terminal_growth': 0.025
},
'declining': {
'stage1_growth': 0.03, 'stage2_growth': 0.02, 'terminal_growth': 0.02
}
}
This change alone turned the failing test_microsoft_valuation green. It was a powerful first step, proving that embedding more domain-specific logic into the code was the right path forward.
Part 3: Quantifying The Unquantifiable: A Risk-Adjusted WACC Framework
The absurd $1.2 trillion valuation for Alibaba was a clear sign that my model was blind to risk. The WACC calculation was a simple function that needed to be rebuilt into a comprehensive risk assessment framework.
The Old WACC: The original calculation was little more than the base CAPM rate with a small, generic quality adjustment.
The New Framework: A Sum of Premiums
The new WACC is calculated as a sum of the base rate plus a series of discrete, quantifiable risk premiums. This required creating several dictionaries to store this domain knowledge.
# Snippet of the new risk premium dictionaries
COUNTRY_RISK_PREMIUMS = {
'US': 0.000,
'China': 0.045,
'India': 0.035,
}
STRUCTURE_RISK_PREMIUMS = {
'VIE': 0.025,
'Dual-Class': 0.005,
'Standard': 0.000,
}
INDUSTRY_RISK_PREMIUMS = {
'Airlines': 0.030,
'Software—Infrastructure': 0.000,
'Consumer Staples': -0.005, # Can even be negative for defensive industries
}
The calculate_quality_adjusted_discount_rate function was then rewritten to consume this information:
def calculate_risk_adjusted_wacc(base_capm, country, industry, structure, roc, ...):
wacc = base_capm
# Additive premiums
wacc += COUNTRY_RISK_PREMIUMS.get(country, 0.0)
wacc += STRUCTURE_RISK_PREMIUMS.get(structure, 0.0)
wacc += INDUSTRY_RISK_PREMIUMS.get(industry, 0.0)
# Quality adjustments (can be negative)
wacc -= calculate_roc_adjustment(roc)
# Apply a dynamic floor based on category
min_wacc = get_minimum_wacc(country, structure)
return max(wacc, min_wacc)
For Alibaba, the calculation went from a naive 7.0% to a much more realistic 14.2% (7.0% Base + 4.5% China Risk + 2.5% VIE Risk + ...). This technical change had a profound impact on the final valuation, correcting the +203% error to a much more defensible ±10%.
Part 4: Engineering Cyclical Mean Reversion with Hard Constraints
The final major logical flaw was the model's inability to distinguish a temporary, cyclical rebound from sustainable, long-term growth. The +84% overvaluation of Delta Air Lines was the prime example. The solution was to engineer the concept of mean reversion directly into the growth calculation using hard constraints.
This was implemented through two key data structures:
# A list of industries known to be cyclical
CYCLICAL_INDUSTRIES = [
'Airlines', 'Hotels & Resorts', 'Oil & Gas Exploration', 'Steel', 'Auto Manufacturers',
]
# A dictionary of hard growth caps for these industries
INDUSTRY_GROWTH_CAPS = {
'Airlines': 0.05, # 5% max growth
'Hotels & Resorts': 0.05,
'Oil & Gas Integrated': 0.04,
'Auto Manufacturers': 0.05,
}
The growth calculation logic was then updated to use these constraints as a final, non-negotiable check.
def calculate_final_growth(base_growth, industry):
final_growth = base_growth
# Apply the hard cap if one exists for the industry
industry_cap = INDUSTRY_GROWTH_CAPS.get(industry)
if industry_cap is not None:
if final_growth > industry_cap:
# The cap overrides all other calculations
final_growth = industry_cap
print(f"⚠️ CYCLICAL CAP APPLIED: Growth capped at {industry_cap:.1%}")
return final_growth
For Delta, a calculated growth rate of 12% (based on the post-pandemic recovery) was immediately capped at 5%. This simple, rules-based approach effectively encodes the economic reality of the airline industry into the model, preventing the fatal error of extrapolation.
Part 5: The GrowthRateSelector: Building a Pluggable, Intelligent Selection Engine
With the core valuation components fixed, I turned my attention to architecture. The various pieces of logic were scattered across different functions. I decided to consolidate them into a single, cohesive, and extensible "brain" – the GrowthRateSelector class.
This class acts as a pluggable engine that takes in all the raw data and returns a single, well-reasoned growth estimate, complete with a confidence score.
class GrowthRateSelector:
def __init__(self, lifecycle_params, quality_params, weighting_params):
self.lifecycle_params = lifecycle_params
self.quality_params = quality_params
self.weighting_params = weighting_params
def select_growth_rate(self, company_data):
# 1. Determine lifecycle stage (from Part 2)
stage = self._determine_lifecycle(company_data)
# 2. Get lifecycle-dependent input weights
weights = self.weighting_params[stage]
base_growth = (
company_data['historical_growth'] * weights['historical'] +
company_data['forward_guidance'] * weights['forward'] +
company_data['analyst_estimates'] * weights['analyst']
)
# 3. Apply quality premium based on ROC
base_growth *= self._calculate_quality_premium(company_data['roc'])
# 4. Apply cyclical cap (from Part 4)
final_growth = self._apply_cyclical_cap(base_growth, company_data['industry'])
# 5. Calculate confidence score
confidence = self._calculate_confidence(stage, company_data)
return {'growth': final_growth, 'confidence': confidence}
The weighting_params dictionary is particularly interesting, as it encodes the strategy for how much to trust different sources of information for different types of companies:
WEIGHTING_PARAMS = {
'quality_mature': {'historical': 0.15, 'forward': 0.60, 'analyst': 0.25},
'growth': {'historical': 0.30, 'forward': 0.40, 'analyst': 0.30},
'cyclical': {'historical': 0.15, 'forward': 0.60, 'analyst': 0.25}, # Discount history
}
Refactoring the logic into this class made the entire system more modular, testable, and, most importantly, extensible. Adding a new rule or a new input signal is now a matter of modifying the selector, not rewriting the entire codebase.
Conclusion: The Model as a Platform & An Invitation to Collaborate
This journey has been about more than just fixing bugs. It's been about transforming a collection of scripts into a robust, rules-based valuation engine. The model is now less of a black box and more of a transparent platform for codifying investment logic.
But no model is ever "finished." This is where the real fun begins. I'm sharing this technical journey on Substack because I believe the most interesting problems in this space are solved collaboratively. This model is a platform for thought, and I'd love to hear your thoughts on how to extend it.
Here are some of the questions I'm currently grappling with:
- Modeling "Moat": How would you quantitatively distinguish between a wide-moat company and a no-moat company? Could we create a "moat score" based on ROC stability, market share, and gross margin trends, and use it to adjust the terminal growth rate?
- Dynamic Risk Premiums: My country risk premiums are currently static. How could they be made dynamic? Should they be linked to sovereign bond yields, currency volatility, or a political risk index?
- Smarter Cyclical Detection: My current cyclical detection relies on a static list of industries. What are your thoughts on a more dynamic approach? Could we use FCF volatility, correlation to GDP, or even textual analysis of earnings calls to classify a company as cyclical?
- Management Quality: This is the holy grail of qualitative-to-quantitative conversion. How could we score management quality? Capital allocation track record (ROIC vs. WACC), insider ownership, or shareholder-friendly actions (buybacks vs. dilutive acquisitions)?
If these are the kinds of problems that keep you up at night, I'd love to hear your ideas in the comments. Let's build better models together.
