Theory: From Stability to Fragility

Mathematical and conceptual reference for the Minsky Market Simulation.

Minsky’s Financial Instability Hypothesis
Agent-Based Modelling in Financial Markets
Fundamental Value Process
Price Formation Mechanism
Agent Balance Sheets
Leverage, Debt, and Margin Calls
Agent Decision Models
Reinforcement Learning Formulation
Market Instability Metrics
The Minsky Mechanism End-to-End
Limitations and Modelling Assumptions
Further Reading

1. Minsky’s Financial Instability Hypothesis

1.1 Background

Hyman Minsky (1919–1996) was an American economist whose work, largely ignored during his lifetime, became influential after the 2008 global financial crisis. His central claim, the Financial Instability Hypothesis (FIH), inverts the standard equilibrium view of financial markets: rather than markets being self-correcting, stability itself sows the seeds of instability.

The core insight is that during periods of calm, agents systematically underestimate risk and increase leverage. This collective shift in risk-taking transforms a stable system into a fragile one. When a shock eventually arrives, even a small one, the over-leveraged system is unable to absorb it, and cascading forced selling produces a crash far larger than the original shock.

1.2 The Three Financing Regimes

Minsky classified borrowers into three regimes based on their ability to service debt from income:

Hedge finance: the borrower can cover both interest payments and principal repayment from expected cash flows. In the simulation, proxied by low leverage: $L < 1.5$

Speculative finance: the borrower can cover interest payments but cannot repay principal without rolling over the debt. Proxied by moderate leverage: $1.5 \leq L < 3.0$

Ponzi finance: the borrower cannot even cover interest payments from cash flows. Solvency depends entirely on rising asset prices. Proxied by high leverage: $L \geq 3.0$
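The leverage proxies above can be sketched as a small classifier. This is an illustrative sketch only: the function name `financing_regime` is hypothetical, and the thresholds are the ones quoted in the three definitions.

```python
def financing_regime(leverage: float) -> str:
    """Map a leverage ratio L to the Minsky regime proxy used in the text."""
    if leverage < 1.5:
        return "hedge"        # L < 1.5
    if leverage < 3.0:
        return "speculative"  # 1.5 <= L < 3.0
    return "ponzi"            # L >= 3.0


# Example: share of a (made-up) population in each regime.
population = [1.0, 1.2, 1.8, 2.5, 3.4]
counts = {}
for lev in population:
    regime = financing_regime(lev)
    counts[regime] = counts.get(regime, 0) + 1
```

Tracking these counts over time gives a direct view of the hedge-to-Ponzi migration described in the next subsection.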

1.3 Stability Breeds Instability

The critical dynamic is the endogenous transition between regimes. In a calm market, observed volatility is low, recent returns have been positive, defaults are rare, and lenders extend credit at low rates. Rational agents respond by increasing leverage to capture higher returns. Each decision is individually defensible; collectively, the shift from hedge to speculative to Ponzi finance makes the entire system fragile.

The feedback loop:

Low volatility
    → agents increase leverage
    → leveraged buying pushes prices up
    → rising prices validate risk-taking
    → measured volatility stays low (prices trend smoothly upward)
    → agents increase leverage further
    → ...

1.4 The Minsky Moment

A Minsky moment is the point at which the system tips from fragility into crisis. A small negative shock triggers margin calls for Ponzi-financed agents. Forced selling drives prices down. Falling prices trigger margin calls for speculative agents. Further forced selling drives prices down further. The cascade continues until leverage across the system is dramatically reduced, often overshooting fundamental value downward.

The key property: the crash is disproportionate to the trigger. A 2% price fall from a shock can produce a 30% crash through the forced deleveraging spiral.

2. Agent-Based Modelling in Financial Markets

2.1 Why Agent-Based Models

Standard financial models assume homogeneous, fully rational agents, equilibrium pricing, normally distributed returns, and no feedback between agent actions and prices. Real markets exhibit fat-tailed return distributions, volatility clustering, asset price bubbles and crashes, and heterogeneous beliefs.

Agent-based models (ABMs) replace the representative rational agent with a population of heterogeneous, adaptive agents. Each agent follows a simple local rule. Macro-level phenomena such as bubbles, crashes, and volatility clustering emerge from micro-level interactions rather than being imposed by assumption.

2.2 Emergence

In complex systems, emergence refers to properties of the whole that cannot be predicted from the properties of any individual part. In the simulation, no single agent is programmed to cause a crash. The crash emerges from the interaction of individual leverage decisions, price impact, and the margin call mechanism. This is what makes the model a genuine test of Minsky’s hypothesis: instability is not hard-coded; it must arise endogenously.

2.3 Heterogeneous Agents

The simulation uses four agent archetypes, each embodying a different belief about how prices work:

Agent	Belief	Market role
Fundamental	Price reverts to intrinsic value	Stabilising
Momentum	Trends persist	Destabilising
Noise	No systematic view	Provides liquidity and randomness
RL	Learns from reward signal	Adaptive, potentially destabilising

3. Fundamental Value Process

The fundamental value $F_t$ represents the true intrinsic worth of the asset, evolving independently of market prices. It follows a discrete-time geometric Brownian motion:

F_{t+1} = F_t \left(1 + \mu_F + \varepsilon_t\right), \quad \varepsilon_t \sim \mathcal{N}(0, \sigma_F^2)

where $\mu_F$ is the drift (set to 0 in baseline experiments) and $\sigma_F$ is the fundamental volatility (kept small, e.g. 0.003 per step).

The key design choice is keeping $\sigma_F$ small so that large price movements cannot be explained by fundamental shocks alone; any crash must be endogenously generated.

In continuous time, the equivalent process is:

dF = \mu_F F \, dt + \sigma_F F \, dW_t

where $W_t$ is a standard Wiener process.
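A minimal sketch of the discrete-time update, using only the Python standard library. The function name `simulate_fundamental` and the seed argument are illustrative assumptions; the parameter values in the usage line are the baseline ones mentioned above.

```python
import random


def simulate_fundamental(f0, mu_f, sigma_f, n_steps, seed=0):
    """Simulate F_{t+1} = F_t * (1 + mu_F + eps_t), eps_t ~ N(0, sigma_F^2)."""
    rng = random.Random(seed)  # local generator so runs are reproducible
    path = [f0]
    for _ in range(n_steps):
        eps = rng.gauss(0.0, sigma_f)
        path.append(path[-1] * (1.0 + mu_f + eps))
    return path


# Baseline-style run: zero drift, small per-step volatility.
fundamental_path = simulate_fundamental(100.0, 0.0, 0.003, 500)
```

With $\sigma_F = 0.003$ a single step moves $F_t$ by a fraction of a percent, which is why large price swings in the simulation have to come from agent behaviour rather than from this process.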

4. Price Formation Mechanism

4.1 Net Demand

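At each time step, agents submit buy and sell orders. Let $D_t^+$ and $D_t^-$ denote total buy and sell volume across all $N_{\text{active}}$ active agents. Net demand and its per-agent normalisation are: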

D_t = D_t^+ - D_t^-, \qquad \tilde{D}_t = \frac{D_t}{N_{\text{active}}}

Normalising by active agent count prevents price explosions when many agents submit large simultaneous orders.

4.2 Price Impact Function

The price updates via a log-linear impact function:

P_{t+1} = P_t \cdot \exp\!\left(\frac{\alpha \tilde{D}_t}{\lambda} + \eta_t\right), \quad \eta_t \sim \mathcal{N}(0, \sigma_\eta^2)

where $\alpha$ is the price impact coefficient, $\lambda$ is the liquidity parameter, and $\eta_t$ is i.i.d. Gaussian market noise.

The exponential form ensures $P_t > 0$ always. The linear dependence of the log-price change on net order flow echoes the linear price impact rule of Kyle (1985), which was derived for a market with informed traders.
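The price update can be sketched as a single function. This is a sketch under stated assumptions: the name `next_price` is hypothetical, and the guard against a zero active-agent count is an addition not described in the text.

```python
import math
import random


def next_price(price, net_demand, n_active, alpha, lam, sigma_eta, rng):
    """One step of P_{t+1} = P_t * exp(alpha * D_tilde / lambda + eta_t)."""
    d_tilde = net_demand / max(n_active, 1)  # per-agent net demand (guard is an assumption)
    eta = rng.gauss(0.0, sigma_eta)          # i.i.d. Gaussian market noise
    return price * math.exp(alpha * d_tilde / lam + eta)


rng = random.Random(1)
p_up = next_price(100.0, 50.0, 10, 0.1, 1.0, 0.0, rng)     # net buying
p_down = next_price(100.0, -500.0, 10, 0.1, 1.0, 0.0, rng)  # heavy net selling
```

Even under heavy net selling the price stays strictly positive, which is the property the exponential form is chosen for.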

4.3 Forced-Sell Feedback

When margin calls fire in step $t$, the resulting forced sell volume $\Phi_t$ is buffered and injected as additional sell demand in step $t+1$:

D_{t+1} \leftarrow D_{t+1} - \Phi_t

This one-step delay models realistic broker execution lag, and allows the price decline from forced selling to propagate and trigger further margin calls: the mechanism behind the Minsky cascade.
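The one-step delay can be sketched as a tiny buffer that holds the previous step's forced sells. The class name `ForcedSellBuffer` and its interface are hypothetical, chosen only to illustrate the update $D_{t+1} \leftarrow D_{t+1} - \Phi_t$.

```python
class ForcedSellBuffer:
    """Carry forced-sell volume Phi_t from step t into the demand of step t+1."""

    def __init__(self):
        self._pending = 0.0  # Phi from the previous step

    def step(self, voluntary_demand, forced_sells_now):
        # Demand used for this step's price update includes last step's forced sells.
        total_demand = voluntary_demand - self._pending
        # Forced sells triggered now are held back until the next step.
        self._pending = forced_sells_now
        return total_demand


buf = ForcedSellBuffer()
d0 = buf.step(10.0, 4.0)  # margin calls fire now; demand is not yet affected
d1 = buf.step(10.0, 0.0)  # last step's forced sells arrive
d2 = buf.step(10.0, 0.0)  # buffer is empty again
```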

5. Agent Balance Sheets

Each agent $i$ holds cash $c_{i,t}$, shares $q_{i,t}$, and debt $d_{i,t}$.

Asset exposure:

E_{i,t} = P_t \cdot q_{i,t}

Wealth (equity):

W_{i,t} = c_{i,t} + P_t q_{i,t} - d_{i,t}

Leverage:

L_{i,t} = \frac{|E_{i,t}|}{\max(W_{i,t},\, \varepsilon)}

where $\varepsilon > 0$ prevents division by zero. $L = 1$ means fully invested with no borrowing; $L = 2$ means the agent has borrowed an amount equal to their equity.
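The three balance-sheet quantities can be sketched as methods on a small record. The class name `BalanceSheet` and the value of `EPS` are illustrative assumptions.

```python
from dataclasses import dataclass

EPS = 1e-8  # stands in for the small epsilon in the leverage formula (value assumed)


@dataclass
class BalanceSheet:
    cash: float
    shares: float
    debt: float

    def exposure(self, price: float) -> float:
        """E = P * q"""
        return price * self.shares

    def wealth(self, price: float) -> float:
        """W = c + P * q - d"""
        return self.cash + price * self.shares - self.debt

    def leverage(self, price: float) -> float:
        """L = |E| / max(W, eps)"""
        return abs(self.exposure(price)) / max(self.wealth(price), EPS)


unlevered = BalanceSheet(cash=0.0, shares=10.0, debt=0.0)
levered = BalanceSheet(cash=0.0, shares=20.0, debt=1000.0)
```

At a price of 100, `unlevered` has leverage 1 and `levered` has leverage 2, matching the two reference cases in the text.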

5.1 Trade Mechanics

Buying $\Delta q > 0$ shares at price $P_t$: if the cost $P_t \Delta q \leq c_{i,t}$, deduct it from cash. Otherwise use all cash and borrow the shortfall ($d_{i,t}$ increases).

Selling $\Delta q > 0$ shares: proceeds first repay debt, and any remainder is added to cash.

\text{repay} = \min(d_{i,t},\, \text{proceeds})

Note: selling shares at $P_t$ does not increase wealth. It converts exposure into debt repayment (and only then into cash), leaving $W_{i,t}$ unchanged. This is key: when prices fall, leveraged agents cannot restore their equity by selling, because the proceeds are absorbed by debt.
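The buy and sell rules can be sketched as two pure functions on `(cash, shares, debt)`. The function names are hypothetical, and the sketch assumes trades execute at the quoted price with no fees.

```python
def buy(cash, shares, debt, dq, price):
    """Buy dq shares: pay from cash, borrow any shortfall."""
    cost = dq * price
    if cost <= cash:
        return cash - cost, shares + dq, debt
    return 0.0, shares + dq, debt + (cost - cash)


def sell(cash, shares, debt, dq, price):
    """Sell dq shares: proceeds repay debt first, the remainder goes to cash."""
    proceeds = dq * price
    repay = min(debt, proceeds)
    return cash + (proceeds - repay), shares - dq, debt - repay


def wealth(cash, shares, debt, price):
    return cash + price * shares - debt


state = buy(500.0, 0.0, 0.0, 10.0, 100.0)  # costs 1000: uses 500 cash, borrows 500
before = wealth(*state, 100.0)
state_after = sell(*state, 4.0, 100.0)     # 400 of proceeds, all absorbed by debt
after = wealth(*state_after, 100.0)
```

In this example `before` and `after` are equal: the sale shrinks both sides of the balance sheet without changing equity.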

6. Leverage, Debt, and Margin Calls

6.1 Interest Accrual

At the start of each time step, debt compounds at the per-step borrowing rate $r_b$:

d_{i,t+1} = d_{i,t} \cdot (1 + r_b)

For $r_b = 0.0003$, the annualised equivalent $(1 + r_b)^n - 1$ is approximately 16% when a year is taken as $n = 500$ steps (about 8% at $n = 252$), which penalises prolonged heavy leverage.
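The annualised figure depends on how many steps make up a year, which the text does not state. The sketch below compounds the per-step rate for two assumed values; both step counts and the function name `annualised_rate` are assumptions for illustration.

```python
def annualised_rate(r_b, steps_per_year):
    """Compound a per-step borrowing rate over one year: (1 + r_b)^n - 1."""
    return (1.0 + r_b) ** steps_per_year - 1.0


rate_500 = annualised_rate(0.0003, 500)  # roughly 0.16
rate_252 = annualised_rate(0.0003, 252)  # roughly 0.08
```

A figure of about 16% corresponds to roughly 500 steps per year; with 252 steps per year the same per-step rate compounds to about 8%.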

6.2 Forced Liquidation

When leverage exceeds $L_{\max}$, the agent must sell enough shares to bring leverage back to $L_{\max}$. The target share count is:

q^* = \frac{L_{\max} \cdot W_{i,t}}{P_t}

Crucially, selling to reduce leverage does not directly increase wealth: it just reduces the size of both sides of the balance sheet. Wealth only recovers if prices subsequently rise.

Default occurs when wealth is non-positive after full liquidation:

W_{i,t} = c_{i,t} - d_{i,t} \leq 0 \implies \text{default}
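A minimal sketch of the margin-call rule for a long position. The function name `margin_call` and its return convention are hypothetical. The sketch assumes liquidation executes at $P_t$, in which case wealth after full liquidation equals mark-to-market wealth before it, so the default test can be applied up front.

```python
def margin_call(cash, shares, debt, price, l_max, eps=1e-8):
    """Return (shares_to_sell, defaulted) for a long-only agent."""
    wealth = cash + price * shares - debt
    if wealth <= 0.0:
        return shares, True            # liquidate everything; agent defaults
    leverage = abs(price * shares) / max(wealth, eps)
    if leverage <= l_max:
        return 0.0, False              # no margin call
    target_shares = l_max * wealth / price  # q* = L_max * W / P
    return shares - target_shares, False


# Wealth 1000, exposure 4000, so leverage is 4 against a cap of 3.
to_sell, defaulted = margin_call(0.0, 40.0, 3000.0, 100.0, 3.0)
```

Selling the 10 shares repays 1000 of debt, leaving exposure 3000 against unchanged wealth 1000, which is exactly $L_{\max} = 3$.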

6.3 Why Forced Selling Creates a Cascade

The feedback loop operates through three channels:

Price channel: forced selling pushes $P_{t+1}$ down
Wealth channel: falling prices reduce $W_{i,t}$, increasing leverage for all leveraged agents
Contagion channel: rising leverage triggers margin calls for previously safe agents
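The three channels can be combined into a deliberately stripped-down cascade loop. This is a toy sketch, not the simulation's implementation: it assumes zero cash, long-only agents, no voluntary trading, no market noise, and forced sells from one round moving the price in the next. The function name `run_cascade` and all parameter values are hypothetical.

```python
import math


def run_cascade(agents, price, shock, l_max, alpha, lam, max_rounds=100):
    """agents: list of dicts with 'shares' and 'debt' (cash assumed zero)."""
    price *= 1.0 - shock                      # initial exogenous shock
    called = set()
    for _ in range(max_rounds):
        forced = 0.0
        for idx, a in enumerate(agents):
            if a["shares"] <= 0.0:
                continue
            wealth = price * a["shares"] - a["debt"]          # wealth channel
            if wealth <= 0.0:
                sell = a["shares"]                            # default: sell all
            elif price * a["shares"] / wealth > l_max:
                sell = a["shares"] - l_max * wealth / price   # back to L_max
            else:
                continue
            a["shares"] -= sell
            a["debt"] -= sell * price                         # proceeds repay debt
            forced += sell
            called.add(idx)                                   # contagion channel
        if forced == 0.0:
            break
        # Price channel: forced selling depresses the next round's price.
        price *= math.exp(-alpha * forced / (len(agents) * lam))
    return price, len(called)


agents = [
    {"shares": 29.0, "debt": 1900.0},  # leverage 2.9 at P = 100
    {"shares": 25.0, "debt": 1500.0},  # leverage 2.5
    {"shares": 15.0, "debt": 500.0},   # leverage 1.5
    {"shares": 10.0, "debt": 0.0},     # unleveraged
]
final_price, n_called = run_cascade(agents, 100.0, 0.02, 3.0, 0.5, 1.0)
```

After the 2% shock the first agent's leverage rises above 3, so it is forced to sell, and the price ends below the post-shock level of 98. How far the cascade spreads depends on the assumed `alpha`, `lam`, and initial leverage distribution.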

Formally, for $L_i > 1$ (leveraged), a price drop $\Delta P < 0$ causes leverage to increase:

\Delta L_i \approx \frac{q_i \Delta P}{W_i} \left(1 - L_i\right) > 0 \quad \text{when } L_i > 1,\; \Delta P < 0

This is the mathematical core of the Minsky instability mechanism.
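The approximation follows from differentiating $L_i = q_i P / W_i$ with $W_i = c_i + q_i P - d_i$, which gives $\partial L_i / \partial P = (q_i / W_i)(1 - L_i)$. A quick numerical check, with made-up balance-sheet values and hypothetical function names:

```python
def leverage(cash, shares, debt, price):
    wealth = cash + shares * price - debt
    return shares * price / wealth


def delta_leverage_approx(cash, shares, debt, price, dp):
    """First-order estimate: dL ~ (q * dP / W) * (1 - L)."""
    wealth = cash + shares * price - debt
    lev = shares * price / wealth
    return (shares * dp / wealth) * (1.0 - lev)


# Agent with q = 20, d = 1000, c = 0 at P = 100: W = 1000, L = 2.
approx = delta_leverage_approx(0.0, 20.0, 1000.0, 100.0, -5.0)
exact = leverage(0.0, 20.0, 1000.0, 95.0) - leverage(0.0, 20.0, 1000.0, 100.0)
```

Here the first-order estimate is 0.1 and the exact change is about 0.111. The linear formula understates the increase because leverage is convex in price when $L_i > 1$, so the effect accelerates as prices fall further.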

7. Agent Decision Models

7.1 Fundamental Trader

The mispricing signal is:

s_{i,t} = \frac{F_t - P_t}{F_t}

Desired position: $q_{i,t}^* = \kappa_F \cdot s_{i,t} \cdot q_{\max}$

Fundamental traders are stabilising: they provide a force pulling prices toward $F_t$. Without them, prices have no anchor.
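A minimal sketch of the fundamental trader's rule. The function name `fundamental_target` and the example parameter values are hypothetical; any clipping of negative targets (given the no-short-selling assumption listed later) is left out.

```python
def fundamental_target(price, fundamental, kappa_f, q_max):
    """Desired position q* = kappa_F * s * q_max, with s = (F - P) / F."""
    signal = (fundamental - price) / fundamental
    return kappa_f * signal * q_max


underpriced = fundamental_target(90.0, 100.0, 1.0, 50.0)   # P below F: wants to hold more
overpriced = fundamental_target(110.0, 100.0, 1.0, 50.0)   # P above F: wants to hold less
```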

7.2 Momentum Trader

The signal is the rolling return over the last $k$ steps:

m_t = \frac{P_t - P_{t-k}}{P_{t-k}}

Desired position: $q_{i,t}^* = \kappa_M \cdot m_t \cdot q_{\max}$

Momentum traders are destabilising: they amplify price moves. A price rise increases $m_t$, generating more buying, which pushes prices higher: a positive feedback loop.
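A minimal sketch of the momentum rule. The function name `momentum_target` is hypothetical, and returning zero when fewer than $k+1$ prices are available is an assumption about warm-up behaviour.

```python
def momentum_target(prices, k, kappa_m, q_max):
    """Desired position q* = kappa_M * m_t * q_max, with m_t the k-step return."""
    if len(prices) <= k:
        return 0.0  # not enough history yet (assumed behaviour)
    past = prices[-1 - k]
    m_t = (prices[-1] - past) / past
    return kappa_m * m_t * q_max


rising = momentum_target([100.0, 102.0, 104.0, 110.0], 3, 2.0, 10.0)
falling = momentum_target([100.0, 98.0, 95.0, 90.0], 3, 2.0, 10.0)
```

The sign of the target follows the sign of the recent return, which is what makes this rule trend-amplifying.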

The tension between fundamental and momentum traders is what makes the simulation interesting. With only fundamental traders, prices always converge to $F_t$. With only momentum traders, prices diverge. The mixture produces realistic boom-bust dynamics.

7.3 Noise Trader

Noise traders draw an action uniformly from {buy, sell, hold} with random order size. They prevent trivial equilibrium and provide realistic background market activity.
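A minimal sketch of a noise trader. The text says only that the order size is random; drawing it uniformly from `[0, max_size]` is an assumption, as are the function name `noise_order` and the signed-quantity return convention.

```python
import random


def noise_order(rng, max_size):
    """Return a signed order: positive = buy, negative = sell, zero = hold."""
    action = rng.choice(["buy", "sell", "hold"])  # uniform over the three actions
    if action == "hold":
        return 0.0
    size = rng.uniform(0.0, max_size)             # size distribution is assumed
    return size if action == "buy" else -size


rng = random.Random(0)
orders = [noise_order(rng, 5.0) for _ in range(1000)]
```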

8. Reinforcement Learning Formulation

8.1 The Markov Decision Process

The RL agent’s interaction with the market is formalised as an MDP $(\mathcal{S}, \mathcal{A}, \mathcal{T}, \mathcal{R}, \gamma)$: state space, action space, transition kernel, reward function, and discount factor.

8.2 State Space

The observation vector $\mathbf{o}_t \in \mathbb{R}^{11}$ is:

\mathbf{o}_t = \left[\frac{P_t}{F_t},\; r_t,\; \sigma_t^{\text{roll}},\; m_t,\; c_{i,t},\; q_{i,t},\; d_{i,t},\; L_{i,t},\; \bar{L}_t,\; n_t^{\text{MC}},\; \text{DD}_t \right]

Feature	Description
$P_t / F_t$	Price-to-fundamental ratio
$r_t$	Last-step log return
$\sigma_t^{\text{roll}}$	Rolling volatility (20-step)
$m_t$	Rolling momentum (10-step)
$c_{i,t}, q_{i,t}, d_{i,t}$	Agent’s own cash, shares, debt
$L_{i,t}$	Agent’s own leverage
$\bar{L}_t$	Market-wide average leverage
$n_t^{\text{MC}}$	Recent margin calls
$\text{DD}_t$	Current drawdown from peak

8.3 Action Space

The agent chooses a target exposure level (multiple of current wealth):

Action	Target leverage
0	0× (full cash)
1	0.5×
2	1× (unleveraged)
3	1.5×
4	2×
5	3×
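The action table can be sketched as a lookup that converts a discrete action into a target share count. The names `TARGET_LEVERAGE` and `target_shares` are hypothetical, and flooring wealth at zero is an assumption about how a non-positive wealth would be handled.

```python
TARGET_LEVERAGE = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0)  # indexed by action 0..5


def target_shares(action, wealth, price):
    """Shares needed so that exposure = target leverage * current wealth."""
    return TARGET_LEVERAGE[action] * max(wealth, 0.0) / price


q_cash = target_shares(0, 1000.0, 100.0)    # hold no shares
q_double = target_shares(4, 1000.0, 100.0)  # exposure of 2x wealth
```

The difference between the target and the current holding gives the order the agent submits, with any shortfall in cash borrowed as described in Section 5.1.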

8.4 Reward Functions

Three reward functions are compared:

Profit-only: $R_t^{\text{profit}} = \Delta W_t / W_t$

Risk-adjusted: $R_t^{\text{risk}} = \Delta W_t - \lambda \cdot \text{DD}_t - \mu \cdot L_{i,t}$

System-aware: $R_t^{\text{sys}} = \Delta W_t - \lambda \cdot \text{DD}_t - \mu \cdot L_{i,t} - \gamma \cdot \bar{L}_t$

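Here $\lambda$, $\mu$, and $\gamma$ are penalty weights on drawdown, own leverage, and market-wide leverage respectively; they are distinct from the liquidity parameter $\lambda$ of Section 4.2 and the discount factor $\gamma$ of the MDP. The system-aware reward tests whether an agent can be incentivised to internalise systemic risk: the negative externality their leverage imposes on others.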
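The three reward functions can be sketched side by side. Function and argument names are hypothetical; `gamma_sys` is used for the systemic penalty weight to avoid confusion with the discount factor, and drawdown is assumed to be passed as a non-negative magnitude.

```python
def reward_profit(dw, wealth):
    """Profit-only: relative change in wealth."""
    return dw / wealth


def reward_risk(dw, drawdown, own_leverage, lam, mu):
    """Risk-adjusted: penalise drawdown and the agent's own leverage."""
    return dw - lam * drawdown - mu * own_leverage


def reward_system(dw, drawdown, own_leverage, mean_leverage, lam, mu, gamma_sys):
    """System-aware: additionally penalise market-wide average leverage."""
    return reward_risk(dw, drawdown, own_leverage, lam, mu) - gamma_sys * mean_leverage


r_profit = reward_profit(10.0, 1000.0)
r_risk = reward_risk(10.0, 0.5, 2.0, lam=4.0, mu=1.0)
r_sys = reward_system(10.0, 0.5, 2.0, 1.5, lam=4.0, mu=1.0, gamma_sys=2.0)
```

Note that the profit-only reward is normalised by wealth while the other two use the raw wealth change, so the penalty weights need to be scaled to the magnitude of $\Delta W_t$.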

8.5 Deep Q-Network (DQN)

DQN (Mnih et al., 2015) approximates the Q-function with a neural network $Q_\theta(s, a)$. The loss is:

\mathcal{L}(\theta) = \mathbb{E}_{(s,a,r,s') \sim \mathcal{D}} \left[\left(y - Q_\theta(s, a)\right)^2\right], \qquad y = r + \gamma \max_{a'} Q_{\theta^-}(s', a')

Two stabilisation techniques are essential:

Experience replay: break temporal correlations in gradient updates
Target network: frozen weights $\theta^-$ updated every $C$ steps, preventing instability
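The target computation and the replay buffer can be sketched without a deep learning library. Everything here is illustrative: the names `ReplayBuffer`, `td_target`, and `squared_td_error` are hypothetical, the `done` flag for terminal transitions is an addition not discussed in the text, and the Q-values are plain lists standing in for the outputs of $Q_{\theta^-}$.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random."""

    def __init__(self, capacity, seed=0):
        self._items = deque(maxlen=capacity)  # oldest transitions drop out
        self._rng = random.Random(seed)

    def push(self, transition):
        self._items.append(transition)

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between updates.
        return self._rng.sample(list(self._items), batch_size)


def td_target(reward, next_q_values, gamma, done):
    """y = r + gamma * max_a' Q_target(s', a'); just r if the episode ended."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_error(q_sa, target):
    """Per-sample term of the DQN loss."""
    return (target - q_sa) ** 2


y = td_target(1.0, [0.5, 2.0], 0.5, False)
loss_term = squared_td_error(1.5, y)
```

In a full implementation `next_q_values` would come from the frozen network with weights $\theta^-$, which are copied from $\theta$ every $C$ steps.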

9. Market Instability Metrics

Rolling volatility: Standard deviation of log returns over a window $W$.

Mispricing: $M_t = (P_t - F_t) / F_t$. Positive values indicate overvaluation; negative values indicate undervaluation, as in a post-crash undershoot.

Maximum drawdown: $\text{MDD}_t = \min_{s \leq t} (P_s - \max_{u \leq s} P_u) / \max_{u \leq s} P_u$

Sharpe ratio: $\text{SR}_i = (\bar{r}_i / \sigma_{r_i}) \cdot \sqrt{N_{\text{steps per year}}}$

Note: Sharpe is unreliable for bimodal return distributions, which are precisely the conditions Minsky’s theory predicts.
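The price-based metrics above can be sketched with the standard library. Function names are hypothetical, and using the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the sample or population form is intended.

```python
import math
import statistics


def log_returns(prices):
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_volatility(prices, window):
    """Standard deviation of the last `window` log returns."""
    recent = log_returns(prices)[-window:]
    return statistics.pstdev(recent) if len(recent) >= 2 else 0.0


def max_drawdown(prices):
    """Most negative value of (P_s - running peak) / running peak."""
    peak = prices[0]
    mdd = 0.0
    for p in prices:
        peak = max(peak, p)
        mdd = min(mdd, (p - peak) / peak)
    return mdd


def sharpe(returns, steps_per_year):
    """Mean over standard deviation of per-step returns, annualised."""
    if len(returns) < 2:
        return 0.0
    sd = statistics.pstdev(returns)
    if sd == 0.0:
        return 0.0
    return statistics.fmean(returns) / sd * math.sqrt(steps_per_year)


mdd = max_drawdown([100.0, 120.0, 90.0, 110.0])  # peak 120, trough 90
```

With this sign convention the maximum drawdown is non-positive; here the fall from 120 to 90 gives a value of minus 25%.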

Gini coefficient (with agents indexed in ascending order of wealth, $W_1 \leq \dots \leq W_N$):

G = \frac{2 \sum_{i=1}^{N} i W_i}{N \sum_{i=1}^{N} W_i} - \frac{N+1}{N}

High-crash regimes can show low Gini because everyone is equally ruined, so the model distinguishes prosperous equality from catastrophic equality.
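A minimal sketch of the Gini formula. The function name `gini` is hypothetical; the sketch sorts wealth in ascending order before applying the rank weights, and it assumes non-negative wealth, returning zero when total wealth is not positive.

```python
def gini(wealths):
    """G = 2 * sum(i * W_i) / (N * sum(W_i)) - (N + 1) / N, W sorted ascending."""
    w = sorted(wealths)
    n = len(w)
    total = sum(w)
    if n == 0 or total <= 0.0:
        return 0.0  # undefined case, mapped to zero by assumption
    weighted = sum(rank * x for rank, x in enumerate(w, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


equal_rich = gini([1000.0, 1000.0, 1000.0, 1000.0])
equal_poor = gini([1.0, 1.0, 1.0, 1.0])
concentrated = gini([0.0, 0.0, 0.0, 10.0])
```

Both equal populations give a Gini of zero regardless of wealth level, which is why the coefficient on its own cannot separate prosperous equality from catastrophic equality and needs to be read alongside a wealth-level statistic.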

10. The Minsky Mechanism End-to-End

Phase 1: Stable equilibrium (steps 0–100): $P_t \approx F_t$, all agents hedge-financed, low volatility.

Phase 2: Leverage build-up (steps 100–300): Momentum traders and RL agent increase positions. Leveraged buying pushes $P_t > F_t$. Rising prices confirm momentum signals. Speculative-finance agents multiply.

Phase 3: Fragility (steps 300–350): Price significantly above fundamental. High proportion of agents in speculative/Ponzi regimes. System is fragile: margin calls would cascade from any small fall.

Phase 4: Shock and cascade (step 350): Small negative shock triggers first margin calls. Forced selling propagates. Price falls disproportionate to the initial shock.

Phase 5: Post-crash deleveraging (steps 350–500): Surviving agents hold mostly cash. Price undershoots $F_t$. Fundamental traders gradually push price back. The cycle can begin again.

11. Limitations and Modelling Assumptions

Assumption	What is omitted
Single risky asset	No cross-asset contagion or portfolio diversification
No order book	No bid-ask spread, no market depth
No banking sector	No credit contraction, no central bank
No collateral chains	Repo markets and rehypothecation absent
Hard leverage cap	Real margin requirements are dynamic
No short-selling	Agents cannot profit from declining prices
i.i.d. shocks	Real macro shocks are regime-dependent with fat tails

These limitations restrict generalisability but do not invalidate the mechanism: the model provides evidence of a possible channel, not proof that real RL trading systems cause crashes.

12. Further Reading

Minsky’s Financial Instability Hypothesis

Minsky, H.P. (1986). Stabilizing an Unstable Economy. Yale University Press.
Kindleberger, C.P. & Aliber, R.Z. (2005). Manias, Panics, and Crashes. Palgrave Macmillan.

Agent-Based Models in Finance

LeBaron, B. (2006). “Agent-based Computational Finance.” Handbook of Computational Economics, Vol. 2.
Farmer, J.D. & Foley, D. (2009). “The economy needs agent-based modelling.” Nature, 460, 685–686.

Price Impact and Market Microstructure

Kyle, A.S. (1985). “Continuous auctions and insider trading.” Econometrica, 53(6), 1315–1335.
Geanakoplos, J. (2010). “The Leverage Cycle.” NBER Macroeconomics Annual, 24, 1–65.

Reinforcement Learning

Sutton, R.S. & Barto, A.G. (2018). Reinforcement Learning: An Introduction (2nd ed.). MIT Press.
Mnih, V. et al. (2015). “Human-level control through deep reinforcement learning.” Nature, 518, 529–533.