Metropolis-Hastings Algorithm¶
Metropolis-Hastings is a Markov Chain Monte Carlo (MCMC) algorithm for sampling from posterior distributions. It enables uncertainty quantification by generating samples that characterise the posterior distribution over the parameters.
Algorithm Overview¶
The algorithm constructs a Markov chain whose stationary distribution is the target posterior. By proposing random moves and accepting/rejecting them based on the likelihood ratio, it explores the parameter space in proportion to posterior probability.
Key Properties¶
| Property | Value |
|---|---|
| Type | MCMC sampler |
| Parallelisable | Yes (across chains) |
| Output | Posterior samples |
| Best for | Uncertainty quantification |
Mathematical Foundation¶
Given a target distribution \(\pi(\theta)\) (the posterior), the algorithm:
- Proposes a new state from a proposal distribution \(q(\theta' | \theta)\)
- Computes the acceptance probability:

\[
\alpha(\theta, \theta') = \min\left(1, \frac{\pi(\theta')\, q(\theta | \theta')}{\pi(\theta)\, q(\theta' | \theta)}\right)
\]

For the symmetric random-walk proposal used here (\(q(\theta' | \theta) = q(\theta | \theta')\)), the proposal terms cancel and this simplifies to:

\[
\alpha(\theta, \theta') = \min\left(1, \frac{\pi(\theta')}{\pi(\theta)}\right)
\]
Algorithm Steps¶
- Initialise: Start at initial point \(\theta_0\)
- Propose: Generate candidate \(\theta' = \theta_t + \sigma \cdot \mathcal{N}(0, I)\)
- Evaluate: Compute log-likelihood \(\log\mathcal{L}(\theta')\)
- Accept/Reject:
    - Draw \(u \sim \text{Uniform}(0, 1)\)
    - If \(u < \exp(\log\mathcal{L}(\theta') - \log\mathcal{L}(\theta_t))\): accept, setting \(\theta_{t+1} = \theta'\)
    - Else: reject, keeping \(\theta_{t+1} = \theta_t\)
- Repeat: Iterate the propose/evaluate/accept steps for the specified number of iterations
- Return: Chain of samples \(\{\theta_0, \theta_1, \ldots, \theta_T\}\)
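For intuition, the steps above can be written as a short NumPy sketch. This is an illustrative random-walk Metropolis loop, not the library's internal implementation; `log_likelihood` and `theta0` are placeholders you would supply.

```python
import numpy as np

def random_walk_metropolis(log_likelihood, theta0, iterations=1000, step_size=0.1, seed=None):
    """Illustrative random-walk Metropolis sampler following the steps above."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    log_l = log_likelihood(theta)
    chain = [theta.copy()]
    accepted = 0
    for _ in range(iterations):
        # Propose: Gaussian random-walk step around the current state
        proposal = theta + step_size * rng.standard_normal(theta.shape)
        log_l_prop = log_likelihood(proposal)
        # Accept/Reject: compare log-likelihoods via a uniform draw
        if np.log(rng.uniform()) < log_l_prop - log_l:
            theta, log_l = proposal, log_l_prop
            accepted += 1
        chain.append(theta.copy())
    return np.array(chain), accepted / iterations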
Parameters¶
| Parameter | Default | Description |
|---|---|---|
| iterations | 1000 | MCMC iterations per chain |
| num_chains | 1 | Number of independent chains |
| step_size | 0.1 | Proposal distribution width \(\sigma\) |
| seed | None | Random seed for reproducibility |
Tuning Guidance¶
Step size controls exploration efficiency:
- Target acceptance rate: 20-50% (optimal ~23% for high dimensions)
- Too large: low acceptance rate, chain gets stuck
- Too small: high acceptance rate but slow mixing
- Tune by monitoring the acceptance rate, as in the sketch below
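One quick way to estimate the realised acceptance rate is sketched below: rejected proposals repeat the previous state, so the fraction of consecutive samples that changed approximates the acceptance rate. This assumes a single chain laid out as an (iterations, n_params) array; if the library reports an acceptance rate directly, prefer that.

```python
import numpy as np

def empirical_acceptance_rate(chain):
    """Estimate the acceptance rate of a single chain of shape (iterations, n_params)."""
    chain = np.asarray(chain)
    # Rows that differ from their predecessor correspond to accepted proposals
    moved = np.any(chain[1:] != chain[:-1], axis=1)
    return moved.mean()
```

If the estimate falls well outside the 20-50% band, adjust step_size and rerun.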
Number of chains aids convergence diagnostics:
- Multiple chains from different starting points detect convergence issues
- Enables Gelman-Rubin \(\hat{R}\) diagnostic
- More chains = more robust inference but higher cost
Iterations should be sufficient for:
- Burn-in period (discard initial samples)
- Effective sample size for reliable estimates; a rough sketch of both follows this list
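A rough sketch of discarding burn-in and estimating the effective sample size, assuming `chain` is a 1-D array of draws for one parameter from a single chain. The 20% burn-in fraction and the simple autocorrelation truncation are common rules of thumb, not library defaults.

```python
import numpy as np

def effective_sample_size(x):
    """Crude effective sample size for a 1-D chain of one parameter."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # Normalised autocorrelation at lags 0..n-1
    acf = np.correlate(x, x, mode="full")[n - 1:] / (np.arange(n, 0, -1) * x.var())
    rho_sum = 0.0
    for rho in acf[1:]:
        if rho < 0:          # truncate at the first negative autocorrelation
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)

burn_in = int(0.2 * len(chain))   # discard the first 20% as burn-in (rule of thumb)
ess = effective_sample_size(chain[burn_in:])
```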
Convergence Diagnostics¶
After sampling, assess convergence:
- Trace plots: Visual inspection for stationarity
- Acceptance rate: Should be 20-50%
- Autocorrelation: Lower is better; high autocorrelation means inefficient sampling
- \(\hat{R}\) statistic: Should be < 1.1 for all parameters (requires multiple chains); a sketch follows this list
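A minimal sketch of the Gelman-Rubin \(\hat{R}\) for a single parameter is shown below. It assumes you have already split the pooled samples into an array of shape (n_chains, n_iterations) for that parameter; check the actual layout of result.samples before reshaping.

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin R-hat for one parameter.

    `chains` has shape (n_chains, n_iterations). Values close to 1
    (conventionally below 1.1) are consistent with convergence.
    """
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    within = chains.var(axis=1, ddof=1).mean()        # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)     # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n      # pooled variance estimate
    return np.sqrt(var_hat / within)
```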
When to Use¶
Strengths:
- Full posterior characterisation
- Uncertainty quantification (credible intervals)
- Parameter correlation analysis
- Simple and robust
Limitations:
- Many function evaluations required
- Can be slow for high-dimensional problems
- Requires careful tuning for efficiency
- Does not compute model evidence (use nested sampling)
Cost Metric Requirement¶
Metropolis-Hastings requires a negative log-likelihood cost metric:
```python
problem = (
    diffid.ScalarProblemBuilder()
    .with_objective(model_fn)
    .with_cost_metric(diffid.CostMetric.GaussianNLL)  # Required
    .build()
)
```
Example¶
```python
import diffid

sampler = (
    diffid.MetropolisHastings()
    .with_iterations(5000)
    .with_num_chains(4)
    .with_step_size(0.05)
    .with_seed(42)
)

result = sampler.run(problem, initial_guess=[1.0, 2.0])

# Access samples
samples = result.samples  # Shape: (n_chains * iterations, n_params)
```
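With the samples in hand, credible intervals follow directly from empirical quantiles. A minimal sketch, assuming samples is the pooled (n_chains * iterations, n_params) array above and that burn-in has already been handled:

```python
import numpy as np

# Posterior means and 95% credible intervals from the pooled samples
means = samples.mean(axis=0)
lower, upper = np.percentile(samples, [2.5, 97.5], axis=0)
for i in range(samples.shape[1]):
    print(f"theta[{i}]: mean = {means[i]:.3f}, 95% CI = ({lower[i]:.3f}, {upper[i]:.3f})")
```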
References¶
- Metropolis, N. et al. (1953). "Equation of State Calculations by Fast Computing Machines". Journal of Chemical Physics, 21(6), 1087-1092.
- Hastings, W.K. (1970). "Monte Carlo Sampling Methods Using Markov Chains and Their Applications". Biometrika, 57(1), 97-109.
- Gelman, A. et al. (2013). Bayesian Data Analysis. 3rd ed. Chapman & Hall/CRC.