Optimisers¶
Optimisation algorithms for finding parameter values that minimise the objective function.
Overview¶
| Optimiser | Type | Best For | Parameter count |
|---|---|---|---|
| NelderMead | Gradient-free | Local search, < 10 params, noisy objectives | 2-10 |
| CMAES | Gradient-free | Global search, 10-100+ params | 1-100+ |
| Adam | Gradient-based | Smooth objectives, fast convergence | Any |
Nelder-Mead¶
NelderMead ¶
Classic simplex-based direct search optimiser.
with_threshold ¶
Set the stopping threshold on simplex size or objective reduction.
with_position_tolerance ¶
Stop once simplex vertices fall within the supplied positional tolerance.
with_max_evaluations ¶
Abort after evaluating the objective max_evaluations times.
with_coefficients ¶
Override the reflection, expansion, contraction, and shrink coefficients.
with_patience ¶
Abort if the objective fails to improve within the allotted time.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
patience | float or timedelta | Either seconds (float) or a timedelta object | required |
run ¶
Optimise the given problem starting from the provided initial simplex centre.
init ¶
Initialize ask-tell optimization state.
Returns a NelderMeadState object that can be used for incremental optimization via the ask-tell interface.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
initial | list[float] | Initial parameter vector (simplex center) | required |
bounds | list[tuple[float, float]] | Parameter bounds as [(lower, upper), ...]. If None, unbounded. | None |
Returns:
| Type | Description |
|---|---|
NelderMeadState | State object for ask-tell optimization |
Examples:
Example Usage¶
import diffid

# Create optimiser with custom settings
optimiser = (
    diffid.NelderMead()
    .with_max_iter(5000)
    .with_step_size(0.1)
    .with_threshold(1e-6)
)
# Run optimisation
result = optimiser.run(problem, initial_guess=[1.0, 2.0])
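For incremental use, init returns a NelderMeadState for ask-tell optimisation. A sketch of that loop follows; the ask() and tell() method names are not documented on this page, so treat them as placeholders and check the NelderMeadState reference:

import diffid

# Build ask-tell state around an initial simplex centre with box bounds
state = diffid.NelderMead().init(
    initial=[1.0, 2.0],
    bounds=[(0.0, 10.0), (0.0, 10.0)],
)

for _ in range(200):
    candidate = state.ask()        # hypothetical: next point to evaluate
    value = objective(candidate)   # user-supplied objective function
    state.tell(candidate, value)   # hypothetical: report the result back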
When to Use¶
Advantages:
- No gradient computation required
- Robust to noisy objectives
- Simple and reliable for small problems
- Default choice for quick exploration
Limitations:
- Slow convergence for > 10 parameters
- Can get stuck in local minima
- Performance degrades with dimensionality
Typical Use Cases:
- Initial parameter exploration
- Noisy experimental data
- Small-scale problems (< 10 parameters)
Parameter Tuning¶
- step_size: Initial simplex size (default: 1.0)
  - Larger values explore more globally
  - Smaller values for local refinement
  - Start with 10-50% of parameter range
- threshold: Convergence tolerance (default: 1e-6)
  - Smaller for higher precision
  - Larger for faster termination
- max_iter: Maximum iterations (default: 1000)
  - Rule of thumb: 100 * n_parameters minimum
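Putting those rules of thumb together for, say, a 4-parameter problem whose parameters each span roughly 0-10 (the numbers are illustrative; with_step_size and with_max_iter are taken from the example above):

import diffid

n_params = 4
param_range = 10.0

optimiser = (
    diffid.NelderMead()
    .with_step_size(0.2 * param_range)         # 10-50% of the parameter range
    .with_max_iter(max(1000, 100 * n_params))  # at least 100 * n_parameters
    .with_threshold(1e-6)                      # default precision
)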
See the Nelder-Mead Algorithm Guide for more details.
CMA-ES¶
CMAES ¶
Covariance Matrix Adaptation Evolution Strategy optimiser.
with_max_iter ¶
Limit the number of iterations/generations before termination.
with_patience ¶
Abort the run if no improvement occurs for the given wall-clock duration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
patience | float or timedelta | Either seconds (float) or a timedelta object | required |
with_population_size ¶
Specify the number of offspring evaluated per generation.
init ¶
Initialize ask-tell optimization state.
Returns a CMAESState object that can be used for incremental optimization via the ask-tell interface.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
initial | list[float] | Initial mean vector for the search distribution | required |
bounds | list[tuple[float, float]] | Parameter bounds as [(lower, upper), ...]. If None, unbounded. | None |
Returns:
| Type | Description |
|---|---|
CMAESState | State object for ask-tell optimization |
Examples:
Example Usage¶
import diffid

# Create CMA-ES optimiser
optimiser = (
    diffid.CMAES()
    .with_max_iter(1000)
    .with_step_size(0.5)
    .with_population_size(20)
    .with_seed(42)  # For reproducibility
)
result = optimiser.run(problem, initial_guess=[0.5, 0.5])
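Because CMA-ES evaluates a whole population per generation, the ask-tell state pairs naturally with parallel evaluation. The sketch below uses the standard library's process pool; the ask() and tell() method names on CMAESState, and the assumption that ask() returns the generation's candidate list, are not documented on this page and should be checked against the CMAESState reference:

from concurrent.futures import ProcessPoolExecutor

import diffid

state = diffid.CMAES().init(initial=[0.5, 0.5])

with ProcessPoolExecutor() as pool:
    for _ in range(100):
        candidates = state.ask()                        # hypothetical: one generation of candidates
        values = list(pool.map(objective, candidates))  # objective must be a picklable, module-level function
        state.tell(candidates, values)                  # hypothetical: report all results back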
When to Use¶
Advantages:
- Global optimisation (avoids local minima)
- Scales to high dimensions (10-100+ parameters)
- Parallelisable (evaluates population in parallel)
- Self-adapting (no gradient tuning needed)
Limitations:
- More function evaluations than gradient methods
- Requires population-sized memory
- Stochastic (results vary between runs)
Typical Use Cases:
- Global parameter search
- High-dimensional problems (> 10 parameters)
- Multi-modal landscapes
- Parallel hardware available
Parameter Tuning¶
- step_size: Initial search radius (default: 1.0)
  - Start with ~⅓ of expected parameter range
  - Too large: slow convergence
  - Too small: premature convergence
- population_size: Offspring per generation (default: 4 + floor(3 * ln(n_params)))
  - Larger populations explore more but cost more
  - Typical range: 10-100
  - Match to available parallelism when evaluating in parallel
- max_iter: Maximum generations (default: 1000)
  - Each iteration evaluates population_size candidates
  - Total evaluations = max_iter * population_size (see the budget sketch after this list)
- threshold: Objective value threshold (default: 1e-6)
  - Stop when best value < threshold
- seed: Random seed for reproducibility
  - Omit for non-deterministic runs
  - Set for reproducible benchmarks
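As a quick budget check, the default population size and the total number of objective evaluations follow directly from the formulas above (plain Python, no diffid API involved):

import math

n_params = 20
max_iter = 1000

# Default population size: 4 + floor(3 * ln(n_params))
population_size = 4 + math.floor(3 * math.log(n_params))  # 12 for 20 parameters

# Each generation evaluates population_size candidates
total_evaluations = max_iter * population_size            # 12,000 evaluations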
See the CMA-ES Algorithm Guide for more details.
Adam¶
Adam ¶
Adaptive Moment Estimation (Adam) gradient-based optimiser.
with_betas ¶
Override the exponential decay rates for the first and second moments.
with_patience ¶
Abort the run once the patience window has elapsed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
patience | float or timedelta | Either seconds (float) or a timedelta object | required |
init ¶
Initialize ask-tell optimization state.
Returns an AdamState object that can be used for incremental optimization via the ask-tell interface.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
initial | list[float] | Initial parameter vector | required |
bounds | list[tuple[float, float]] | Parameter bounds as [(lower, upper), ...]. If None, unbounded. | None |
Returns:
| Type | Description |
|---|---|
AdamState | State object for ask-tell optimization |
Examples:
Example Usage¶
import diffid

# Create Adam optimiser
optimiser = (
    diffid.Adam()
    .with_max_iter(5000)
    .with_step_size(0.01)  # Learning rate
    .with_betas(0.9, 0.999)
    .with_threshold(1e-6)
)
result = optimiser.run(problem, initial_guess=[1.0, 2.0])
When to Use¶
Advantages:
- Fast convergence on smooth objectives
- Adaptive learning rate
- Well-suited for large-scale problems
- Efficient (uses gradients)
Limitations:
- Requires automatic differentiation
- Can get stuck in local minima
- Sensitive to learning rate tuning
Typical Use Cases:
- Smooth, differentiable objectives
- Large-scale problems
- When gradients are available or cheap to compute
Parameter Tuning¶
- step_size: Learning rate (default: 0.001)
  - Most critical parameter
  - Too large: oscillation or divergence
  - Too small: slow convergence
  - Try: 0.1, 0.01, 0.001, 0.0001 (see the sweep sketch after this list)
- betas: Momentum decay rates (default: (0.9, 0.999))
  - beta1: First moment (mean) decay
  - beta2: Second moment (variance) decay
  - Rarely need tuning; defaults work well
- eps: Numerical stability constant (default: 1e-8)
  - Prevents division by zero
  - Almost never needs tuning
- threshold: Gradient norm threshold (default: 1e-6)
  - Stop when gradient is small
  - Smaller for higher precision
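Because step_size dominates Adam's behaviour, a quick sweep over the suggested values is often worthwhile. A sketch, assuming a problem object as in the example above and keeping the run with the lowest objective value:

import diffid

best = None
for lr in [0.1, 0.01, 0.001, 0.0001]:
    optimiser = diffid.Adam().with_step_size(lr).with_max_iter(5000)
    result = optimiser.run(problem, initial_guess=[1.0, 2.0])
    if best is None or result.value < best.value:
        best = result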
See the Adam Algorithm Guide for more details.
Common Patterns¶
Running Optimisation¶
All optimisers have a .run() method that takes a problem and an initial parameter guess and returns a result object. Nelder-Mead is the default optimiser, so it needs no configuration for a quick run.
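A minimal sketch of both, assuming a problem object constructed as in the examples above:

import diffid

# Any optimiser: configure it, then call .run() with the problem and an initial guess
result = diffid.CMAES().with_max_iter(1000).run(problem, initial_guess=[0.5, 0.5])

# The default optimiser (Nelder-Mead) with its default settings
result = diffid.NelderMead().run(problem, initial_guess=[0.5, 0.5])
print(result.value)  # best objective value found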
Configuring Stopping Criteria¶
optimiser = (
diffid.CMAES()
.with_max_iter(10000) # Maximum iterations
.with_threshold(1e-8) # Objective threshold
.with_patience(300.0) # Patience in seconds
)
The optimiser stops when any one of the following is met:

1. max_iter iterations reached, OR
2. Objective value < threshold, OR
3. patience seconds elapsed without improvement
Reproducibility¶
For stochastic optimisers (CMA-ES), set a seed:
optimiser = diffid.CMAES().with_seed(42)
result1 = optimiser.run(problem, [0.0, 0.0])
result2 = optimiser.run(problem, [0.0, 0.0])
# result1 == result2 (same random sequence)
Warm Starts¶
Run several optimisations from different starting points and keep the best result:
initial_guesses = [
[1.0, 1.0],
[-1.0, -1.0],
[0.0, 2.0],
]
results = [optimiser.run(problem, guess) for guess in initial_guesses]
best_result = min(results, key=lambda r: r.value)
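The pattern above is really a multi-start search. A warm start proper feeds the best parameters from one run into the next, for example a coarse global CMA-ES pass followed by local Nelder-Mead refinement. In the sketch below, global_result.parameters is a hypothetical attribute name for the optimised parameter vector; substitute whatever the result type actually exposes:

import diffid

# Stage 1: coarse global search
global_result = diffid.CMAES().with_max_iter(200).run(problem, initial_guess=[0.0, 0.0])

# Stage 2: local refinement warm-started from the global optimum
# NOTE: `.parameters` is a hypothetical attribute name for the best parameter vector
local_result = diffid.NelderMead().with_threshold(1e-8).run(
    problem, initial_guess=list(global_result.parameters)
)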
Choosing an Optimiser¶
graph TD
A[Start] --> B{Gradients available?}
B -->|Yes| C[Adam]
B -->|No| D{Problem size?}
D -->|< 10 params| E[Nelder-Mead]
D -->|> 10 params| F[CMA-ES]
D -->|Need global search| F

For detailed guidance, see:

- Choosing an Optimiser Guide
- Tuning Optimisers Guide
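The same decision tree can be written out as a small helper. This is an illustrative sketch, not part of the diffid API: the choose_optimiser function and its arguments are made up here, and only the optimiser constructors shown on this page are assumed to exist.

import diffid

def choose_optimiser(n_params: int, gradients_available: bool, need_global_search: bool = False):
    """Hypothetical helper mirroring the decision diagram above."""
    if gradients_available:
        return diffid.Adam()        # smooth objectives, gradients available
    if need_global_search or n_params > 10:
        return diffid.CMAES()       # global search / higher-dimensional problems
    return diffid.NelderMead()      # small, possibly noisy local problems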