Optimisers

Optimisation algorithms for finding parameter values that minimise the objective function.

Overview

Optimiser    Type            Best For                              Parameters
NelderMead   Gradient-free   Local search, < 10 params, noisy      2-10
CMAES        Gradient-free   Global search, 10-100+ params         1-100+
Adam         Gradient-based  Smooth objectives, fast convergence   Any

Nelder-Mead

NelderMead

Classic simplex-based direct search optimiser.

__new__

__new__()

Create a Nelder-Mead optimiser with default coefficients.

with_step_size

with_step_size(step_size)

Set the initial step-size used to construct the starting simplex.

with_max_iter

with_max_iter(max_iter)

Limit the number of simplex iterations.

with_threshold

with_threshold(threshold)

Set the stopping threshold on simplex size or objective reduction.

with_position_tolerance

with_position_tolerance(tolerance)

Stop once simplex vertices fall within the supplied positional tolerance.

with_max_evaluations

with_max_evaluations(max_evaluations)

Abort after evaluating the objective max_evaluations times.

with_coefficients

with_coefficients(alpha, gamma, rho, sigma)

Override the reflection, expansion, contraction, and shrink coefficients.
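
As an illustration, the classic textbook coefficients (reflection 1.0, expansion 2.0, contraction 0.5, shrink 0.5) would be set as below; these are not necessarily the library defaults:

import diffid

# Classic Nelder-Mead coefficients: alpha (reflection), gamma (expansion),
# rho (contraction), sigma (shrink)
optimiser = diffid.NelderMead().with_coefficients(1.0, 2.0, 0.5, 0.5)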

with_patience

with_patience(patience)

Abort if the objective fails to improve within the allotted time.

Parameters:

  • patience (float or timedelta, required): Either seconds (float) or a timedelta object
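
Both accepted forms side by side (the five-minute budget used here is arbitrary):

from datetime import timedelta

import diffid

# The two calls below express the same five-minute no-improvement budget
optimiser = diffid.NelderMead().with_patience(300.0)
optimiser = diffid.NelderMead().with_patience(timedelta(minutes=5))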

run

run(problem, initial)

Optimise the given problem starting from the provided initial simplex centre.

init

init(initial, bounds=None)

Initialize ask-tell optimization state.

Returns a NelderMeadState object that can be used for incremental optimization via the ask-tell interface.

Parameters:

  • initial (list[float], required): Initial parameter vector (simplex center)
  • bounds (list[tuple[float, float]], default None): Parameter bounds as [(lower, upper), ...]. If None, unbounded.

Returns:

  • NelderMeadState: State object for ask-tell optimization

Examples:

>>> optimiser = diffid.NelderMead()
>>> state = optimiser.init(initial=[1.0, 2.0])
>>> while True:
...     result = state.ask()
...     if isinstance(result, diffid.Done):
...         break
...     values = [evaluate(pt) for pt in result.points]
...     state.tell(values)
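
The evaluate function in the doctest is user-supplied. A minimal end-to-end sketch, assuming a toy sphere objective and the bounds shape documented above:

import diffid

def sphere(x):
    # Toy objective: sum of squares, minimised at the origin
    return sum(xi * xi for xi in x)

optimiser = diffid.NelderMead().with_max_iter(200)
state = optimiser.init(initial=[1.0, 2.0], bounds=[(-5.0, 5.0), (-5.0, 5.0)])

while True:
    result = state.ask()
    if isinstance(result, diffid.Done):
        break
    # Evaluate every requested point and report the values back
    state.tell([sphere(point) for point in result.points])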

Example Usage

import diffid

# Create optimiser with custom settings
optimiser = (
    diffid.NelderMead()
    .with_max_iter(5000)
    .with_step_size(0.1)
    .with_threshold(1e-6)
)

# Run optimisation
result = optimiser.run(problem, initial=[1.0, 2.0])

When to Use

Advantages:

  • No gradient computation required
  • Robust to noisy objectives
  • Simple and reliable for small problems
  • Default choice for quick exploration

Limitations:

  • Slow convergence for > 10 parameters
  • Can get stuck in local minima
  • Performance degrades with dimensionality

Typical Use Cases:

  • Initial parameter exploration
  • Noisy experimental data
  • Small-scale problems (< 10 parameters)

Parameter Tuning

  • step_size: Initial simplex size (default: 1.0)

    • Larger values explore more globally
    • Smaller values for local refinement
    • Start with 10-50% of parameter range
  • threshold: Convergence tolerance (default: 1e-6)

    • Smaller for higher precision
    • Larger for faster termination
  • max_iter: Maximum iterations (default: 1000)

    • Rule of thumb: 100 * n_parameters minimum
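
The rules of thumb above can be applied directly when building the optimiser; a sketch, assuming a 2-parameter problem whose parameters each span roughly [0, 10]:

import diffid

initial_guess = [1.0, 2.0]
n_params = len(initial_guess)
param_range = 10.0  # rough span of each parameter (assumed here)

optimiser = (
    diffid.NelderMead()
    .with_step_size(0.25 * param_range)        # 10-50% of the parameter range
    .with_max_iter(max(1000, 100 * n_params))  # at least 100 iterations per parameter
    .with_threshold(1e-6)
)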

See the Nelder-Mead Algorithm Guide for more details.


CMA-ES

CMAES

Covariance Matrix Adaptation Evolution Strategy optimiser.

__new__

__new__()

Create a CMA-ES optimiser with library defaults.

with_max_iter

with_max_iter(max_iter)

Limit the number of iterations/generations before termination.

with_threshold

with_threshold(threshold)

Set the stopping threshold on the best objective value.

with_step_size

with_step_size(step_size)

Set the initial global step-size (standard deviation).

with_patience

with_patience(patience)

Abort the run if no improvement occurs for the given wall-clock duration.

Parameters:

  • patience (float or timedelta, required): Either seconds (float) or a timedelta object

with_population_size

with_population_size(population_size)

Specify the number of offspring evaluated per generation.

with_seed

with_seed(seed)

Initialise the internal RNG for reproducible runs.

run

run(problem, initial)

Optimise the given problem starting from the provided mean vector.

init

init(initial, bounds=None)

Initialize ask-tell optimization state.

Returns a CMAESState object that can be used for incremental optimization via the ask-tell interface.

Parameters:

  • initial (list[float], required): Initial mean vector for the search distribution
  • bounds (list[tuple[float, float]], default None): Parameter bounds as [(lower, upper), ...]. If None, unbounded.

Returns:

  • CMAESState: State object for ask-tell optimization

Examples:

>>> optimiser = diffid.CMAES()
>>> state = optimiser.init(initial=[1.0, 2.0])
>>> while True:
...     result = state.ask()
...     if isinstance(result, diffid.Done):
...         break
...     values = [evaluate(pt) for pt in result.points]
...     state.tell(values)
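
Because each ask proposes a whole generation, the evaluations can be farmed out in parallel; a sketch using the standard library, with evaluate again standing in for the real objective:

from concurrent.futures import ProcessPoolExecutor

import diffid

def evaluate(x):
    # Placeholder objective; replace with the real (possibly expensive) model
    return sum(xi * xi for xi in x)

if __name__ == "__main__":
    optimiser = diffid.CMAES().with_population_size(20).with_seed(42)
    state = optimiser.init(initial=[1.0, 2.0])
    with ProcessPoolExecutor() as pool:
        while True:
            result = state.ask()
            if isinstance(result, diffid.Done):
                break
            # Evaluate the whole generation in parallel, preserving point order
            state.tell(list(pool.map(evaluate, result.points)))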

Example Usage

import diffid

# Create CMA-ES optimiser
optimiser = (
    diffid.CMAES()
    .with_max_iter(1000)
    .with_step_size(0.5)
    .with_population_size(20)
    .with_seed(42)  # For reproducibility
)

result = optimiser.run(problem, initial=[0.5, 0.5])

When to Use

Advantages:

  • Global optimisation (avoids local minima)
  • Scales to high dimensions (10-100+ parameters)
  • Parallelisable (evaluates population in parallel)
  • Self-adapting (no gradient tuning needed)

Limitations:

  • More function evaluations than gradient methods
  • Requires population-sized memory
  • Stochastic (results vary between runs)

Typical Use Cases:

  • Global parameter search
  • High-dimensional problems (> 10 parameters)
  • Multi-modal landscapes
  • Parallel hardware available

Parameter Tuning

  • step_size: Initial search radius (default: 1.0)

    • Start with ~⅓ of expected parameter range
    • Too large: slow convergence
    • Too small: premature convergence
  • population_size: Offspring per generation (default: 4 + floor(3*ln(n_params)))

    • Larger populations explore more but cost more
    • Typical range: 10-100
    • Ideally sized to match available parallelism
  • max_iter: Maximum generations (default: 1000)

    • Each iteration evaluates population_size candidates
    • Total evaluations = max_iter * population_size
  • threshold: Objective value threshold (default: 1e-6)

    • Stop when best value < threshold
  • seed: Random seed for reproducibility

    • Omit for non-deterministic runs
    • Set for reproducible benchmarks
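
The quoted defaults can be reproduced (or overridden) explicitly; a sketch, assuming four parameters that each span roughly [0, 3]:

import math

import diffid

initial = [0.5, 0.5, 0.5, 0.5]
n_params = len(initial)

default_population = 4 + math.floor(3 * math.log(n_params))  # 4 + floor(3*ln(n_params))

optimiser = (
    diffid.CMAES()
    .with_step_size(1.0)                       # ~1/3 of the expected parameter range
    .with_population_size(default_population)  # raise this to match available parallelism
    .with_max_iter(1000)                       # total evaluations = max_iter * population_size
    .with_seed(42)
)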

See the CMA-ES Algorithm Guide for more details.


Adam

Adam

Adaptive Moment Estimation (Adam) gradient-based optimiser.

__new__

__new__()

Create an Adam optimiser with library defaults.

with_max_iter

with_max_iter(max_iter)

Limit the maximum number of optimisation iterations.

with_threshold

with_threshold(threshold)

Set the stopping threshold on the gradient norm.

with_step_size

with_step_size(step_size)

Configure the base learning rate / step size.

with_betas

with_betas(beta1, beta2)

Override the exponential decay rates for the first and second moments.

with_eps

with_eps(eps)

Override the numerical stability constant added to the denominator.

with_patience

with_patience(patience)

Abort the run if the objective fails to improve within the patience window.

Parameters:

  • patience (float or timedelta, required): Either seconds (float) or a timedelta object

run

run(problem, initial)

Optimise the given problem using Adam starting from the provided point.

init

init(initial, bounds=None)

Initialize ask-tell optimization state.

Returns an AdamState object that can be used for incremental optimization via the ask-tell interface.

Parameters:

  • initial (list[float], required): Initial parameter vector
  • bounds (list[tuple[float, float]], default None): Parameter bounds as [(lower, upper), ...]. If None, unbounded.

Returns:

  • AdamState: State object for ask-tell optimization

Examples:

>>> optimiser = diffid.Adam()
>>> state = optimiser.init(initial=[1.0, 2.0])
>>> while True:
...     result = state.ask()
...     if isinstance(result, diffid.Done):
...         break
...     values = [evaluate_with_gradient(pt) for pt in result.points]
...     state.tell(values)
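
Adam consumes gradients, so each evaluation has to report both the objective value and its gradient. A sketch for a quadratic with an analytic gradient; the (value, gradient) pair format is inferred from the doctest's evaluate_with_gradient name and is an assumption, so check it against the library's actual tell contract:

import diffid

def evaluate_with_gradient(x):
    # Quadratic objective f(x) = sum(x_i^2) with analytic gradient 2*x
    value = sum(xi * xi for xi in x)
    gradient = [2.0 * xi for xi in x]
    return value, gradient  # assumed (value, gradient) format

optimiser = diffid.Adam().with_step_size(0.01)
state = optimiser.init(initial=[1.0, 2.0])

while True:
    result = state.ask()
    if isinstance(result, diffid.Done):
        break
    state.tell([evaluate_with_gradient(pt) for pt in result.points])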

Example Usage

import diffid

# Create Adam optimiser
optimiser = (
    diffid.Adam()
    .with_max_iter(5000)
    .with_step_size(0.01)  # Learning rate
    .with_betas(0.9, 0.999)
    .with_threshold(1e-6)
)

result = optimiser.run(problem, initial=[1.0, 2.0])

When to Use

Advantages:

  • Fast convergence on smooth objectives
  • Adaptive learning rate
  • Well-suited for large-scale problems
  • Efficient (uses gradients)

Limitations:

  • Requires automatic differentiation
  • Can get stuck in local minima
  • Sensitive to learning rate tuning

Typical Use Cases:

  • Smooth, differentiable objectives
  • Large-scale problems
  • When gradients are available or cheap to compute

Parameter Tuning

  • step_size: Learning rate (default: 0.001)

    • Most critical parameter
    • Too large: oscillation or divergence
    • Too small: slow convergence
    • Try: 0.1, 0.01, 0.001, 0.0001
  • betas: Momentum decay rates (default: (0.9, 0.999))

    • beta1: First moment (mean) decay
    • beta2: Second moment (variance) decay
    • Rarely need tuning, defaults work well
  • eps: Numerical stability constant (default: 1e-8)

    • Prevents division by zero
    • Almost never needs tuning
  • threshold: Gradient norm threshold (default: 1e-6)

    • Stop when gradient is small
    • Smaller for higher precision
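
Because the learning rate dominates, a quick sweep over the suggested values is often the fastest way to choose one; a sketch, assuming problem and initial_guess are already defined and that results expose .value as in the warm-start example under Common Patterns:

import diffid

candidate_rates = [0.1, 0.01, 0.001, 0.0001]

results = {
    rate: diffid.Adam().with_step_size(rate).with_max_iter(2000).run(problem, initial_guess)
    for rate in candidate_rates
}

# Keep the run that reaches the lowest objective value
best_rate, best_result = min(results.items(), key=lambda item: item[1].value)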

See the Adam Algorithm Guide for more details.


Common Patterns

Running Optimisation

All optimisers have a .run() method:

result = optimiser.run(problem, initial_guess)

For the default optimiser (Nelder-Mead):

result = problem.optimise()  # Uses Nelder-Mead with defaults

Configuring Stopping Criteria

optimiser = (
    diffid.CMAES()
    .with_max_iter(10000)         # Maximum iterations
    .with_threshold(1e-8)         # Objective threshold
    .with_patience(300.0)         # Patience in seconds
)

The optimiser stops when any of the following is met:

  1. max_iter iterations have been reached
  2. the objective value falls below threshold
  3. patience seconds elapse without improvement

Reproducibility

For stochastic optimisers (CMA-ES), set a seed:

optimiser = diffid.CMAES().with_seed(42)
result1 = optimiser.run(problem, [0.0, 0.0])
result2 = optimiser.run(problem, [0.0, 0.0])
# result1 == result2 (same random sequence)

Warm Starts

Run multiple optimisations with different starting points:

initial_guesses = [
    [1.0, 1.0],
    [-1.0, -1.0],
    [0.0, 2.0],
]

results = [optimiser.run(problem, guess) for guess in initial_guesses]
best_result = min(results, key=lambda r: r.value)

Choosing an Optimiser

graph TD
    A[Start] --> B{Gradients available?}
    B -->|Yes| C[Adam]
    B -->|No| D{Problem size?}
    D -->|< 10 params| E[Nelder-Mead]
    D -->|> 10 params| F[CMA-ES]
    D -->|Need global search| F
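
The decision graph can also be written down as a small helper; choose_optimiser is a hypothetical convenience function, not part of the library:

import diffid

def choose_optimiser(n_params, gradients_available, need_global_search=False):
    # Hypothetical helper mirroring the decision graph above
    if gradients_available:
        return diffid.Adam()
    if need_global_search or n_params > 10:
        return diffid.CMAES()
    return diffid.NelderMead()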

For detailed guidance, see:

  • Choosing an Optimiser Guide
  • Tuning Optimisers Guide

See Also