An objective \(f(x)\) to be
minimised over a box defines an unnormalised density, the Gibbs measure
\(\exp(-f(x) / T)\), whose mass
concentrates on the low regions of \(f\) as the temperature \(T\) falls. That density can be evaluated
point-wise but not directly sampled, which is exactly regime (iii) of
Hoek and Elliott (2024) – “the wedge”. from_objective()
fits a Gaussian-mixture proxy to \(\exp(-f /
T)\) by cooling a short temperature ladder through the same
importance-sampled KLD-EM that fit_kld_em() runs,
warm-starting each step from the previous fit.
The point of the construction is not a single best point but the
whole picture. A multimodal \(f\) gives
a multimodal Gibbs measure, so the fitted mixture spreads its components
across the basins and recovers the optima together. The result
is an ordinary gmm, so the closed-form operator calculus –
marginalisation, conditioning, divergence – applies to the map of
solutions. gmm_modes() reads the distinct optima off that
map.
The objective \((\theta^2 - 4)^2\) has two minima, at \(\theta = -2\) and \(\theta = 2\). We fit the map and resolve it.
f <- function(v) (v[1]^2 - 4)^2
fit <- from_objective(f, lower = -5, upper = 5, N = 6L,
is_size = 3000L, n_steps = 5L, seed = 1L)
modes <- gmm_modes(fit)
sort(round(modes$modes[, 1], 3))
#> [1] -2.031 1.959Both minima are recovered. The fitted object is a
gmm_fit, so the importance-sampling diagnostics are
available alongside the map.
ess_summary(fit)
#> $is_size
#> [1] 3000
#>
#> $ess
#> [1] 1633.422
#>
#> $ess_relative
#> [1] 0.544474
#>
#> $max_weight
#> [1] 0.0008486841
#>
#> $support_fraction
#> [1] 1
#>
#> $mc_se_kld
#> [1] 0.001791462
#>
#> $validation_size
#> [1] 0
#>
#> $validation_ess
#> [1] NA
#>
#> $validation_ess_relative
#> [1] NA
#>
#> $validation_max_weight
#> [1] NA
#>
#> $validation_support_fraction
#> [1] NA
#>
#> $validation_kld
#> [1] NAHimmelblau’s function has four equal minima. A single-point optimiser
would return one of them and miss the rest.
from_objective() returns all four in one fit, provided the
number of components N is comfortably larger than the
number of optima so that a component can settle on each basin.
himmelblau <- function(v) {
x <- v[1]
y <- v[2]
(x * x + y - 11)^2 + (x + y * y - 7)^2
}
fit2 <- from_objective(himmelblau, lower = c(-5, -5), upper = c(5, 5),
N = 10L, is_size = 4000L, n_steps = 6L, seed = 1L)
found <- gmm_modes(fit2)
round(found$modes[order(found$modes[, 1]), ], 3)
#> [,1] [,2]
#> [1,] -3.731 -3.225
#> [2,] -2.784 3.140
#> [3,] 3.031 1.883
#> [4,] 3.561 -1.792The four recovered minima sit at \((3, 2)\), \((-2.81, 3.13)\), \((-3.78, -3.28)\) and \((3.58, -1.85)\) – the known optima. The figure below overlays the recovered modes (orange) and the analytic truth (red crosses) on the log-objective surface.
truth <- rbind(c(3, 2), c(-2.805118, 3.131312),
c(-3.779310, -3.283186), c(3.584428, -1.848126))
xs <- seq(-5, 5, length.out = 140)
ys <- seq(-5, 5, length.out = 140)
zz <- outer(xs, ys, Vectorize(function(a, b) himmelblau(c(a, b))))
image(xs, ys, log1p(zz), col = hcl.colors(40, "YlGnBu", rev = TRUE),
xlab = expression(x[1]), ylab = expression(x[2]),
main = "Himmelblau: the map over the four minima")
contour(xs, ys, log1p(zz), add = TRUE, col = "grey60",
nlevels = 8, drawlabels = FALSE)
points(truth[, 1], truth[, 2], pch = 4, col = "red", cex = 2, lwd = 3)
points(found$modes[, 1], found$modes[, 2], pch = 19,
col = "#D95F0E", cex = 1.4)
legend("topright", bg = "white",
legend = c("analytic optima", "recovered modes"),
pch = c(4, 19), col = c("red", "#D95F0E"),
pt.cex = c(1.4, 1.2), pt.lwd = c(3, 1))N is larger than the number of optima. Symmetric
landscapes, where several optima are exchangeable, need the most
headroom. When in doubt raise N.lower
and upper set where the optima are sought and scale the
temperature ladder, the uniform exploration of the proposal, and the
initialisation. The objective may be evaluated outside the box during
fitting, where non-finite values are treated as a finite penalty.minimise = FALSE to
concentrate the map on the maxima of f instead of the
minima.gmm, the rest of the package applies. The optima can be
conditioned, marginalised, or compared with
gmm_divergence() like any other fitted mixture.Hoek, J. van der and Elliott, R. J. (2024). Mixtures of multivariate Gaussians. Stochastic Analysis and Applications. doi:10.1080/07362994.2024.2372605.
Carreira-Perpinan, M. A. (2000). Mode-finding for mixtures of Gaussian distributions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11), 1318-1323.