--- title: "Mapping the optima of an objective" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Mapping the optima of an objective} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 5, out.width = "100%" ) set.seed(20260619) ``` ```{r setup} library(proxymix) ``` # An objective is a density you can evaluate but not sample An objective $f(x)$ to be minimised over a box defines an unnormalised density, the Gibbs measure $\exp(-f(x) / T)$, whose mass concentrates on the low regions of $f$ as the temperature $T$ falls. That density can be evaluated point-wise but not directly sampled, which is exactly regime (iii) of Hoek and Elliott (2024) -- "the wedge". `from_objective()` fits a Gaussian-mixture proxy to $\exp(-f / T)$ by cooling a short temperature ladder through the same importance-sampled KLD-EM that `fit_kld_em()` runs, warm-starting each step from the previous fit. The point of the construction is not a single best point but the whole picture. A multimodal $f$ gives a multimodal Gibbs measure, so the fitted mixture spreads its components across the basins and recovers the optima *together*. The result is an ordinary `gmm`, so the closed-form operator calculus -- marginalisation, conditioning, divergence -- applies to the map of solutions. `gmm_modes()` reads the distinct optima off that map. # A bimodal objective in one dimension The objective $(\theta^2 - 4)^2$ has two minima, at $\theta = -2$ and $\theta = 2$. We fit the map and resolve it. ```{r bimodal} f <- function(v) (v[1]^2 - 4)^2 fit <- from_objective(f, lower = -5, upper = 5, N = 6L, is_size = 3000L, n_steps = 5L, seed = 1L) modes <- gmm_modes(fit) sort(round(modes$modes[, 1], 3)) ``` Both minima are recovered. The fitted object is a `gmm_fit`, so the importance-sampling diagnostics are available alongside the map. ```{r bimodal-ess} ess_summary(fit) ``` # Four minima at once: Himmelblau Himmelblau's function has four equal minima. A single-point optimiser would return one of them and miss the rest. `from_objective()` returns all four in one fit, provided the number of components `N` is comfortably larger than the number of optima so that a component can settle on each basin. ```{r himmelblau} himmelblau <- function(v) { x <- v[1] y <- v[2] (x * x + y - 11)^2 + (x + y * y - 7)^2 } fit2 <- from_objective(himmelblau, lower = c(-5, -5), upper = c(5, 5), N = 10L, is_size = 4000L, n_steps = 6L, seed = 1L) found <- gmm_modes(fit2) round(found$modes[order(found$modes[, 1]), ], 3) ``` The four recovered minima sit at $(3, 2)$, $(-2.81, 3.13)$, $(-3.78, -3.28)$ and $(3.58, -1.85)$ -- the known optima. The figure below overlays the recovered modes (orange) and the analytic truth (red crosses) on the log-objective surface. ```{r himmelblau-map, fig.alt = "Log-objective surface of Himmelblau's function with the four recovered modes and the analytic optima overlaid."} truth <- rbind(c(3, 2), c(-2.805118, 3.131312), c(-3.779310, -3.283186), c(3.584428, -1.848126)) xs <- seq(-5, 5, length.out = 140) ys <- seq(-5, 5, length.out = 140) zz <- outer(xs, ys, Vectorize(function(a, b) himmelblau(c(a, b)))) image(xs, ys, log1p(zz), col = hcl.colors(40, "YlGnBu", rev = TRUE), xlab = expression(x[1]), ylab = expression(x[2]), main = "Himmelblau: the map over the four minima") contour(xs, ys, log1p(zz), add = TRUE, col = "grey60", nlevels = 8, drawlabels = FALSE) points(truth[, 1], truth[, 2], pch = 4, col = "red", cex = 2, lwd = 3) points(found$modes[, 1], found$modes[, 2], pch = 19, col = "#D95F0E", cex = 1.4) legend("topright", bg = "white", legend = c("analytic optima", "recovered modes"), pch = c(4, 19), col = c("red", "#D95F0E"), pt.cex = c(1.4, 1.2), pt.lwd = c(3, 1)) ``` # Notes on use * **Component headroom.** Recovery is most reliable when `N` is larger than the number of optima. Symmetric landscapes, where several optima are exchangeable, need the most headroom. When in doubt raise `N`. * **The box is the search region.** `lower` and `upper` set where the optima are sought and scale the temperature ladder, the uniform exploration of the proposal, and the initialisation. The objective may be evaluated outside the box during fitting, where non-finite values are treated as a finite penalty. * **Maximisation.** Set `minimise = FALSE` to concentrate the map on the maxima of `f` instead of the minima. * **The map is a mixture.** Because the return value is a `gmm`, the rest of the package applies. The optima can be conditioned, marginalised, or compared with `gmm_divergence()` like any other fitted mixture. # References Hoek, J. van der and Elliott, R. J. (2024). *Mixtures of multivariate Gaussians.* Stochastic Analysis and Applications. [doi:10.1080/07362994.2024.2372605](https://doi.org/10.1080/07362994.2024.2372605). Carreira-Perpinan, M. A. (2000). Mode-finding for mixtures of Gaussian distributions. *IEEE Transactions on Pattern Analysis and Machine Intelligence*, 22(11), 1318-1323.