---
title: "Quickstart"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Quickstart}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 6,
  fig.height = 4,
  out.width = "100%"
)
set.seed(20260513)
```
```{r setup}
library(proxymix)
```
`proxymix` fits Gaussian-mixture *proxies* to user-supplied target densities. The single entry point is

```r
fit_proxymix(target, N, regime = c("auto", "moment", "sample", "kld"))
```

The three fitting regimes, numbered (i) to (iii) below, come from van der Hoek and Elliott (2024):

| Regime           | When it applies                          | Method                        |
|------------------|------------------------------------------|-------------------------------|
| (i) `"moment"`   | `N == 1` and the target carries samples  | Closed-form moment matching   |
| (ii) `"sample"`  | `N >= 2` and the target carries samples  | Classical EM                  |
| (iii) `"kld"`    | The target carries `log_density` only    | **Importance-sampled KLD-EM** |

The `"kld"` regime is the package's distinguishing feature (its "wedge"): to our knowledge, no other CRAN package fits Gaussian-mixture proxies when you can evaluate `f(x)` but cannot (cheaply) sample from it.
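To make the `"moment"` regime concrete: for a single Gaussian component, moment matching sets the proxy's mean and covariance to those of the samples. The base-R sketch below uses a hypothetical sample matrix `samples` and the maximum-likelihood covariance (divisor `n`). Whether `proxymix` divides by `n` or `n - 1` is not stated in this vignette, so treat that choice as an assumption.

```r
set.seed(42)

# Hypothetical samples from a skewed, non-Gaussian 2-D target.
samples <- cbind(x1 = rexp(2000, rate = 1),
                 x2 = rnorm(2000, mean = 2, sd = 0.5))
n <- nrow(samples)

# Closed-form moment matching for a single Gaussian component:
# the proxy mean is the sample mean, the proxy covariance the sample covariance.
mu_hat <- colMeans(samples)
Sigma_hat <- cov(samples) * (n - 1) / n  # rescale from divisor n - 1 to n (assumption)

mu_hat
Sigma_hat
```

No iteration is involved, which is why this case is handled separately from the EM-based regimes.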
## A target you cannot sample from
We use the bundled "banana" target, a non-Gaussian 2-D density obtained by warping an isotropic Gaussian through a quadratic map. Its log-density is exact and normalised. We deliberately do *not* attach samples, so only the `"kld"` regime, regime (iii), applies.
```{r banana}
tgt <- banana_target()
tgt
```
A quick look at the target as a contour grid:
```{r banana-contour, fig.cap = "The banana target density."}
grid_x <- seq(-3, 3, length.out = 120)
grid_y <- seq(-3, 3, length.out = 120)
G <- expand.grid(x1 = grid_x, x2 = grid_y)
G$f <- exp(tgt@log_density(as.matrix(G)))
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  ggplot(G, aes(x1, x2, z = f)) +
    geom_contour_filled(bins = 12L) +
    scale_fill_viridis_d(option = "mako", guide = "none") +
    coord_equal() +
    labs(title = "Banana target", x = expression(x[1]), y = expression(x[2])) +
    theme_minimal(base_size = 12)
}
```
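For intuition about why a quadratically warped Gaussian keeps an exact, normalised log-density, here is a minimal base-R sketch. The function `log_banana()` and its curvature parameter `b` are hypothetical, and the parameterisation used by `banana_target()` may differ. The point is that the warp has unit Jacobian, so no correction term is needed.

```r
# Hypothetical banana log-density; `b` controls the curvature (an assumption).
log_banana <- function(x, b = 0.5) {
  x <- matrix(x, ncol = 2)
  y1 <- x[, 1]
  y2 <- x[, 2] - b * (x[, 1]^2 - 1)  # undo the quadratic warp
  # The map (y1, y2) -> (x1, x2) has unit Jacobian, so no correction term.
  dnorm(y1, log = TRUE) + dnorm(y2, log = TRUE)
}

# Riemann-sum check that the density integrates to about 1.
gx <- seq(-6, 6, length.out = 401)
gy <- seq(-8, 20, length.out = 801)
cell <- diff(gx)[1] * diff(gy)[1]
total <- sum(exp(log_banana(as.matrix(expand.grid(gx, gy))))) * cell
total
```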
## Fit a Gaussian-mixture proxy
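We fit a 3-component proxy by KLD-EM, using importance sampling against a multivariate Student-t proposal with 5 degrees of freedom. With the settings below (`is_size = 2000L`, at most `max_iter = 30L` iterations) the fit typically runs in a fraction of a second.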
```{r fit}
proposal <- is_mvt(n_dim = 2L, mean = c(0, 0),
                   sigma = 4 * diag(2), df = 5)
fit <- fit_proxymix(tgt, N = 3L, regime = "kld",
                    proposal = proposal,
                    is_size = 2000L,
                    max_iter = 30L,
                    seed = 1L)
fit
```
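The idea behind importance-sampled EM can be sketched in one dimension with base R alone. This is a generic illustration, not the package's implementation: the target `log_f`, the proposal scale `s`, the starting values and the helper `lse2()` are all hypothetical. Draws come from a Student-t proposal, each draw receives a self-normalised weight proportional to `f(x) / q(x)`, and a standard EM update is run with those weights.

```r
set.seed(1)

# Two-term log-sum-exp, used to keep densities stable in log space.
lse2 <- function(a, b) { m <- pmax(a, b); m + log(exp(a - m) + exp(b - m)) }

# Hypothetical 1-D target that we pretend can only be evaluated, not sampled.
log_f <- function(x) {
  lse2(log(0.4) + dnorm(x, -2, 0.7, log = TRUE),
       log(0.6) + dnorm(x, 1.5, 1.0, log = TRUE))
}

# Heavy-tailed proposal: Student-t with 5 degrees of freedom, scaled by s.
s <- 3
x <- s * rt(5000, df = 5)
log_q <- dt(x / s, df = 5, log = TRUE) - log(s)

# Self-normalised importance weights, w_i proportional to f(x_i) / q(x_i).
lw <- log_f(x) - log_q
w <- exp(lw - max(lw))
w <- w / sum(w)

# Importance-weighted EM for a 2-component mixture (illustrative start values).
pi_k <- c(0.5, 0.5); mu <- c(-1, 1); sd_k <- c(1, 1)
for (iter in 1:50) {
  # E-step: responsibilities, computed in log space.
  ld <- sapply(1:2, function(k) log(pi_k[k]) + dnorm(x, mu[k], sd_k[k], log = TRUE))
  resp <- exp(ld - lse2(ld[, 1], ld[, 2]))
  # M-step: weighted updates, each draw counted with weight w_i * resp_ik.
  wk <- w * resp
  nk <- colSums(wk)
  pi_k <- nk / sum(nk)
  mu <- colSums(wk * x) / nk
  sd_k <- sqrt(colSums(wk * outer(x, mu, "-")^2) / nk)
}

# Estimate of KL(f || g) on the same weighted draws, using the final parameters.
log_g <- lse2(log(pi_k[1]) + dnorm(x, mu[1], sd_k[1], log = TRUE),
              log(pi_k[2]) + dnorm(x, mu[2], sd_k[2], log = TRUE))
kld_hat <- sum(w * (log_f(x) - log_g))
round(rbind(weight = pi_k, mean = mu, sd = sd_k), 2)
```

Because the weighted log-likelihood differs from the estimated KL divergence only by a term that does not depend on the mixture, maximising one minimises the other. That is the sense in which this weighted EM targets the KLD.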
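Diagnostics: the KLD trace, with one value per iteration, followed by the effective-sample-size (ESS) trace of the importance weights: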
```{r diagnostics}
data.frame(
  iter = seq_along(kld_trace(fit)),
  kld = kld_trace(fit)
)
ess_trace(fit)
```
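As background for reading an ESS value, the sketch below computes the usual Kish effective sample size, `sum(w)^2 / sum(w^2)`, from importance log-weights. It is a generic base-R illustration with a hypothetical standard-normal target and hypothetical proposal scales. Whether `ess_trace()` uses exactly this formula is an assumption.

```r
set.seed(2)

# Kish effective sample size from unnormalised log-weights.
ess <- function(log_w) {
  w <- exp(log_w - max(log_w))  # subtract the max for numerical stability
  sum(w)^2 / sum(w^2)
}

# Hypothetical target: standard normal, evaluated but not sampled.
log_f <- function(x) dnorm(x, log = TRUE)

# Log-weights for n draws from a Student-t proposal (df = 5) with scale s.
n <- 2000
draw_log_w <- function(s) {
  x <- s * rt(n, df = 5)
  log_f(x) - (dt(x / s, df = 5, log = TRUE) - log(s))
}

ess_good <- ess(draw_log_w(1.5))  # proposal close to the target
ess_wide <- ess(draw_log_w(20))   # proposal far too wide
c(good = ess_good, wide = ess_wide)
```

An ESS close to the number of draws indicates that the proposal matches the target well. A small ESS means a few draws dominate the weights and the estimates are noisy.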
## Overlay the proxy on the target
```{r overlay, fig.cap = "Banana target (filled) with the 3-component proxy overlaid (dashed contours)."}
G$g <- exp(dgmm(as.matrix(G[, c("x1", "x2")]), fit, log = TRUE))
if (requireNamespace("ggplot2", quietly = TRUE)) {
  ggplot(G, aes(x1, x2)) +
    geom_contour_filled(aes(z = f), bins = 10L, alpha = 0.8) +
    geom_contour(aes(z = g), colour = "white", linetype = "dashed",
                 linewidth = 0.4, bins = 8L) +
    scale_fill_viridis_d(option = "mako", guide = "none") +
    coord_equal() +
    labs(title = "Target (filled) and proxy (dashed)",
         x = expression(x[1]), y = expression(x[2])) +
    theme_minimal(base_size = 12)
}
```
## Closed-form mixture operations
Because the fit is a Gaussian mixture, its marginals and conditionals are available in closed form. They are exact for the proxy, though not for the target it approximates:
```{r ops}
gmm_marginalise(fit, keep = 1L)
gmm_conditionalise(fit, given = c(NA, 0.5))
```
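The formulas behind these operations are the standard Gaussian ones applied per component. The base-R sketch below uses a hypothetical 2-component mixture with illustrative parameters, not the fitted object. It reads `given = c(NA, 0.5)` as "condition on the second coordinate being 0.5", which is an interpretation of the call above and not something documented here.

```r
# Hypothetical 2-component mixture in 2-D (illustrative values only).
w  <- c(0.3, 0.7)
mu <- list(c(-1, 0), c(1, 1))
S  <- list(matrix(c(1, 0.5, 0.5, 1), 2), matrix(c(0.5, -0.2, -0.2, 2), 2))

# Marginal of x1: keep the weights, drop the x2 entries of each component.
marg <- list(w = w,
             mean = sapply(mu, `[`, 1),
             var = sapply(S, function(s) s[1, 1]))

# Conditional of x1 given x2 = a: each component is conditioned as a Gaussian,
# and its weight is rescaled by how well it explains x2 = a.
a <- 0.5
cw <- w * mapply(function(m, s) dnorm(a, m[2], sqrt(s[2, 2])), mu, S)
cond <- list(
  w = cw / sum(cw),
  mean = mapply(function(m, s) m[1] + s[1, 2] / s[2, 2] * (a - m[2]), mu, S),
  var = sapply(S, function(s) s[1, 1] - s[1, 2]^2 / s[2, 2])
)
cond
```

Both results are again Gaussian mixtures with the same number of components, so the operations can be chained.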
You can sample from the proxy directly:
```{r sample}
x <- rgmm(500L, fit)
head(x)
```
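Sampling from a Gaussian mixture is a two-step procedure: draw a component label with probability equal to its weight, then draw from that component's Gaussian. The sketch below shows this in base R with a hypothetical helper `r_mix()` and illustrative parameters. It is not the implementation of `rgmm()`.

```r
set.seed(3)

# Hypothetical 2-component mixture in 2-D (illustrative values only).
w  <- c(0.3, 0.7)
mu <- list(c(-1, 0), c(1, 1))
S  <- list(matrix(c(1, 0.5, 0.5, 1), 2), matrix(c(0.5, -0.2, -0.2, 2), 2))

r_mix <- function(n, w, mu, S) {
  d <- length(mu[[1]])
  k <- sample(seq_along(w), n, replace = TRUE, prob = w)  # component labels
  x <- matrix(NA_real_, n, d)
  for (j in seq_along(w)) {
    idx <- which(k == j)
    if (length(idx) == 0L) next
    z <- matrix(rnorm(length(idx) * d), ncol = d)
    # chol() returns upper-triangular R with t(R) %*% R == S[[j]],
    # so z %*% R has covariance S[[j]].
    x[idx, ] <- sweep(z %*% chol(S[[j]]), 2, mu[[j]], "+")
  }
  x
}

x_mix <- r_mix(5000, w, mu, S)
colMeans(x_mix)
```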
## Where to next
* `vignette("three_regimes")`: what each of regimes (i), (ii) and (iii) does, with a head-to-head comparison on a target whose ground truth is known.
* `vignette("density_shapes")`: the wedge demonstration, with KLD-EM fitted to three non-Gaussian targets (banana, donut, three-mixture) and KLD, ESS and Hellinger distance reported.
* `vignette("roadmap")`: what the Tier-2 stubs are for and when they will graduate.
## Reference
Hoek, J. van der and Elliott, R. J. (2024). *Mixtures of multivariate Gaussians.* Stochastic Analysis and Applications.