Adds two opt-in countermeasures to the finite-ensemble pathologies that
make an iterative ensemble smoother over-confident: covariance inflation
(against under-dispersion / ensemble collapse) and covariance
localisation (against spurious finite-sample parameter-observation
correlations). Both default to off; a NULL specification leaves
pesto_ies_callback() and pesto_ies_filter() byte-identical to the
previous release.
pesto_inflation() -- inflation specification with four methods:
"rtps" (relaxation to prior spread, Whitaker & Hamill 2012; the
per-parameter, spectrally-aware workhorse), "adaptive" (global
inflation targeting a spread-retention floor), "multiplicative"
(fixed factor), and "none".

pesto_localisation() -- localisation specification: "correlation"
(automatic, coordinate-free, Luo & Bhakta 2020 -- the recommended
default for parameter problems with no spatial metric) or "distance"
(classical Gaspari-Cohn taper of a parameter-to-observation distance
matrix).

ensemble_spread_ess() -- the collapse diagnostic: the spectral
participation ratio of the parameter anomaly covariance, i.e. the
effective number of variance-carrying directions. Recorded on every
iteration regardless of method.

correlation_localisation(), gaspari_cohn(),
ensemble_solution_localised() -- the C++ kernels backing the above.
ensemble_solution_localised() is the explicit-gain GLM update that
hosts the Schur-product localisation the SVD kernel cannot; with no
taper it reproduces ensemble_solution() (approximate form) to
truncation tolerance.

pesto_ies_callback() and pesto_ies_filter() gain inflation and
localisation arguments (both NULL by default) and now record the
spread-ESS and (when active) inflation / localisation diagnostics in
their per-step metadata, which flow into the ensemble manifest.

Non-NULL localisation with use_approx = FALSE warns and drops the
null-space correction.

Note on terminology: the spectral spread-ESS is scale-invariant, so it is
used as the collapse diagnostic, while the "adaptive" inflation
targets a variance-magnitude retention floor; "rtps" is the method
that reshapes the spectrum. See the Countering Ensemble Collapse:
Inflation and Localisation vignette.
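
To make the terminology note concrete, here is a minimal base-R sketch, not the package's C++ implementation. The function names spread_ess_sketch() and rtps_sketch() are hypothetical, the ensemble layout (rows are realisations, columns are parameters) is an assumption, and the RTPS rescaling follows the usual Whitaker & Hamill (2012) form with a relaxation weight alpha. It illustrates that a global multiplicative factor leaves the participation ratio unchanged, while a per-parameter relaxation reshapes the spectrum.

```r
# Spectral participation ratio: (sum of eigenvalues)^2 / sum(eigenvalues^2)
# of the parameter anomaly covariance. Rows = realisations (assumed layout).
spread_ess_sketch <- function(ens) {
  anomalies <- sweep(ens, 2, colMeans(ens), "-")
  # Squared singular values are proportional to the covariance eigenvalues;
  # the proportionality constant cancels in the ratio.
  lambda <- svd(anomalies, nu = 0, nv = 0)$d^2
  sum(lambda)^2 / sum(lambda^2)
}

# Relaxation to prior spread: rescale each parameter's posterior anomalies
# so its standard deviation moves a fraction alpha back toward the prior's.
# Assumes no posterior standard deviation is exactly zero.
rtps_sketch <- function(prior, posterior, alpha = 0.5) {
  sd_prior <- apply(prior, 2, sd)
  sd_post <- apply(posterior, 2, sd)
  infl <- 1 + alpha * (sd_prior - sd_post) / sd_post
  post_mean <- colMeans(posterior)
  anomalies <- sweep(posterior, 2, post_mean, "-")
  sweep(sweep(anomalies, 2, infl, "*"), 2, post_mean, "+")
}

# Toy ensemble: the second parameter has collapsed to a tenth of its
# prior spread. Column means are zero, so scaling the matrix scales the
# anomalies directly.
prior <- cbind(c(1, -1, 1, -1), c(1, 1, -1, -1))
posterior <- cbind(prior[, 1], 0.1 * prior[, 2])

ess_post <- spread_ess_sketch(posterior)
ess_mult <- spread_ess_sketch(1.5 * posterior)  # global multiplicative factor
ess_rtps <- spread_ess_sketch(rtps_sketch(prior, posterior, alpha = 1))
```

In this toy case ess_post and ess_mult agree, whereas ess_rtps is larger because the collapsed direction regains variance. This is why a scale-invariant diagnostic cannot, on its own, tell a magnitude-targeting inflation whether it has done enough.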
Promotes the two-adapter forward-model contract from an implicit
convention to a typed, enforceable object, and makes the multi-fidelity
(cheap, expensive) bridge first-class (APSIM-bridge invariants 1 and
3). No breaking changes to existing calls: pesto_ies_callback() still
accepts a bare function(theta) -> obs.
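
As a rough illustration of what honouring a bare function(theta) -> obs involves, the base-R sketch below evaluates such a function over an ensemble. The helper name evaluate_ensemble_sketch(), the rows-are-realisations layout, and the choice to turn failures into NA rows are assumptions made for illustration, not PESTO's internal engine. The "n_failures" and "fail_idx" attributes mirror those described for pesto_evaluate() in this release.

```r
# Hypothetical helper: run a bare forward function once per realisation
# and always return an nreal x nobs matrix, with failed rows left as NA.
evaluate_ensemble_sketch <- function(forward, params, nobs) {
  nreal <- nrow(params)
  out <- matrix(NA_real_, nrow = nreal, ncol = nobs)
  failed <- logical(nreal)
  for (i in seq_len(nreal)) {
    # Any error, or an output of the wrong shape, counts as a failure.
    res <- tryCatch(forward(params[i, ]), error = function(e) NULL)
    if (is.numeric(res) && length(res) == nobs) {
      out[i, ] <- res
    } else {
      failed[i] <- TRUE
    }
  }
  attr(out, "n_failures") <- sum(failed)
  attr(out, "fail_idx") <- which(failed)
  out
}

# Toy forward model that crashes for negative values of parameter "a".
forward <- function(theta) {
  if (theta[["a"]] < 0) stop("model crashed")
  c(theta[["a"]] + theta[["b"]], theta[["a"]] * theta[["b"]])
}

params <- cbind(a = c(1, -1, 2), b = c(3, 4, 5))
sim <- evaluate_ensemble_sketch(forward, params, nobs = 2)
```

The second realisation fails, so its row stays NA and is reported through the attributes instead of aborting the whole evaluation.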
pesto_forward_model() — an S7 class wrapping a forward callable with
its output dimensionality, expected parameter names, failure policy
(on_failure, max_fail_frac), evaluation strategy (serial /
"multicore" / custom map_fn), and a fidelity tag. This is the
single contract both the native-callback and .pst-file adapter modes
honour.

pesto_evaluate() — generic that runs a forward model (or a
multi-fidelity model at a chosen level) and returns a
shape-guaranteed nreal x nobs matrix with "n_failures" /
"fail_idx" attributes.

as_forward_model() — coerces a bare function (or passes through an
existing object) into the contract; used internally so bare functions
keep working unchanged.

pesto_multifidelity_model() — an ordered stack of fidelity levels
(cheapest first) plus relative costs; the first-class form of the
bridge's fidelity vector.

mf_control_variate() — the affine (Kennedy-O'Hagan AR(1))
control-variate primitive that debiases a cheap level against a sparse
expensive sample; the plug-in point for surrogate cascades.

pesto_ies_filter() — a filtering counterpart to the batch smoother
pesto_ies_callback(). It assimilates time-ordered observation
windows one after another against a static parameter ensemble, the
posterior of each window becoming the prior of the next, so a tightening
parameter posterior is available after every window (the in-season
assimilation case). It reuses the forward-model contract (parallel- and
multi-fidelity-ready via a per-window fidelity_schedule) and the C++
ensemble_solution() kernel; window_noptmax > 1 gives an iterated
filter per window. The result records a per-window history including the
per-parameter ensemble standard deviation (the tightening trace).

Filter results (pesto_ies_filter_result) flow into the manifest
contract: as_manifest() tags them method = "ies_filter" (added to the
pesto_ensemble_manifest validator) and carries their fidelity
provenance, so a filtered ensemble is a first-class scenario for the
downstream kernR consumers.

pesto_ies_callback() gains fidelity_schedule (consulted only for a
pesto_multifidelity_model): the fidelity level evaluated at each
iteration, supporting cheap-early / expensive-late ramping. The final
ensemble refresh always uses the highest fidelity.

A multi-fidelity pesto_ies_callback() run records its realised schedule
in the result ($fidelity = list(type, schedule, final_level, n_levels, costs)), as_manifest() inherits it into the pesto_ensemble_manifest
fidelity slot unless overridden, and write_manifest() /
read_manifest() round-trip the structured record faithfully (it is
outside the integrity hash, so it does not affect verify_manifest()).
Single-fidelity runs record NULL, so their manifests are unchanged.
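
The affine control-variate idea behind the multi-fidelity items above can be sketched in a few lines of base R. This is a toy under stated assumptions, not the mf_control_variate() implementation: the function name control_variate_sketch(), the scalar-output setting, and the ordinary least-squares fit of the slope and offset are all illustrative choices.

```r
# Fit expensive ~ delta + rho * cheap on a sparse paired sample, then
# apply the affine correction to every cheap output.
control_variate_sketch <- function(cheap_all, cheap_paired, expensive_paired) {
  fit <- lm(expensive_paired ~ cheap_paired)
  delta <- unname(coef(fit)[1])  # additive discrepancy
  rho <- unname(coef(fit)[2])    # scale between fidelity levels
  list(rho = rho, delta = delta, corrected = delta + rho * cheap_all)
}

# Toy models: the expensive level is exactly affine in the cheap one,
# so three paired runs are enough to recover the relationship.
cheap_model <- function(x) x^2
expensive_model <- function(x) 2 * x^2 + 1

x_all <- seq(0, 2, by = 0.25)
idx <- c(1, 5, 9)  # the sparse subset that gets expensive runs
cheap_all <- cheap_model(x_all)
cv <- control_variate_sketch(
  cheap_all,
  cheap_paired = cheap_all[idx],
  expensive_paired = expensive_model(x_all[idx])
)
```

In a realistic setting the relationship is only approximately affine, so the corrected cheap outputs carry residual bias that the sparse expensive sample cannot remove.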
The manifest fidelity slot is now documented as a structured
provenance list (legacy named-numeric tags are still accepted on read).

A pesto_forward_model
with parallel = "multicore" dispatches realisations across forked
workers via parallel::mclapply() with L'Ecuyer streams (reproducible
under RNGkind("L'Ecuyer-CMRG")); serial bulk evaluation is unchanged
and remains the default.

apsim_callback() now writes each realisation to a unique per-run
file, making the closure safe to drive in parallel (wrap it in a
pesto_forward_model(parallel = "multicore")).

.eval_forward_safe
was retired in favour of the shared engine behind pesto_evaluate();
the on-error abort message changed from "`forward_model` failed" to
"forward model failed".

parallel (a base R package) added to Imports.

A code-aesthetics and review-readability patch on top of 0.4.0. No
runtime behaviour changes; no exported-API changes; no shipped-data
changes. The aim is to lift every source surface to the bar set out in
r_style.md ahead of any AAGI-AUS push.
R/internal_validation.R introduces the shared primitive validators
(.assert_positive_scalar(), .assert_nonneg_scalar(),
.assert_character_scalar(), .assert_logical_scalar(),
.assert_path_exists(), .assert_matrix(),
.assert_numeric_vector(), .assert_function(),
.assert_data_frame(), .assert_choice(), .assert_same_ncol(),
.assert_same_nrow(), .assert_required_cols()). All
@noRd @keywords internal; every helper signals failure via
stop(call. = FALSE, ...) with a backticked argument name.

apsim_callback.R, pesto_reference_ies.R,
pesto_run.R (pesto_ies_callback), pst_io.R, scenario.R,
surrogate.R, manifest.R, ensemble_io.R, plot.R, and
check_surrogate_regime.R now open with .check_* / .assert_*
calls instead of inline if (!is.x) stop(...) walls.

pesto_ies_callback, pesto_ies,
pesto_glm, pesto_sweep, pesto_sensitivity, read_pst,
write_pst, apsim_callback, pesto_reference_ies, and
.find_pestpp_exe now carry Sparks-style dash-banner section
comments that paragraph the work (validate inputs / resolve paths /
iterate / parse outputs / assemble result).

@importFrom ggplot2 annotations consolidated into
R/pesto-package.R; the per-function @importFrom annotation on
plot_phi() has been removed in favour of inline ggplot2::
qualification at the call sites.

vignettes/apsim-callback.Rmd and
vignettes/ensemble-manifest.Rmd converted to . Capital joins per
manuscript_style.md invariant 5.

Dependencies, Contributing, and Acknowledgements
sections per the AAGI repository-guidelines README contract.
Citation block bumped to R package version 0.4.1.

DESCRIPTION URL and BugReports, CITATION.cff,
codemeta.json, inst/CITATION, _pkgdown.yml, and the README
citation + issues links now point to
https://github.com/AAGI-AUS/PESTO (canon checklist item 5).
README install instructions and the personal r-universe URL are
retained at max578/PESTO and https://max578.r-universe.dev as
interim distribution infrastructure until the AAGI-AUS push lands.

This release contains no R, C++, or shipped-data changes. It is a
governance, metadata, and project-hygiene release that lands the AAGI
canon recipes on the max578/PESTO channel.
Repository references moved from AAGI-AUS/PESTO to
max578/PESTO. DESCRIPTION, CITATION.cff, codemeta.json,
inst/CITATION, _pkgdown.yml, README.md, CONTRIBUTING.md,
API_STABILITY.md, and the pkgdown GitHub Actions workflow header
now point to https://github.com/max578/PESTO and
https://max578.github.io/PESTO. The aagi git remote is retained
as a frozen read-only mirror; no push to AAGI-AUS without explicit
per-instance maintainer approval.

CLAUDE.md declares aagi_aus: out-of-scope so the
AAGI-AUS canon signal-detection deactivates for this package. The
file is excluded from R-package builds via .Rbuildignore.

man/PESTO-package.Rd regenerated to inherit the new URLs from
DESCRIPTION via devtools::document().

CITATION.cff (version: 0.1.0), codemeta.json (version: "0.1.0"),
inst/CITATION (R package version 0.3.3), and the README citation
block were not in lock-step with DESCRIPTION. All four are now on
0.4.0 with date-released: "2026-05-28" and
dateModified: "2026-05-28".

codemeta.json copyrightHolder corrected from
Organization "Supremum Consulting Ltd" to
Person "Max Moldovan" with ORCID 0000-0001-9680-8474 and
University of Adelaide affiliation, matching Authors@R and
LICENSE.md.

CODE_OF_CONDUCT.md (Contributor Covenant v2.1, pointer form).

SECURITY.md (vulnerability reporting policy; maintainer email,
five-working-day acknowledgement, scope statement).

air.toml (Air formatter configuration: 80-char line width,
two-space indent, auto line endings).

.lintr (lintr defaults aligned with r_style.md direction:
80-char line, snake_case / dotted.case / symbols object names,
two-space indent; src, tools, inst/extdata, vignettes
excluded).

.Rbuildignore already excluded all four paths; no tarball impact.

src/Makevars PKG_LIBS now follows $(BLAS_LIBS) with
$(FLIBS) per Writing R Extensions §1.2.1.5. Resolves the
pre-existing structural WARNING
("apparently using $(BLAS_LIBS) without following $(FLIBS) in
'src/Makevars'") that had survived several check passes as a
"documented local-env artifact". Makevars.win was already
correct; no change needed there.

The remaining diagnostic (R_ext/Boolean.h:62 -Wfixed-enum-extension pragma) is in R's
own header on this toolchain version and is not present on CRAN's
build farm; it persists harmlessly.

write_manifest(format = "csv") is renamed to
write_manifest(format = "csv_unverified") to flag the weaker
integrity contract at every call-site. The mode itself is unchanged
(CSV-only sidecars, hash recorded but not disk-verifiable).

integrity: verifiable | not_verifiable
derived from format. Verifiable: rds, both. Not verifiable:
csv_unverified. Lets non-R downstream tools (paper-skill graders,
Python pipelines) branch on the integrity contract without parsing
the PESTO-specific format vocabulary.

The format = "csv" spelling is still accepted at the API
boundary with a deprecation warning; the persisted form always uses
csv_unverified. read_manifest() normalises old YAMLs on the
read side, so 0.3.1 manifests round-trip cleanly under 0.3.2.

{rds, both, csv_unverified}. The old "csv" token is rejected at
slot-set time (only the renamed argument accepts it, with a warning).

integrity: YAML field for verifiable modes.

ensemble-manifest.Rmd reframed: explicit "Inspection
CSVs (verifiable, via format = 'both')" vs. "Unverified CSV export
(via format = 'csv_unverified')" sections; the latter is presented
as "for export, not for storage you intend to re-load and trust".

write_manifest() gains a format = c("rds", "both", "csv")
argument. "rds" (default) preserves the current bit-exact binary
behaviour. "both" writes RDS sidecars plus parallel CSV inspection
files (*_inspection.csv); the SHA-256 hash stays bound to the RDS.
"csv" writes CSV-only sidecars for inspection / interchange
workflows where bit-exact integrity is not required.

format on pesto_ensemble_manifest records the
on-disk serialisation mode (default "rds"; preserved through
read/write round-trips). Validator enforces the three-value vocabulary.

read_manifest() dispatches on file extension in the YAML's
artefacts: block — reads RDS via readRDS(), CSV via
utils::read.csv().

verify_manifest() gains a message field on its return list and
returns ok = NA (with explanation) for format = "csv" manifests
whose IEEE 754 doubles have round-tripped through a write formatter.
Existing format = "rds" callers see no behaviour change; the new
field is NULL in that case.

format: key and an optional
inspection_csv: block when format = "both". Backwards-compatible:
YAMLs written by PESTO 0.3.0 (no format: key) read back with
format = "rds" per the default. No schema-version bump required.

test-manifest.R cover all three formats plus the
unknown-format rejection path.

pesto_ensemble_manifest — versioned, hashed,
provenance-tracked container for ensemble-run output. Slots cover
params, outputs, weights, obs_target, seed, data_hash
(SHA-256), fidelity, apsim_version, pesto_version, timestamp,
plus method context (method, noptmax, lambda_schedule,
failure_rate). This is the contract object that downstream
consumers (kernR, proxymix, paper-skill) will read.

as_manifest() — S7 generic with a method for
pesto_ies_callback_result. Non-destructive: wraps without mutating
the source result.

write_manifest() / read_manifest() — YAML+RDS serialisation.
The YAML carries metadata + relative paths to three sidecar RDS files
(*_params.rds, *_outputs.rds, *_assim.rds); RDS is used in
preference to CSV so IEEE 754 doubles round-trip bit-exactly (the
SHA-256 integrity check would otherwise trip on CSV formatter
precision loss).

verify_manifest() — recomputes the SHA-256 over
(params, outputs, weights, obs_target, seed) and compares to the
stored value, returning a diagnostic list. Detects post-write
tampering with the sidecar RDS files.

ensemble-manifest — end-to-end demo of construct →
write → read → verify, plus tamper-detection.

pesto_ies_callback() — drives an Iterative Ensemble Smoother
entirely in R using a user-supplied forward-model callable, bypassing
the .pst-file write/read cycle of pesto_ies(). Each iteration calls
the existing C++ kernel ensemble_solution() (Chen & Oliver, 2013).
Tolerates per-realisation failures via on_failure = c("na", "stop")
and reports a failure_rate in the result object. Phase-1 behaviour
uses a single lambda per iteration (or user-supplied schedule); a
full pestpp-ies-style lambda line-search is a planned Phase-2
enhancement.

apsim_callback() — adapter that wraps the apsimx package (now
in Suggests) into a forward-model closure suitable for
pesto_ies_callback(). Per-realisation template copy, parameter edit
via apsimx::edit_apsimx() / edit_apsim(), run, and extraction.
Failures (edit / run / extractor) surface as NA rows for the IES
driver to handle.

apsim-callback — synthetic linear-Gaussian recovery
demo plus disabled apsimx example.

inst/benchmarks/d4_callback_vs_pst.R.

pesto_ies_callback() records obs_target, obs_sd, and weights
on its result list so the manifest emitter has full IES context to
capture.

The .pst path is
deferred to the §D1 scenario library landing. The current benchmark
script measures the callback path on a synthetic surrogate forward
model only.

apsimx is in Suggests not Imports; apsim_callback() checks
for it at call time with requireNamespace().

S7 (>= 0.2.0), yaml (>= 2.3.0),
digest (>= 0.6.0).

ensemble_solution() sign-convention bug. The C++ kernel
requires obs_resid = sim - obs; the docstring previously stated
the inverse. Two genuine in-package call sites were silently
inverting upgrades: src/surrogate_ies.cpp:347 and
vignettes/surrogate-ies.Rmd:148, 181. Both fixed; surrogate-IES
now applies upgrades in the correct direction. Regression test
tests/testthat/test-ensemble-solution-sign.R asserts strict
monotone phi descent under the correct convention AND geometric
divergence under the inverted one.

pesto_reference_ies() — pure-R, textbook implementation of the
Chen & Oliver (2013) eq. 12 IES update. Independent of the C++
kernel; used as the canonical comparison target by the comparison
vignette so it ships and runs without the upstream pestpp-ies
binary. Cross-validated against the C++ kernel at machine precision
(max element-wise delta = 5.8e-15).

check_surrogate_regime() — soft guardrail that warns when the
surrogate-IES regime is unfavourable (n_train < threshold * n_params).
Stand-alone helper, not auto-invoked by pesto_surrogate_ies()
(v0.3 wiring candidate).

plot_identifiability() gains a jacobian = NULL matrix-input
path. Backward-compatible: jco_file = NULL retained; the two are
mutually exclusive.

vignettes/pestpp-comparison-and-simulation.Rmd now compares PESTO
native IES against pesto_reference_ies() by default — no upstream
binary required. The pure-R reference cache ships at
inst/extdata/pestpp_cache/scenario_a_reference.rds (SHA-256-pinned
to the prior ensemble). When the developer-side cache
tools/pestpp_benchmark/scenario_a_pestpp_ies.rds is present and
PESTO_PESTPP_BIN resolves, the vignette extends the agreement
plot with the live binary's posterior.

/Users/a1222812/... path replaced with
Sys.getenv("PESTO_PESTPP_BIN") + Sys.which("pestpp-ies")
fallback.

tools/pestpp_benchmark/run_benchmark.R regenerates both
caches deterministically. Documented in CONTRIBUTING.md.

@examples block (30 of 30
documented exports/methods). The four external-binary runners
(pesto_ies, pesto_glm, pesto_sweep, pesto_sensitivity) and
pesto_surrogate_ies use guarded \donttest{} (no \dontrun{}).

surrogate-ies.Rmd) and an "Honest reading — surrogate savings in
this regime" defence paragraph (pestpp-comparison-and-simulation.Rmd
Section 3) covering the curse-of-dimensionality finding from
investigation I3.

?ensemble_solution now states the sim - obs
convention with a full GLM-derivation rationale.

cran-comments.md with per-NOTE justification.

CITATION.cff (CFF 1.2.0 + ORCID + preferred citation).

codemeta.json (CodeMeta 2.0).

CONTRIBUTING.md documenting the developer benchmark workflow.

inst/WORDLIST with ~120 domain terms; Language: en-AU
added to DESCRIPTION.

src/Makevars: PKG_LIBS gains $(FLIBS) (CRAN portability
requirement for $(BLAS_LIBS)).

LICENSE renamed to LICENSE.md and .Rbuildignore-d (CRAN
convention for GPL-licensed packages).

check_surrogate_regime() helper exported.

aut, cre, and cph.
Supremum Consulting Ltd. removed from Authors@R (administrative
consolidation by sole director; no licence change).

LICENSE file rewritten with corrected canonical wording and copyright
attribution.

src/ updated to reflect sole authorship.

ensemble_solution() — High-performance C++ implementation of the IES
ensemble update equation (Chen & Oliver, 2013) via RcppEigen.

ensemble_solution_mda() — Multiple Data Assimilation (Evensen, 2018)
update kernel.

compute_phi() — Fast weighted sum-of-squares objective function.

adaptive_svd() — Automatic SVD backend selection (LAPACK, Eigen BDCSVD,
or randomised SVD) based on matrix size and target rank.

rsvd() — Randomised SVD (Halko-Martinsson-Tropp, 2011) for asymptotically
faster rank-k approximations.

accelerate_svd() — Direct LAPACK SVD leveraging platform-optimised BLAS
(Apple Accelerate/AMX on macOS, MKL or OpenBLAS on Linux).

ensemble_solution_gpu() — GPU-ready ensemble solution with adaptive SVD
backend and performance diagnostics.

train_gp_surrogate() — Gaussian Process surrogate model training with
automatic hyperparameter selection (median heuristic).

predict_gp_surrogate() — GP prediction with uncertainty quantification.

surrogate_ensemble_update() — Surrogate-accelerated IES update with
adaptive model/surrogate switching and control-variate bias correction.

adaptive_ensemble_size() — Convergence-aware dynamic ensemble sizing
based on ESS and coefficient of variation diagnostics.

read_pst() / write_pst() — PEST control file I/O.

read_ensemble() / write_ensemble() — Ensemble file I/O (CSV + binary).

pesto_ies(), pesto_glm(), pesto_sweep(), pesto_sensitivity() —
High-level wrappers for PEST++ executables.

create_pest_scenario() — Programmatic scenario builder.

plot_phi() — Objective function convergence plotting.

plot_ensemble() — Prior/posterior parameter distribution comparison.

plot_identifiability() — SVD-based parameter identifiability analysis.

plot_surrogate_diagnostics() — Surrogate IES performance visualisation.
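
For readers unfamiliar with the randomised SVD named in the kernel list above, the following base-R sketch shows the basic Halko-Martinsson-Tropp range-finder idea. It is illustrative only and is not the package's rsvd() kernel: the function name rsvd_sketch(), its arguments (oversample, n_power), and the Gaussian test matrix are assumptions chosen for the example.

```r
# Randomised rank-k SVD: project A onto a random low-dimensional
# subspace, orthonormalise, then take an exact SVD of the small matrix.
rsvd_sketch <- function(A, k, oversample = 10L, n_power = 1L) {
  l <- min(k + oversample, dim(A))
  Omega <- matrix(rnorm(ncol(A) * l), nrow = ncol(A), ncol = l)
  Q <- qr.Q(qr(A %*% Omega))            # orthonormal basis for the sampled range
  for (i in seq_len(n_power)) {
    # Power iterations sharpen the basis when singular values decay slowly.
    Q <- qr.Q(qr(crossprod(A, Q)))      # basis for t(A) %*% Q
    Q <- qr.Q(qr(A %*% Q))
  }
  B <- crossprod(Q, A)                  # small l x ncol(A) matrix
  s <- svd(B)
  idx <- seq_len(k)
  list(
    d = s$d[idx],
    u = (Q %*% s$u)[, idx, drop = FALSE],
    v = s$v[, idx, drop = FALSE]
  )
}

# Toy check on an exactly rank-3 matrix with known singular values.
set.seed(1)
U <- qr.Q(qr(matrix(rnorm(50 * 3), 50, 3)))
V <- qr.Q(qr(matrix(rnorm(30 * 3), 30, 3)))
A <- U %*% diag(c(5, 3, 1)) %*% t(V)

res <- rsvd_sketch(A, k = 3)
recon_err <- max(abs(res$u %*% diag(res$d) %*% t(res$v) - A))
```

On an exactly low-rank matrix the sampled subspace captures the full range, so the leading singular values are recovered to rounding error. For matrices with slowly decaying spectra, the oversampling and power-iteration settings govern the accuracy.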