Maintenance release: contract-sharpening corrections plus the documentation and metadata that were prepared for v0.4.0 but not released. No new public exports. The two behaviour changes below are deliberate fail-closed corrections to existing exports; user code that depended on the silent failure mode will need to be updated.
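To make the fail-closed contract concrete, here is a minimal base-R sketch of the difference between silently coercing an unknown level to NA and failing closed. It is not masque's implementation: `level_map`, `map_silent()` and `map_fail_closed()` are hypothetical names used only for illustration, and the real level-map structure inside a recipe may differ.

```r
# Hypothetical level map: original level -> synthetic alias.
level_map <- c(control = "trt_001", low = "trt_002", high = "trt_003")

# Silent mode: levels missing from the map fall through to NA.
map_silent <- function(x, map) {
  unname(map[as.character(x)])
}

# Fail-closed mode: unknown non-NA levels raise an error that lists them.
map_fail_closed <- function(x, map) {
  x <- as.character(x)
  unknown <- setdiff(unique(x[!is.na(x)]), names(map))
  if (length(unknown) > 0) {
    stop("values not present in the level map: ",
         paste(unknown, collapse = ", "), call. = FALSE)
  }
  unname(map[x])
}

x <- c("control", "high", NA, "medium")  # "medium" is a new, unmapped level

silent <- map_silent(x, level_map)       # the unknown level quietly becomes NA
res <- tryCatch(
  map_fail_closed(x, level_map),
  error = function(e) conditionMessage(e)
)
print(silent)
print(res)
```

In the silent variant the genuine NA and the unmapped level are indistinguishable afterwards, which is the failure mode the corrections in this release are meant to close off.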
apply_recipe() and unmask() now error when a non-NA value is
not present in the recipe's level map. Previously the row was
silently coerced to NA, which could quietly poison downstream
model matrices. Schema drift or a new treatment level in the input
now fails closed with the offending values listed.

apply_recipe() now verifies that the NA mask of original matches
the recipe's recorded integrity_fp. A mismatch errors with
guidance. The new check_integrity parameter (default TRUE) provides
an escape hatch (check_integrity = FALSE) for workflows where the
missingness has legitimately changed since the recipe was built.

unmask(x, rec) now passes through atomic numeric, integer,
logical, and Date / POSIXct vectors unchanged, matching the
documented numeric pass-through contract. Previously these inputs
errored when the recipe held no level maps.

audit_mask()'s exact_match_pct now divides by the number of
jointly-observed comparable cells, not by nrow(df). Columns
dominated by NAs no longer underreport leakage. The audit tibble
gains a new comparable_n column for interpretability.

synthesise_geospatial() now uses original's NA mask as the
authority for cell-level preservation (previously used synth's
mask, which could let synthesised coordinates leak into rows that
the original had missing). Also adds an nrow(synth) == nrow(original)
check.

The roles_validate() error message for the multiple-treatment case is
refreshed: drops the stale "v0.2 / deferred to v0.3" wording and
guides the user to either edit the roles tibble or call
propose_roles(df, detect = FALSE) for byte-stable v0.2.x
behaviour.

mask()'s
roxygen removed.

recipe_io.R doc and the recipe_anatomy vignette reword the
include_simulator = TRUE no-op without pinning it to v0.2 / v0.3.

roadmap vignette restructured around feature areas. The hard
version pins ("v0.3", "v0.4") are gone — v0.3 / v0.4 shipped
different features from the prior roadmap, so the pins were stale.

getting_started vignette: "vignette('roadmap') — what's planned
for v0.3+" replaced by "features deliberately deferred from the
current release".

test-mask-end-to-end.R,
test-mask-roundtrip-integration.R) call
propose_roles(df, detect = FALSE) so the suite is clean against
the maintainer's local fixtures while the multi-treatment design
decision remains on the roadmap.

expect_warning("HIGH leakage")
so future warning regressions remain visible.

unmask(); fail-closed unknown-level handling in
apply_recipe() and unmask(); integrity_fp enforcement
(positive, negative, and the check_integrity = FALSE escape
hatch); synthesise_geospatial() NA-mask source authority and
row-count check.

Adds first-class geospatial synthesis. One new export, no breaking changes to the v0.3.0 surface.

synthesise_geospatial(synth, original, anchor_col, lat_col, lon_col, anchor_centroids, site_spread_deg, jitter_deg, seed) —
re-anchors the latitude / longitude columns in a masqued data frame
at user-supplied centroids, while preserving (a) the count of
distinct sites per anchor level, (b) the per-site replication
distribution, and (c) within-site tight clustering with
between-site spread. The original positions are never published;
the function reads them only to count distinct sites. NA pattern
in coordinates is preserved cell-by-cell. RNG hygiene via
withr::local_preserve_seed().
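The re-anchoring idea described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: `reanchor_sketch()` is a hypothetical name, the `centroids` data frame layout (columns `anchor`, `lat`, `lon`) is assumed, and uniform draws for the site spread and jitter are a stand-in for whatever distributions synthesise_geospatial() actually uses. Unlike the real function, the sketch reseeds the global RNG and does not restore it.

```r
reanchor_sketch <- function(synth, original, anchor_col, lat_col, lon_col,
                            centroids, site_spread_deg = 0.5,
                            jitter_deg = 0.001, seed = 1) {
  stopifnot(nrow(synth) == nrow(original))
  set.seed(seed)  # sketch only: reseeds the global RNG, no restore

  lat <- original[[lat_col]]
  lon <- original[[lon_col]]
  anchor <- as.character(original[[anchor_col]])
  new_lat <- rep(NA_real_, nrow(original))
  new_lon <- rep(NA_real_, nrow(original))

  for (a in unique(anchor[!is.na(anchor)])) {
    rows <- which(!is.na(anchor) & anchor == a)
    # Original positions are read only to group rows into distinct sites.
    key <- paste(lat[rows], lon[rows])
    site_id <- match(key, unique(key))
    n_sites <- max(site_id)

    ctr <- centroids[centroids$anchor == a, ]
    stopifnot(nrow(ctr) == 1)

    # One synthetic centre per distinct site, spread around the centroid.
    site_lat <- ctr$lat + runif(n_sites, -site_spread_deg, site_spread_deg)
    site_lon <- ctr$lon + runif(n_sites, -site_spread_deg, site_spread_deg)

    # Rows from the same site cluster tightly around that site's centre.
    new_lat[rows] <- site_lat[site_id] + runif(length(rows), -jitter_deg, jitter_deg)
    new_lon[rows] <- site_lon[site_id] + runif(length(rows), -jitter_deg, jitter_deg)
  }

  # Re-impose the original NA mask cell by cell.
  new_lat[is.na(lat)] <- NA_real_
  new_lon[is.na(lon)] <- NA_real_
  synth[[lat_col]] <- new_lat
  synth[[lon_col]] <- new_lon
  synth
}

original <- data.frame(
  state = c("A", "A", "A", "B", "B", "B"),
  lat   = c(10.1, 10.1, 10.9, 20.5, NA, 20.5),
  lon   = c(50.2, 50.2, 50.7, 60.3, 60.3, 60.3)
)
synth <- data.frame(state = original$state, lat = NA_real_, lon = NA_real_)
centroids <- data.frame(anchor = c("A", "B"), lat = c(0, 40), lon = c(0, -100))

out <- reanchor_sketch(synth, original, "state", "lat", "lon", centroids)
print(out)
```

Rows 1 and 2 share an original site, so their synthetic coordinates stay within the jitter of each other, while the missing latitude in row 5 remains missing and its observed longitude is replaced.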
Motivated by the masque release walkthrough, where state-centroid

cran-comments.md for first-submission notes.

.github/workflows/R-CMD-check.yaml (r-lib standard matrix:
Linux release / devel / oldrel-1, macOS release, Windows release).

R CMD check --as-cran reports 0 errors, 0 warnings, 2 NOTEs
(new-submission boilerplate and an environmental note about the local HTML Tidy).

R/synthesise_geospatial.R carries the full roxygen doc + a
\donttest{} example.

Adds automatic experimental-design detection and a sanity-check visualisation. New public surface: 3 exports, 1 vignette.

detect_design(df, roles = NULL, interactive = FALSE, threshold = 0.5, tie_delta = 0.02) — returns an S7 design_summary with the most
likely design class (CRD, RCBD, IBD/alpha-lattice,
row-column, split-plot, factorial, or none), per-rule scores,
evidence, and a recommended_roles tibble. Rule engine, not ML.

design_summary — S7 class wrapping the detection result.
print() is cli-styled and surfaces top-3 alternates so the user
can see how confident the call was. Slots include class_label,
treatment_col, block_cols, whole_plot_col, sub_plot_col,
spatial_cols, scores, evidence, recommended_roles,
candidates, warnings.

plot_design_summary(x, df, engine = c("base", "ggplot2")) — also
registered as an S7 plot() method. Base-graphics sanity-check
visualisation dispatched per class: replication tile, spatial
layout, factor-nesting tree, treatment-frequency + NA-pattern.

propose_roles(df) now defaults to detect = TRUE. The
detected design's recommended_roles are overlaid on the name-based
proposal, promoting structurally-identified treatments and blocks
even when their column names don't match the design / treatment
regexes (e.g., gen in an alpha-lattice). The design_summary is
stashed as attr(roles, "design"). Pass detect = FALSE to
recover the v0.2.x byte-stable behaviour.

mask() synthesis behaviour is
unchanged. Only propose_roles() consumes detection output, and
only as role hints.

[0, 1] with evidence; the orchestrator picks
the top above threshold, breaking ties in favour of the simpler
design (CRD < RCBD < factorial < IBD < row-column < split-plot).

desplot::desplot() or ggplot2-based packages.

agridat — canonical fixtures for tests and the new vignette.

ggplot2 — optional plot engine via engine = "ggplot2"; base
graphics is the default and the fallback.

detect = FALSE for toy fixtures.

First public release of masque — a structurally faithful development
surrogate for tabular datasets. Successor to the unreleased synthPR
v0.1.0 (folder-scanning multi-file API), rewritten around a single-file
data-frame-first interface and a round-trippable recipe object.

masque is not an anonymisation or differential-privacy tool. It produces
development surrogates suitable for building and debugging pipelines, and
a private recipe that re-targets a pipeline built against the synthetic
clone back onto the original data. See vignette("confidentiality") for
the threat model.

design, treatment,
outcome, covariate, ignore. Multi-outcome supported. Date /
POSIX columns and PII-pattern column names default to ignore.

local — realistic dev surrogate for the data owner. Column names
and level vocabularies preserved. Treatment-level permutation is
opt-in. Issues a load-time warning when the synthetic is extracted.

collaborate — give the synthetic to a collaborator while keeping
the recipe private. Treatment + categorical-covariate levels are
opaque-aliased (trt_001, <col>_L01). Numeric draws are
jittered within column resolution; integer columns are
stochastically rounded. ignore columns are dropped.
audit_mask() runs automatically and warns on HIGH leakage.

propose_roles(df) — heuristics-driven role tibble; the user edits
it and passes it to mask().

roles_validate(roles, df) — fail-closed structural + semantic check.

mask(df, roles, mode, seed, ...) — returns an S7 masque object.

synthetic(m) / recipe(m) — accessors that hide S7.

apply_recipe(original, recipe) — forward-translates original-namespace
data into the synthetic namespace.

unmask(x, recipe, column = NULL) — inverse on a data frame or atomic
vector; round-trips a pipeline back to the original.

save_recipe(rec, path, include_simulator = FALSE) /
read_recipe(path) — runtime-minimal .rds persistence (under 10 KB
on a 17,000-row, 38-column MET fixture).

audit_mask(m, original = NULL, print = TRUE) — first-class leakage
audit returning the per-column severity tibble.

reveal_maps(recipe) — explicit, banner-fenced unmasked-map reveal
(never automatic; print(recipe) is redacted by default).

withr::with_seed / local_preserve_seed);
mask() does not mutate the caller's .Random.seed.

recipe is runtime-minimal by default — no copula matrix or raw
marginals stored. SHA-256 NA-mask fingerprint provided as an
integrity check, not a privacy primitive.

print(recipe) redacted by default; reveal_maps() is the only
unmasked path.

audit_mask() flags retained PII-pattern columns, unaliased
treatments under collaborate, rare-level leakage, and numeric
exact-match rates above the per-role thresholds.

getting_started, confidentiality,
recipe_anatomy, roadmap.

inst/extdata/john_alpha.csv — 72-row, 7-column public fixture
derived from agridat::john.alpha (John 1987, alpha design).

Predecessor synthPR v0.1.0 (folder-scanning, multi-file) is archived
at _legacy/synthPR_v0.1.0/ in the development workspace and is not
distributed.
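As an illustration of the stochastic rounding mentioned for integer columns in collaborate mode, here is a minimal base-R sketch. It assumes the common definition (round up with probability equal to the fractional part, so the result is unbiased in expectation); `stochastic_round()` is a hypothetical helper, and masque's own rounding may differ in detail.

```r
# Round each value down or up at random, rounding up with probability
# equal to its fractional part. NA inputs stay NA.
stochastic_round <- function(x) {
  lower <- floor(x)
  frac <- x - lower
  up <- runif(length(x)) < frac  # always FALSE for whole numbers (frac == 0)
  as.integer(lower + up)
}

set.seed(42)
x <- c(2.0, 2.25, 7.9, NA, -1.5)
r <- stochastic_round(x)
print(r)

# Averaged over many draws, the rounded values stay close to the input.
avg <- mean(stochastic_round(rep(2.25, 1e5)))
print(avg)
```

Whole numbers pass through unchanged, and every other value lands on one of its two neighbouring integers, which keeps the column integer-typed without systematically shifting its mean.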