Compare candidate POPMAPS interpolation surfaces — compare_popmaps

compare_popmaps_surfaces() runs matched tune_popmaps() validation across multiple candidate surfaces. It asks which user-supplied surface best predicts withheld empirical ancestry estimates in the POPMAPS interpolation workflow. This is predictive model selection for ancestry surfaces, not a replacement for upstream SDM, resistance-surface, EEMS/FEEMS, or landscape-genetic hypothesis testing.

Usage

compare_popmaps_surfaces(
  input_locs,
  surfaces,
  empirical_pt_dist = NULL,
  num_sites = NULL,
  num_tested = NULL,
  popmod = NULL,
  threshold = 0,
  validation = c("loo", "spatial_block"),
  n_blocks = 4,
  block_assignments = NULL,
  spatial_block_repeats = 1,
  spatial_block_seed = NULL,
  primary_metric = c("rmse", "mae", "hellinger", "dominant_accuracy",
    "dominant_axis_support", "dominant_probability"),
  dist_prob_func = function(popmod_temp, distance) {
    exp(popmod_temp * distance)
  },
  surface_grid = c("surface_specific", "shared"),
  near_best_tolerance = 0.05,
  quiet = TRUE
)
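
The default dist_prob_func shown in Usage is an exponential function of distance, so a negative popmod gives a weight of 1 at distance 0 that decays toward 0. The following is a minimal base-R sketch of what that implies, not the package's internal code: it solves for the distances at which the weight falls to one half and to one tenth. The helper name decay_distance is hypothetical. For popmod = -0.1 the results agree with the half_distance and ten_pct_distance columns printed in the Examples output.

```r
# Default weighting function, copied from the Usage section.
dist_prob_func <- function(popmod_temp, distance) {
  exp(popmod_temp * distance)
}

# Hypothetical helper: distance at which the weight equals `fraction`,
# obtained by solving exp(popmod * d) = fraction for d.
decay_distance <- function(popmod, fraction) {
  log(fraction) / popmod
}

popmod <- -0.1
half_distance <- decay_distance(popmod, 0.5)
ten_pct_distance <- decay_distance(popmod, 0.1)

# Evaluating the weight at each distance recovers the requested fraction.
dist_prob_func(popmod, half_distance)
dist_prob_func(popmod, ten_pct_distance)
```

A custom dist_prob_func would presumably need the same two-argument signature; any other decay shape changes these reference distances accordingly.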

Arguments

input_locs: A data frame or matrix with sampling location name, longitude, latitude, and one or more ancestry coefficient columns.
surfaces: Named list of candidate surfaces. Each element may be a popmaps_surface object returned by prepare_popmaps_surface() or a list with input_raster or raster, surface, and optional surface_values, mask, barrier, rescale_conductance, and resistance_epsilon entries.
empirical_pt_dist: Numeric vector. Minimum distances required between empirical sites selected for a prediction. Values are kilometers for surface = "G" and least-cost distance units for surface = "C".
num_sites: Integer vector. Candidate pool sizes to evaluate.
num_tested: Integer vector. Numbers of empirical sites used to estimate ancestry coefficients.
popmod: Numeric vector. Distance-decay parameter values to evaluate.
threshold: Numeric scalar. Raster values below this threshold are not scored.
validation: Cross-validation design. "loo" withholds one site at a time. "spatial_block" withholds spatially grouped sites, which is a stricter test of prediction into undersampled regions.
n_blocks: Target number of spatial blocks when validation = "spatial_block" and block_assignments = NULL.
block_assignments: Optional vector assigning each empirical site to a spatial block. If supplied, it must have one value per row in input_locs.
spatial_block_repeats: Number of spatial-block layouts to evaluate when validation = "spatial_block" and block_assignments = NULL. Values greater than one repeat the spatial-block validation with rotated spatial partitions and report repeat-level uncertainty.
spatial_block_seed: Optional random seed for repeated spatial-block layouts. The first repeat uses the deterministic default partition; later repeats use random spatial rotations.
primary_metric: Metric used to select the best parameter combination. dominant_axis_support is the normalized predicted support for the observed dominant ancestry axis. dominant_probability is retained as a legacy alias with the same values.
dist_prob_func: Function defining the relationship between distance and empirical-site contribution.
surface_grid: Character. "surface_specific" derives popmod and empirical_pt_dist from distances measured over each candidate surface when either argument is NULL. "shared" derives missing values from geographic sampling-site distances and uses them for every surface.
near_best_tolerance: Non-negative relative tolerance for labeling surfaces as near-best. The default, 0.05, keeps surfaces whose primary metric score is within 5% of the best score.
quiet: Logical. If FALSE, print a short completion message.
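
block_assignments lets you fix the spatial folds yourself instead of relying on n_blocks. As an illustration only, the sketch below splits sites into quadrants around the median longitude and latitude. This is an assumed, deliberately simple scheme, not the partitioning the package applies when block_assignments = NULL, and the column names lon and lat are taken from the example data.

```r
# Toy sampling sites; column names are assumptions for this sketch.
locs <- data.frame(
  site = paste0("s", 1:4),
  lon = c(0.5, 3.5, 0.5, 3.5),
  lat = c(0.5, 0.5, 3.5, 3.5)
)

# Quadrant blocks: east/west and north/south of the median coordinates.
east <- locs$lon > stats::median(locs$lon)
north <- locs$lat > stats::median(locs$lat)
block_assignments <- 1L + as.integer(east) + 2L * as.integer(north)

# The argument requires one value per row of input_locs.
stopifnot(length(block_assignments) == nrow(locs))
block_assignments
```

The resulting vector could then be supplied as block_assignments together with validation = "spatial_block".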

Value

A popmaps_surface_comparison object with:

summary: One row per surface, ranked by the primary metric.
best: The best-ranked surface row.
near_best: Surfaces within near_best_tolerance of the best score.
support: One-row conservative interpretation of surface support.
tunings: Named list of popmaps_tuning objects, one per surface.
grids: Named list of surface-specific tuning grids used for the comparison.

Examples

ex_raster <- terra::rast(nrows = 4, ncols = 4, xmin = 0, xmax = 4, ymin = 0, ymax = 4)
terra::values(ex_raster) <- 1
locs <- data.frame(
  site = paste0("s", 1:4),
  lon = c(0.5, 3.5, 0.5, 3.5),
  lat = c(0.5, 0.5, 3.5, 3.5),
  axis1 = c(0.9, 0.8, 0.2, 0.1),
  axis2 = c(0.1, 0.2, 0.8, 0.9)
)
surfaces <- list(
  geographic = prepare_popmaps_surface(ex_raster, surface = "G"),
  suitability = prepare_popmaps_surface(
    ex_raster,
    surface = "C",
    surface_values = "suitability"
  )
)
comparison <- compare_popmaps_surfaces(
  input_locs = locs,
  surfaces = surfaces,
  empirical_pt_dist = 0,
  num_sites = 3,
  num_tested = 2,
  popmod = -0.1,
  quiet = TRUE
)
comparison$summary
#>   surface_name surface surface_values     transform rescale_conductance
#> 1   geographic       G           <NA> geometry_only               FALSE
#> 2  suitability       C    suitability      identity               FALSE
#>   primary_metric     score distance_units num_sites num_tested popmod
#> 1           rmse 0.3444388             km         3          2   -0.1
#> 2           rmse 0.3503013  cost_distance         3          2   -0.1
#>   half_distance ten_pct_distance empirical_pt_dist n_combinations n_scored
#> 1      6.931472         23.02585                 0              1        4
#> 2      6.931472         23.02585                 0              1        4
#>   failed_folds n_validation_repeats n_validation_folds rank delta_from_best
#> 1            0                    1                  4    1     0.000000000
#> 2            0                    1                  4    2     0.005862403
#>   percent_from_best
#> 1          0.000000
#> 2          1.702016
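
The ranking columns can be reproduced approximately from the printed scores. The sketch below assumes, consistent with the output above but not confirmed by the source, that delta_from_best is a surface's score minus the best score, that percent_from_best expresses that difference as a percentage of the best score, and that near-best status compares the relative difference with near_best_tolerance. Lower is treated as better because the primary metric here is rmse. The last digits differ slightly from the printed values because the printed scores are rounded.

```r
# Scores copied from the printed summary (rounded to seven digits).
score <- c(geographic = 0.3444388, suitability = 0.3503013)
near_best_tolerance <- 0.05

best <- min(score)  # lower rmse is better
delta_from_best <- score - best
percent_from_best <- 100 * delta_from_best / best

# Assumed rule: keep surfaces whose relative gap is within the tolerance.
near_best <- delta_from_best / best <= near_best_tolerance

data.frame(score, delta_from_best, percent_from_best, near_best)
```

Under these assumptions both surfaces fall inside the default 5% tolerance, so the example data do not separate them.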