| Title: | Methods for Optimizing Scales of Effect |
|---|---|
| Description: | A tool for optimizing scales of effect when modeling ecological processes in space. Specifically, the scale parameter of a distance-weighted kernel distribution is identified for all environmental layers included in the model. Includes functions to assist in model selection, model evaluation, efficient transformation of raster surfaces using fast Fourier transformation, and projecting models. For more details see Peterman (2026) <doi:10.1007/s10980-025-02267-x>. |
| Authors: | Bill Peterman [aut, cre]
|
| Maintainer: | Bill Peterman <[email protected]> |
| License: | GPL-3 |
| Version: | 0.6.30 |
| Built: | 2026-06-02 21:56:09 UTC |
| Source: | https://github.com/wpeterman/multiscaler |
This package is for optimizing scales of effect when modeling ecological processes in space. Specifically, the scale parameter of a distance-weighted kernel distribution is identified for all environmental layers included in the model.
Maintainer: Bill Peterman [email protected] (ORCID)
Authors:
Bill Peterman [email protected] (ORCID)
Useful links:
Report bugs at https://github.com/wpeterman/multiScaleR/issues
Creates a model selection table ranked by AIC or AICc for a
list of fitted models. Mixed lists containing both multiScaleR
objects and plain model objects (e.g., glm) are supported, allowing
comparison of scale-optimized models against fixed-scale alternatives.
aic_tab(mod_list, AICc = TRUE, mod_names = NULL, verbose = FALSE, ...)aic_tab(mod_list, AICc = TRUE, mod_names = NULL, verbose = FALSE, ...)
mod_list |
List containing fitted |
AICc |
Logical. If |
mod_names |
Optional character vector of model names. Length must equal
|
verbose |
Logical. If |
... |
Additional arguments (not used). |
The table is constructed using aictabCustom from
the AICcmodavg package. For multiScaleR objects, the optimized
(refitted) model stored in opt_mod is used for log-likelihood and
parameter count. The sigma parameter (and shape for "expow") is added
to K automatically. For plain model objects in a mixed list, sigma is not
added to K.
A data frame of class "aictab" (from the AICcmodavg
package) sorted by ascending AIC(c), containing:
ModnamesModel name (from mod_names or auto-generated).
KNumber of estimated parameters (regression coefficients +
sigma, + shape for "expow" kernel).
AICc or AIC
Information criterion value.
Delta_AICc or Delta_AIC
Difference from the top model.
ModelLikRelative likelihood (exp(-0.5 * Delta)).
AICcWt or AICWt
Akaike weight.
LLLog-likelihood.
Cum.WtCumulative Akaike weight.
Bill Peterman
## Simulate data set.seed(555) points <- vect(cbind(c(5,7,9,11,13), c(13,11,9,7,5))) mat_list <- list(r1 = rast(matrix(rnorm(20^2), nrow = 20)), r2 = rast(matrix(rnorm(20^2), nrow = 20))) rast_stack <- rast(mat_list) kernel_inputs <- kernel_prep(pts = points, raster_stack = rast_stack, max_D = 5, kernel = 'gaussian', sigma = NULL) ## Example response data y <- rnorm(5) ## Create data frame with raster variables dat <- data.frame(y = y, kernel_inputs$kernel_dat) mod1 <- glm(y ~ r1, data = dat) mod2 <- glm(y ~ r2, data = dat) mod3 <- glm(y ~ r1 + r2, data = dat) ## NOTE: This code is only for demonstration ## Optimization results will have no meaning opt_mod1 <- multiScale_optim(fitted_mod = mod1, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) opt_mod2 <- multiScale_optim(fitted_mod = mod2, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) opt_mod3 <- multiScale_optim(fitted_mod = mod3, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) ## AIC table mod_list <- list(opt_mod1, opt_mod2, opt_mod3) aic_tab(mod_list = mod_list, AICc = FALSE) ## AICc table with specified names aic_tab(mod_list = mod_list, AICc = TRUE, mod_names = c('mod1', 'mod2', 'mod3'))## Simulate data set.seed(555) points <- vect(cbind(c(5,7,9,11,13), c(13,11,9,7,5))) mat_list <- list(r1 = rast(matrix(rnorm(20^2), nrow = 20)), r2 = rast(matrix(rnorm(20^2), nrow = 20))) rast_stack <- rast(mat_list) kernel_inputs <- kernel_prep(pts = points, raster_stack = rast_stack, max_D = 5, kernel = 'gaussian', sigma = NULL) ## Example response data y <- rnorm(5) ## Create data frame with raster variables dat <- data.frame(y = y, kernel_inputs$kernel_dat) mod1 <- glm(y ~ r1, data = dat) mod2 <- glm(y ~ r2, data = dat) mod3 <- glm(y ~ r1 + r2, data = dat) ## NOTE: This code is only for demonstration ## Optimization results will have no meaning opt_mod1 <- multiScale_optim(fitted_mod = mod1, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) opt_mod2 <- multiScale_optim(fitted_mod = mod2, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) opt_mod3 <- multiScale_optim(fitted_mod = mod3, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) ## AIC table mod_list <- list(opt_mod1, opt_mod2, opt_mod3) aic_tab(mod_list = mod_list, AICc = FALSE) ## AICc table with specified names aic_tab(mod_list = mod_list, AICc = TRUE, mod_names = c('mod1', 'mod2', 'mod3'))
Creates a model selection table ranked by BIC for a list of
fitted models. Mixed lists containing both multiScaleR objects and
plain model objects are supported.
bic_tab(mod_list, mod_names = NULL, verbose = FALSE, ...)bic_tab(mod_list, mod_names = NULL, verbose = FALSE, ...)
mod_list |
List containing fitted |
mod_names |
Optional character vector of model names. Length must equal
|
verbose |
Logical. If |
... |
Additional arguments (not used). |
The table is constructed using bictabCustom from
the AICcmodavg package. BIC penalizes model complexity more heavily
than AIC as sample size grows, and is often preferred when the goal is
identifying the single best-supported model rather than model averaging.
Sigma (and shape for "expow") is counted in K.
A data frame of class "bictab" (from the AICcmodavg
package) sorted by ascending BIC, containing:
ModnamesModel name.
KNumber of estimated parameters (regression coefficients +
sigma, + shape for "expow" kernel).
BICBayesian information criterion value.
Delta_BICDifference from the top model.
BICWtBIC weight.
Cum.WtCumulative BIC weight.
LLLog-likelihood.
Bill Peterman
## Simulate data set.seed(555) points <- vect(cbind(c(5,7,9,11,13), c(13,11,9,7,5))) mat_list <- list(r1 = rast(matrix(rnorm(20^2), nrow = 20)), r2 = rast(matrix(rnorm(20^2), nrow = 20))) rast_stack <- rast(mat_list) kernel_inputs <- kernel_prep(pts = points, raster_stack = rast_stack, max_D = 5, kernel = 'gaussian', sigma = NULL) ## Example response data y <- rnorm(5) ## Create data frame with raster variables dat <- data.frame(y = y, kernel_inputs$kernel_dat) mod1 <- glm(y ~ r1, data = dat) mod2 <- glm(y ~ r2, data = dat) mod3 <- glm(y ~ r1 + r2, data = dat) ## NOTE: This code is only for demonstration ## Optimization results will have no meaning opt_mod1 <- multiScale_optim(fitted_mod = mod1, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) opt_mod2 <- multiScale_optim(fitted_mod = mod2, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) opt_mod3 <- multiScale_optim(fitted_mod = mod3, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) ## BIC table mod_list <- list(opt_mod1, opt_mod2, opt_mod3) bic_tab(mod_list = mod_list) ## BIC table with specified names bic_tab(mod_list = mod_list, mod_names = c('mod1', 'mod2', 'mod3'))## Simulate data set.seed(555) points <- vect(cbind(c(5,7,9,11,13), c(13,11,9,7,5))) mat_list <- list(r1 = rast(matrix(rnorm(20^2), nrow = 20)), r2 = rast(matrix(rnorm(20^2), nrow = 20))) rast_stack <- rast(mat_list) kernel_inputs <- kernel_prep(pts = points, raster_stack = rast_stack, max_D = 5, kernel = 'gaussian', sigma = NULL) ## Example response data y <- rnorm(5) ## Create data frame with raster variables dat <- data.frame(y = y, kernel_inputs$kernel_dat) mod1 <- glm(y ~ r1, data = dat) mod2 <- glm(y ~ r2, data = dat) mod3 <- glm(y ~ r1 + r2, data = dat) ## NOTE: This code is only for demonstration ## Optimization results will have no meaning opt_mod1 <- multiScale_optim(fitted_mod = mod1, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) opt_mod2 <- multiScale_optim(fitted_mod = mod2, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) opt_mod3 <- multiScale_optim(fitted_mod = mod3, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) ## BIC table mod_list <- list(opt_mod1, opt_mod2, opt_mod3) bic_tab(mod_list = mod_list) ## BIC table with specified names bic_tab(mod_list = mod_list, mod_names = c('mod1', 'mod2', 'mod3'))
Example count data to be used for optimizing scales of effect
data(count_data)data(count_data)
A data frame with 75 rows and 2 columns. Data were simulated from a Poisson distribution with an intercept of 0.5, a 'hab' effect of 0.75, and scale of effect (sigma) of 75.
–> Simulated counts at spatial locations
–> Scaled and centered weighted mean values from the 'hab' raster at each of the 'pts'
Returns structured warning/diagnostic information stored on fitted
multiScaleR objects. Diagnostics are populated automatically during
multiScale_optim and flag potential issues with the
optimization result.
diagnostics(object, ...) ## S3 method for class 'multiScaleR' diagnostics(object, ...)diagnostics(object, ...) ## S3 method for class 'multiScaleR' diagnostics(object, ...)
object |
An object to inspect. Must be of class |
... |
Additional arguments passed to methods. |
All three diagnostics are evaluated automatically at the end of
multiScale_optim. Console warnings are printed when any
diagnostic is triggered. diagnostics() provides programmatic access
to the same information without re-running the model.
When max_distance is triggered, consider re-running
kernel_prep with a larger max_D value — the suggested
minimum is stored in suggested_max_D.
When sigma_precision is triggered, the variable may not have a
meaningful scale of effect, or the data may be insufficient to estimate it
precisely. Interpret affected covariates with caution.
A named list with up to three elements:
max_distanceA list describing whether the estimated scale of
effect approaches or exceeds max_D. Fields include:
triggered (logical), variables (names of affected
covariates), effective_distance (estimated 90% kernel distance
per covariate), max_D (the limit used), ratio
(max_D / effective_distance), and suggested_max_D (a
recommended minimum max_D value). Triggered when
max_D / effective_distance < 2.
sigma_precisionA list describing whether sigma estimates are
imprecise. Fields include: triggered (logical), variables
(names of affected covariates), estimate (sigma means),
se (standard errors), and se_to_mean (ratio of SE to
mean). Triggered when SE / Mean >= 0.5 for any covariate. NULL
when no precision concern was flagged.
shape_precisionIdentical structure to sigma_precision
but for the shape parameter of the exponential power kernel. NULL
unless kernel = "expow" was used and a precision concern arose.
data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Access diagnostics diag <- diagnostics(opt) diag$max_distance$triggered diag$sigma_precisiondata('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Access diagnostics diag <- diagnostics(opt) diag$max_distance$triggered diag$sigma_precision
Estimate the memory footprint of a kernel_prep object and
provide a conservative recommendation for parallel worker counts. The helper
inspects the stored kernel inputs, optionally builds the same optimization
context used by multiScale_optim, detects the number of
physical CPU cores, and estimates a maximum n_cores value while
leaving a user-defined number of cores free.
estimate_multiscale_ram( kernel_inputs, fitted_mod = NULL, join_by = NULL, refit_fn = NULL, n_cores = NULL, PSOCK = (.Platform$OS.type != "unix"), safety_factor = 1.5, ram_fraction = 0.75, leave_free = 2L )estimate_multiscale_ram( kernel_inputs, fitted_mod = NULL, join_by = NULL, refit_fn = NULL, n_cores = NULL, PSOCK = (.Platform$OS.type != "unix"), safety_factor = 1.5, ram_fraction = 0.75, leave_free = 2L )
kernel_inputs |
A |
fitted_mod |
Optional fitted model object. When supplied, the helper
also measures the optimization context that would be serialized to
parallel workers during |
join_by |
Optional data frame passed through to
|
refit_fn |
Optional custom refit function passed through to
|
n_cores |
Optional positive integer. When supplied, the helper also reports the estimated peak memory for that specific worker count. When omitted, the reported peak estimate uses the recommended maximum worker count. |
PSOCK |
Logical. Should parallel runs be budgeted as PSOCK workers.
Default: |
safety_factor |
Numeric scalar |
ram_fraction |
Numeric scalar in |
leave_free |
Integer scalar |
The estimate is advisory rather than exact. It is based on
utils::object.size() and therefore reflects
the size of stored R objects, not all transient allocations made by
exactextractr, model refits, or BLAS/LAPACK.
The recommendation is intentionally conservative. Even on Unix-like systems where forking can share memory copy-on-write, worker counts are budgeted as though each worker may require its own copy of the optimization payload.
If fitted_mod is omitted, the RAM recommendation is based on
kernel_inputs alone and may understate the true optimization payload.
A list of class "multiScaleR_ram" with:
requested_n_coresWorker count used for the
peak_parallel_estimate.
recommended_n_coresConservative recommended maximum worker count after applying both CPU and RAM limits.
max_cores_by_cpuMaximum worker count allowed by detected
physical CPU cores after leaving leave_free cores unused.
max_cores_by_ramMaximum worker count allowed by the RAM
budget, or NA if total system RAM could not be detected.
backendCharacter string identifying the assumed parallel
backend budget: "PSOCK" or "fork".
systemNamed list describing detected physical cores, logical cores, total RAM, usable RAM, and the reservation settings.
point_summaryOne-row data frame summarizing the number of points, extracted cells per point, and source raster layer count.
component_bytesData frame reporting estimated byte sizes for the major stored objects and worker bundles.
notesCharacter vector with interpretation notes and any detection warnings.
library(terra) pts <- vect(cbind(c(3, 5, 7), c(7, 5, 3))) r <- rast(matrix(rnorm(100), nrow = 10)) names(r) <- "hab" kernel_inputs <- kernel_prep( pts = pts, raster_stack = r, max_D = 2, kernel = "gaussian", verbose = FALSE ) estimate_multiscale_ram(kernel_inputs)library(terra) pts <- vect(cbind(c(3, 5, 7), c(7, 5, 3))) r <- rast(matrix(rnorm(100), nrow = 10)) names(r) <- "hab" kernel_inputs <- kernel_prep( pts = pts, raster_stack = r, max_D = 2, kernel = "gaussian", verbose = FALSE ) estimate_multiscale_ram(kernel_inputs)
Example habitat raster for optimizing scales of effect
A binary SpatRaster object
–> A binary raster
hab <- terra::rast(system.file("extdata", "hab.tif", package = 'multiScaleR'))hab <- terra::rast(system.file("extdata", "hab.tif", package = 'multiScaleR'))
Estimates the distance at which a specified cumulative proportion
of the kernel density function is reached. Can be used with a fitted
multiScaleR object to report optimized scale distances, or with
manually supplied kernel parameters to explore kernel behavior.
kernel_dist(model, prob = 0.9, ...)kernel_dist(model, prob = 0.9, ...)
model |
A fitted |
prob |
Numeric between 0 and 1 (exclusive). Cumulative kernel density
threshold used to define the effective distance. Default: |
... |
Additional parameters used when
|
The effective distance depends on both the kernel type and the scale parameter
sigma:
Gaussian: uses the inverse normal CDF, so the 90% distance is approximately 1.65 sigma.
Negative exponential: uses -sigma * log(1 - prob).
Fixed buffer: returns sigma * prob (the fraction of
the buffer radius).
Exponential power: integrates the density numerically; both
sigma and beta (shape) must be specified.
Confidence intervals for the fitted model case are derived from the
Hessian-based standard errors (or profile-likelihood intervals when
profile_sigma has been run and stored on the object).
When model is provided: a data frame with one row per optimized
covariate and three columns — Mean (distance at the estimated sigma),
Lower (distance at the lower 95% CI of sigma), and Upper
(distance at the upper 95% CI of sigma). Values are rounded to two decimal
places.
When kernel parameters are supplied directly via ...: a single numeric
value giving the distance at which the cumulative kernel density first reaches
prob.
## Using package data data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) ## Optimize scale opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Uses of `kernel_dist` kernel_dist(model = opt) kernel_dist(model = opt, prob = 0.95) kernel_dist(sigma = 500, kernel = 'gaussian', prob = 0.95) kernel_dist(sigma = 100, prob = 0.975, kernel = "exp") kernel_dist(sigma = 100, prob = 0.95, kernel = "expow", beta = 1.5) kernel_dist(sigma = 100, kernel = "fixed")## Using package data data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) ## Optimize scale opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Uses of `kernel_dist` kernel_dist(model = opt) kernel_dist(model = opt, prob = 0.95) kernel_dist(sigma = 500, kernel = 'gaussian', prob = 0.95) kernel_dist(sigma = 100, prob = 0.975, kernel = "exp") kernel_dist(sigma = 100, prob = 0.95, kernel = "expow", beta = 1.5) kernel_dist(sigma = 100, kernel = "fixed")
Prepares the data inputs required for multiscale kernel
optimization. Extracts raster values within a buffer around each point,
computes pairwise distances, and calculates initial kernel-weighted covariate
values. The result is passed directly to multiScale_optim.
kernel_prep( pts, raster_stack, max_D, kernel = c("gaussian", "exp", "expow", "fixed"), scale_vars = NULL, sigma = NULL, shape = NULL, projected = TRUE, progress = FALSE, verbose = TRUE )kernel_prep( pts, raster_stack, max_D, kernel = c("gaussian", "exp", "expow", "fixed"), scale_vars = NULL, sigma = NULL, shape = NULL, projected = TRUE, progress = FALSE, verbose = TRUE )
pts |
Spatial point locations as a |
raster_stack |
One or more raster layers of class |
max_D |
Positive numeric. The maximum radius (in the same units as the
projection of |
kernel |
Kernel function to be used for weighting raster values by distance. One of:
|
scale_vars |
Optional variable specifications created with
|
sigma |
Optional numeric vector of initial sigma values for
optimization. Length must equal the number of covariates with
|
shape |
Optional numeric vector of initial shape values when
|
projected |
Logical. Whether |
progress |
Logical. Print progress bars to the console during extraction
and distance calculations. Default: |
verbose |
Logical. Print status messages during preparation. Default:
|
Point locations and raster layers must share a defined CRS and both must be
projected. All units (including max_D and any user-supplied
sigma) must be in the same linear unit as the projection (typically
metres or feet).
The max_D buffer should be large enough to encompass the plausible
range of the true scale of effect. A common rule of thumb is to set
max_D to at least 2–3 times the largest expected sigma. After running
multiScale_optim, the max_distance diagnostic will warn
if the estimated scale approaches max_D.
Initial sigma values do not need to be precise — the optimizer will
refine them. Provide explicit starting values only if the default
(max_D / 2) leads to convergence problems.
Row names from pts are preserved throughout the returned object so
that downstream model data frames can be joined back to the original point
order.
A list of class "multiScaleR_data" containing:
kernel_datData frame of scaled (mean-centered, unit-variance)
initial kernel-weighted covariate values, one row per point and one column
per covariate. Row names match pts row names or sequential integers.
Use this directly to fit the initial model passed to
multiScale_optim.
d_listNamed list (one element per point) of numeric distance vectors from the point to every raster cell within the buffer.
raw_covNamed list (one element per point) of sparse matrices
containing raw raster cell values within the buffer, aligned with
d_list.
kernelCharacter string identifying the kernel used.
sigmaNumeric vector of initial sigma values on the internal (scaled) parameter space.
shapeNumeric vector of initial shape values, or NULL.
min_DNumeric. Approximate minimum inter-cell distance, used as the lower bound for sigma during optimization.
max_DNumeric. The max_D value supplied by the user.
n_covsInteger. Number of covariates with scale being optimized.
unit_convNumeric. Internal distance scaling factor (equals
max_D).
scale_varsData frame of class "multiScaleR_vars"
describing all covariate specifications.
resolutionNumeric. Raster cell resolution.
n_colsInteger. Number of raster columns (used for adjacency landscape metrics).
scl_paramsNamed list with elements mean and sd
— the centering and scaling parameters applied to kernel_dat.
Stored for use by kernel_scale.raster when
scale_center = TRUE.
library(terra) pts <- vect(cbind(c(3,5,7), c(7,5,3))) mat_list <- list(r1 = rast(matrix(rnorm(100), nrow = 10)), r2 = rast(matrix(rnorm(100), nrow = 10))) rast_stack <- rast(mat_list) kernel_inputs <- kernel_prep(pts = pts, raster_stack = rast_stack, max_D = 2, kernel = 'gaussian', sigma = NULL)library(terra) pts <- vect(cbind(c(3,5,7), c(7,5,3))) mat_list <- list(r1 = rast(matrix(rnorm(100), nrow = 10)), r2 = rast(matrix(rnorm(100), nrow = 10))) rast_stack <- rast(mat_list) kernel_inputs <- kernel_prep(pts = pts, raster_stack = rast_stack, max_D = 2, kernel = 'gaussian', sigma = NULL)
Applies a kernel smoothing function to one or more raster
layers, producing a spatially weighted mean (or landscape metric) at the
scale identified by multiScale_optim. Primarily used to
generate prediction rasters for spatial model projection.
kernel_scale.raster( raster_stack, sigma = NULL, multiScaleR = NULL, shape = NULL, kernel = c("gaussian", "exp", "expow", "fixed"), scale_vars = NULL, pct_wt = 0.975, fft = TRUE, scale_center = FALSE, clamp = FALSE, pct_mx = 0, na.rm = TRUE, verbose = TRUE, ... )kernel_scale.raster( raster_stack, sigma = NULL, multiScaleR = NULL, shape = NULL, kernel = c("gaussian", "exp", "expow", "fixed"), scale_vars = NULL, pct_wt = 0.975, fft = TRUE, scale_center = FALSE, clamp = FALSE, pct_mx = 0, na.rm = TRUE, verbose = TRUE, ... )
raster_stack |
A |
sigma |
Numeric vector of kernel scale parameter(s) in the same units
as the raster projection (e.g., metres). One value per raster layer.
Ignored when |
multiScaleR |
A fitted object of class |
shape |
Numeric vector of shape parameters for the exponential power
kernel ( |
kernel |
Character. Kernel function used for smoothing. One of
|
scale_vars |
Optional variable specifications created with
|
pct_wt |
Numeric between 0 and 1 (exclusive). Cumulative kernel density
cutoff used to determine the focal window size for smoothing. A larger
value (e.g., |
fft |
Logical. If |
scale_center |
Logical. If |
clamp |
Logical. If |
pct_mx |
Numeric between -0.99 and 0.99. When |
na.rm |
Logical. If |
verbose |
Logical. Print layer-level progress messages. Default:
|
... |
Reserved for deprecated arguments. Currently only |
Typical usage after running multiScale_optim:
opt_hab <- kernel_scale.raster(raster_stack = hab, multiScaleR = opt)
plot(c(hab, opt_hab))
## Scale and center for prediction
opt_hab_sc <- kernel_scale.raster(raster_stack = hab,
multiScaleR = opt,
scale_center = TRUE)
preds <- terra::predict(opt_hab_sc, opt$opt_mod, type = "response")
FFT vs. focal smoothing
The FFT convolution (fft = TRUE) is substantially faster for large
rasters or wide kernels. It produces minor edge artefacts near raster
boundaries — typically within one kernel-width of the edge. For analyses
focused on interior areas this is usually negligible. Use
fft = FALSE for exact focal smoothing when edge accuracy matters.
Dummy rasters for site-level covariates
When a fitted multiScaleR object is supplied, the function inspects
the model frame to find any predictors that are not represented by raster
layers (e.g., survey effort, habitat type measured in the field). These
cannot be projected spatially and are therefore assigned constant zero
rasters, which correspond to the reference value for centered and scaled
covariates. These dummy layers are required for terra::predict to
work but do not represent real spatial variation. Replace them manually for
non-zero projection scenarios. Categorical predictors are skipped with a
warning.
A SpatRaster with one layer per covariate defined in
scale_vars (or per raster layer when scale_vars is not used).
Layer names match the covariate names from the model. When a fitted
multiScaleR object is provided and the model contains site-level
predictors (i.e., predictors not derived from raster layers), constant
("dummy") raster layers filled with zeros are appended to make the output
compatible with terra::predict.
## Not Run r1 <- rast(matrix(rnorm(25^2), nrow = 25)) r1_s <- kernel_scale.raster(r1, sigma = 4, kernel = 'gaussian') plot(c(r1, r1_s))## Not Run r1 <- rast(matrix(rnorm(25^2), nrow = 25)) r1_s <- kernel_scale.raster(r1, sigma = 4, kernel = 'gaussian') plot(c(r1, r1_s))
Raster data for use with vignette example
'landscape_rast'
A spatRaster object with three surfaces:
–> A binary landscape surface with low autocorrelation
–> A continuous landscape surface with low autocorrelation
–> A continuous landscape surface with high autocorrelation
land_rast <- terra::rast(system.file("extdata", "landscape.tif", package = 'multiScaleR'))land_rast <- terra::rast(system.file("extdata", "landscape.tif", package = 'multiScaleR'))
Example count data to be used vignette document example
data(landscape_counts)data(landscape_counts)
A data frame with 100 rows and 2 columns. Data were simulated from a Poisson distribution with an intercept of 0.25; land1 effect = -0.5; site effect = 0.3; land2 effect = 0.7. True simulated Gaussian scale effects (sigma): land1 = 250; land2 = 500. For use with package vignette.
–> Simulated counts at spatial locations
–> A habitat variable measured at the site
These helpers define named model covariates as transformations of one or
more source raster layers. They are optional; if omitted,
kernel_prep preserves the default behavior where each raster
layer becomes one optimized kernel-weighted covariate with the same name.
Use msr_vars() to collect one or more kernel_var() or
landscape_var() specifications into a single object that is passed to
the scale_vars argument of kernel_prep and
kernel_scale.raster.
msr_vars(...) kernel_var(source) landscape_var(source, metric, radius = NULL, base = exp(1), classes_max = NULL)msr_vars(...) kernel_var(source) landscape_var(source, metric, radius = NULL, base = exp(1), classes_max = NULL)
... |
Named |
source |
Character scalar. Name of the source raster layer in
|
metric |
Character scalar. Landscape metric to compute within the
circular buffer for |
radius |
Optional positive numeric. Fixed buffer radius in the same
units as the projection. When |
base |
Positive numeric (not equal to 1). Logarithm base used when
computing diversity metrics ( |
classes_max |
Optional positive numeric. Maximum number of possible
patch types in the landscape, used only for relative patch richness
( |
Kernel vs. landscape covariates
kernel_var(source) defines a kernel-weighted mean of the continuous
raster values within a circular neighborhood. The neighborhood radius (sigma)
is optimized by multiScale_optim.
landscape_var(source, metric) derives a landscape ecology metric from
a categorical (or thresholded) raster layer within a circular buffer. The
buffer radius can be fixed or optimized.
Supported landscape metrics
Composition metrics (require a categorical raster):
"shdi"Shannon diversity index.
"shei"Shannon evenness index.
"sidi"Simpson diversity index.
"siei"Simpson evenness index.
"msidi"Modified Simpson diversity index.
"msiei"Modified Simpson evenness index.
"pr"Patch richness (number of distinct classes).
"prd"Patch richness density (pr per 100 ha).
"rpr"Relative patch richness (pr / classes_max * 100).
"ta"Total area of the buffer (ha).
Edge metrics (require adjacency information; cell IDs are cached internally):
"ed"Edge density (m/ha).
"te"Total edge length (m).
"lsi"Landscape shape index.
Adjacency metrics:
"ai"Aggregation index (%).
"pladj"Proportion of like adjacencies (%).
"contag"Contagion index (%).
Mixing covariate types
Multiple covariate types can be combined in one msr_vars() call. For
example, a kernel-weighted mean of forest cover and a fixed-radius edge
density metric can both be defined and passed together to
kernel_prep:
vars <- msr_vars(
forest_mean = kernel_var("forest"),
forest_ed500 = landscape_var("forest", metric = "ed", radius = 500)
)
kernel_inputs <- kernel_prep(pts, rasters, max_D = 1000,
scale_vars = vars)
Covariates with a fixed radius are computed once and not re-evaluated
during optimization, which can meaningfully reduce computation time.
msr_vars()A data frame of class "multiScaleR_vars"
with one row per covariate. Columns are covariate (the name
assigned in ...), source (source raster layer name),
type ("kernel" or "landscape"), metric
(landscape metric or NA), radius (fixed radius or
NA when optimized), optimize (logical indicating whether
the scale is estimated), base, and classes_max.
kernel_var()An internal list of class
"multiScaleR_var" representing a kernel-weighted mean covariate.
Pass one or more of these inside msr_vars().
landscape_var()An internal list of class
"multiScaleR_var" representing a landscape metric covariate.
Pass one or more of these inside msr_vars().
kernel_prep, multiScale_optim,
kernel_scale.raster
## Kernel-weighted mean only (equivalent to default behavior) vars <- msr_vars( forest_prop = kernel_var("forest") ) ## Combining kernel and landscape covariates vars <- msr_vars( forest_prop = kernel_var("forest"), forest_ed = landscape_var("forest", metric = "ed"), cover_shdi_500 = landscape_var("landcover", metric = "shdi", radius = 500) ) ## landscape_var with natural-log Shannon diversity (the default) landscape_var("landcover", metric = "shdi") ## landscape_var with log2 Shannon diversity and a fixed 250 m radius landscape_var("landcover", metric = "shdi", radius = 250, base = 2) ## Define a single optimized kernel-weighted mean covariate kv <- kernel_var("forest")## Kernel-weighted mean only (equivalent to default behavior) vars <- msr_vars( forest_prop = kernel_var("forest") ) ## Combining kernel and landscape covariates vars <- msr_vars( forest_prop = kernel_var("forest"), forest_ed = landscape_var("forest", metric = "ed"), cover_shdi_500 = landscape_var("landcover", metric = "shdi", radius = 500) ) ## landscape_var with natural-log Shannon diversity (the default) landscape_var("landcover", metric = "shdi") ## landscape_var with log2 Shannon diversity and a fixed 250 m radius landscape_var("landcover", metric = "shdi", radius = 250, base = 2) ## Define a single optimized kernel-weighted mean covariate kv <- kernel_var("forest")
Identifies the kernel scale of effect (sigma) for each raster
covariate by maximizing the log-likelihood of a fitted statistical model.
Repeatedly replaces kernel-weighted covariate values at different scales and
refits the model, using optim (or optimParallel for parallel
execution) with the L-BFGS-B algorithm.
multiScale_optim( fitted_mod, kernel_inputs, join_by = NULL, par = NULL, start_strategy = c("single", "screen"), screen_n_sigma = 5, screen_n_jitter = 6, screen_maxit = 8, screen_jitter_sd = 0.5, n_cores = NULL, PSOCK = FALSE, verbose = TRUE, refit_fn = NULL )multiScale_optim( fitted_mod, kernel_inputs, join_by = NULL, par = NULL, start_strategy = c("single", "screen"), screen_n_sigma = 5, screen_n_jitter = 6, screen_maxit = 8, screen_jitter_sd = 0.5, n_cores = NULL, PSOCK = FALSE, verbose = TRUE, refit_fn = NULL )
fitted_mod |
A fitted model object whose covariates include the
kernel-weighted variables defined in |
kernel_inputs |
A list of class |
join_by |
Default: |
par |
Optional numeric vector of starting values for the optimizer.
Values must be divided by |
start_strategy |
Character. How to choose starting values when
|
screen_n_sigma |
Positive integer. Number of log-spaced sigma values
evaluated per covariate during the prescreen when
|
screen_n_jitter |
Non-negative integer. Number of jittered candidate
starts evaluated with short screening optimizations after the marginal
prescreen. Default: |
screen_maxit |
Positive integer. Maximum iterations used for each short
screening optimization. Default: |
screen_jitter_sd |
Non-negative numeric scalar. Standard deviation of
multiplicative sigma jitter on the log scale during screening. Default:
|
n_cores |
Positive integer. Number of cores for parallel optimization
via |
PSOCK |
Logical. On Windows, a PSOCK cluster is always used. On Unix,
a FORK cluster is used by default (faster). Set |
verbose |
Logical. Print optimization status and warnings to the
console. Default: |
refit_fn |
Optional function for refitting |
Optimization approach
The optimizer uses the L-BFGS-B algorithm (bounded quasi-Newton) to
maximize the log-likelihood over sigma (and shape for "expow") while
holding all regression coefficients at their fitted values for each candidate
scale. Standard errors are Hessian-based approximations from the outer
optimization. Summary methods report profile-likelihood confidence intervals
for sigma when profile_sigma has been run on the object;
otherwise they fall back to Hessian-based intervals.
Parallel optimization
To ensure that fitted model function calls are properly serialized for
parallel workers, fit models using fully namespace-qualified functions. For
example, fit a negative binomial model as
fitted_mod <- MASS::glm.nb(y ~ x, data = df) rather than loading the
namespace implicitly.
Joining unmarked multi-year data
When using unmarked models where sites are surveyed across multiple
years but the spatial covariates are constant across years, provide a
join_by data frame to match each site's kernel-weighted covariate
value to its observations. The column name in join_by must match a
column in the data used to fit the unmarked model.
Custom refit functions
By default, multiScale_optim() refits the model using
stats::update() (standard models) or the unmarked equivalent.
When the default path is insufficient, supply refit_fn. The function
must accept arguments model, data, and context, and
return a fitted model whose log-likelihood can be extracted by
stats::logLik() or insight::get_loglikelihood().
A minimal custom refit function:
refit_fn <- function(model, data, context) {
stats::update(model, data = data)
}
For models requiring explicit call reconstruction:
refit_fn <- function(model, data, context) {
call <- model$call
call$data <- quote(data)
eval(call, envir = list(data = data), enclos = parent.frame())
}
With PSOCK clusters, refit_fn must serialize cleanly. Prefer
namespace-qualified calls (e.g., stats::update()) and avoid
closures over large local objects.
Wrapper model objects
Some packages return wrapper objects around another fitted model (e.g.,
amt::fit_clogit() wraps a survival::clogit() fit in its
model component). When possible, multiScale_optim() unwraps
these automatically. For amt::fit_clogit(), pass model = TRUE
when fitting so the nested clogit model retains its model frame.
When 'start_strategy = "screen"' and 'par' is left as 'NULL', 'multiScale_optim()' first scouts the sigma space with one-dimensional log-spaced scans, then optionally tests a few jittered starts using short, Hessian-free optimizations. The marginal sigma scans are serial, but the short screening attempts use 'n_cores' when parallel optimization is requested. These screening steps are used only to choose one starting vector for the single full optimization. On Windows, those parallel screening attempts use PSOCK workers, so the same PSOCK serialization guidance applies as in the full optimization. For reproducible screened starts, call 'set.seed()' before 'multiScale_optim()'.
A list of class "multiScaleR" containing:
scale_estData frame with one row per optimized covariate and
two columns: Mean (optimized sigma on the original projection
scale) and SE (Hessian-based standard error). Row names are
covariate names.
shape_estData frame of the same structure as
scale_est for the shape parameter, or NULL when
kernel != "expow".
optim_resultsThe raw list returned by stats::optim
or optimParallel::optimParallel, including par,
value (negative log-likelihood), hessian, and convergence
codes. Also contains par_unscale (sigma values back on the
projection scale) and hessian_unscale.
opt_modThe refitted model at the optimized sigma values.
This is the model object to use for inference, prediction, and
plot_marginal_effects.
fitted_mod_originalThe original fitted_mod passed
by the user, stored for reference.
min_DNumeric. Lower bound for sigma during optimization.
max_DNumeric. Upper bound for sigma during optimization.
kernel_inputsThe kernel_inputs list (minus
min_D and max_D, which are stored separately).
scl_paramsNamed list with mean and sd vectors
for each covariate — the centering and scaling parameters from the
optimized kernel data. Used by kernel_scale.raster when
scale_center = TRUE.
join_byThe join_by data frame, or NULL.
opt_contextInternal optimization context object storing the
model-class-specific refit logic. Retained for use by
profile_sigma.
profile_scale_estProfile-likelihood CI data frame, or
NULL until profile_sigma has been run.
diagnosticsList of diagnostic objects; see
diagnostics.
next_runList of recommended values for a follow-up fit.
Includes max_D for the next kernel_prep call,
optimized start_sigma values in map units, optional
start_shape values for kernel = "expow", and
start_par on the internal scaled parameter space expected by
multiScale_optim.
warn_messageInteger vector of triggered warning codes: 1 = max-distance, 2 = sigma precision, 3 = shape precision.
callThe matched call.
set.seed(555) points <- vect(cbind(c(5,7,9,11,13), c(13,11,9,7,5))) mat_list <- list(r1 = rast(matrix(rnorm(20^2), nrow = 20)), r2 = rast(matrix(rnorm(20^2), nrow = 20))) rast_stack <- rast(mat_list) kernel_inputs <- kernel_prep(pts = points, raster_stack = rast_stack, max_D = 5, kernel = 'gaussian', sigma = NULL) ## Example response data y <- rnorm(5) ## Create data frame with raster variables dat <- data.frame(y = y, kernel_inputs$kernel_dat) mod1 <- glm(y ~ r1 + r2, data = dat) ## NOTE: This code is only for demonstration ## Optimization results will have no meaning opt_mod <- multiScale_optim(fitted_mod = mod1, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) ## Using package data data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) ## Optimize scale opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Summary of fitted model summary(opt) ## 'True' parameter values data were simulated from: # hab scale = 75 # Intercept = 0.5, # hab slope estimate = 0.75 ## Plot and visualize kernel density plot(opt) ## Apply optimized kernel to the environmental raster opt_hab <- kernel_scale.raster(hab, multiScaleR = opt) plot(c(hab, opt_hab)) ## Project model; scale & center opt_hab.s_c <- kernel_scale.raster(raster_stack = hab, multiScaleR = opt, scale_center = TRUE) mod_pred <- predict(opt_hab.s_c, opt$opt_mod, type = 'response') plot(mod_pred) ## Custom refit hook for model classes that need explicit control. ## This example still uses glm(), but the same pattern can be used for ## classes whose default update path is not sufficient. refit_glm <- function(model, data, context) { stats::update(model, data = data) } opt_custom <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs, refit_fn = refit_glm)set.seed(555) points <- vect(cbind(c(5,7,9,11,13), c(13,11,9,7,5))) mat_list <- list(r1 = rast(matrix(rnorm(20^2), nrow = 20)), r2 = rast(matrix(rnorm(20^2), nrow = 20))) rast_stack <- rast(mat_list) kernel_inputs <- kernel_prep(pts = points, raster_stack = rast_stack, max_D = 5, kernel = 'gaussian', sigma = NULL) ## Example response data y <- rnorm(5) ## Create data frame with raster variables dat <- data.frame(y = y, kernel_inputs$kernel_dat) mod1 <- glm(y ~ r1 + r2, data = dat) ## NOTE: This code is only for demonstration ## Optimization results will have no meaning opt_mod <- multiScale_optim(fitted_mod = mod1, kernel_inputs = kernel_inputs, par = NULL, n_cores = NULL) ## Using package data data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) ## Optimize scale opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Summary of fitted model summary(opt) ## 'True' parameter values data were simulated from: # hab scale = 75 # Intercept = 0.5, # hab slope estimate = 0.75 ## Plot and visualize kernel density plot(opt) ## Apply optimized kernel to the environmental raster opt_hab <- kernel_scale.raster(hab, multiScaleR = opt) plot(c(hab, opt_hab)) ## Project model; scale & center opt_hab.s_c <- kernel_scale.raster(raster_stack = hab, multiScaleR = opt, scale_center = TRUE) mod_pred <- predict(opt_hab.s_c, opt$opt_mod, type = 'response') plot(mod_pred) ## Custom refit hook for model classes that need explicit control. ## This example still uses glm(), but the same pattern can be used for ## classes whose default update path is not sufficient. refit_glm <- function(model, data, context) { stats::update(model, data = data) } opt_custom <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs, refit_fn = refit_glm)
Visualizes the weight (density) of a kernel function as a
function of distance, without requiring a fitted multiScaleR object.
Useful for exploring how different kernel types and sigma values translate
into spatial influence zones before running multiScale_optim.
To plot the kernel from a fitted model, use plot(opt) instead.
plot_kernel( prob = 0.9, sigma, beta = NULL, kernel, scale_dist = TRUE, add_label = TRUE, ... )plot_kernel( prob = 0.9, sigma, beta = NULL, kernel, scale_dist = TRUE, add_label = TRUE, ... )
prob |
Numeric between 0 and 1 (exclusive). Cumulative density threshold
used to mark the effective distance on the plot. Default: |
sigma |
Positive numeric. The scale parameter of the kernel, in the
same units as the projection used with |
beta |
Positive numeric. Shape parameter for the exponential power
kernel ( |
kernel |
Character. The kernel function to visualize. One of:
|
scale_dist |
Logical. If |
add_label |
Logical. If |
... |
Not used. |
The x-axis range is set to cover 99.9% of the cumulative density so that
the tails of the distribution are visible. The prob marker is rounded
to the nearest 10 distance units for display purposes.
For fitted-model kernel plots (with confidence intervals), use
plot(multiScaleR_object) instead.
A ggplot2 object showing kernel weight (y-axis) as a function
of distance (x-axis). The object is printed as a side effect and returned
invisibly.
#' ## General use of plot method plot_kernel(prob = 0.95, sigma = 100, kernel = 'gaussian') plot_kernel(prob = 0.95, sigma = 100, kernel = 'exp') plot_kernel(prob = 0.95, sigma = 100, kernel = 'fixed') plot_kernel(prob = 0.95, sigma = 100, beta = 2.5, kernel = 'expow')#' ## General use of plot method plot_kernel(prob = 0.95, sigma = 100, kernel = 'gaussian') plot_kernel(prob = 0.95, sigma = 100, kernel = 'exp') plot_kernel(prob = 0.95, sigma = 100, kernel = 'fixed') plot_kernel(prob = 0.95, sigma = 100, beta = 2.5, kernel = 'expow')
Generates marginal effect plots with 95% confidence intervals for each
covariate in the optimized model stored within a multiScaleR object.
Each panel sweeps one covariate across its observed range while holding all
others at their sample mean.
plot_marginal_effects( x, ylab = "Estimated response", length.out = 100, type = "state", link = FALSE )plot_marginal_effects( x, ylab = "Estimated response", length.out = 100, type = "state", link = FALSE )
x |
A fitted |
ylab |
Character scalar. Y-axis label for all marginal effect panels.
Default: |
length.out |
Positive integer. Number of equally spaced points at which
to evaluate the marginal effect curve for each covariate. Default: |
type |
Character. Prediction type for |
link |
Logical. If |
Marginal effects are computed by constructing a newdata data frame
where the focal covariate sweeps from its minimum to maximum observed value
and all other predictors are held at their sample mean. For kernel-scaled
covariates, whose scaled mean is 0, this is equivalent to holding them at
their average landscape value.
For glm and most GLM-family models, predictions are transformed via
the inverse link function automatically (e.g., exp() for Poisson
log-link models). Confidence intervals are constructed from
predict(..., se.fit = TRUE) using .
For unmarked models, predict(mod, newdata, type = type) is
used, and lower/upper bounds come from the lower and upper
columns of the returned data frame.
For models with interaction terms, a message is printed explaining that each panel represents a conditional slice (at the mean of the interacting variable) rather than the full interaction surface.
HLfit (spaMM) and zeroinfl (pscl) models are handled with
class-specific prediction calls.
When x$opt_mod is a wrapper object around another fitted model,
plot_marginal_effects() uses the nested analysis model for predictor
discovery, data recovery, and prediction when possible.
A named list of ggplot2 objects, one per covariate. Plots are
printed as a side effect and the list is returned invisibly. Each plot
shows:
The predicted response (solid line) across the observed covariate range, with x-axis values back-transformed to the original (unscaled) units.
A shaded ribbon for the 95% confidence interval (when available
from the model's predict method).
A subtitle noting any polynomial (I(x^2)) or interaction
terms that involve the covariate.
data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Default marginal effect plot on the response scale plot_marginal_effects(opt) ## Custom y-axis label plot_marginal_effects(opt, ylab = "Predicted abundance") ## Finer evaluation grid plot_marginal_effects(opt, length.out = 200)data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Default marginal effect plot on the response scale plot_marginal_effects(opt) ## Custom y-axis label plot_marginal_effects(opt, ylab = "Predicted abundance") ## Finer evaluation grid plot_marginal_effects(opt, length.out = 200)
Plots the optimized kernel weight distribution for each spatial covariate in
a fitted multiScaleR object, with optional annotations showing the
effective scale of effect and its 95% confidence interval.
## S3 method for class 'multiScaleR' plot(x, ...)## S3 method for class 'multiScaleR' plot(x, ...)
x |
A fitted |
... |
Optional named arguments to customize the plot:
|
A list of ggplot2 objects (one per covariate with a
non-NaN standard error), returned invisibly. Plots are printed as
a side effect.
## Using package data data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) ## Optimize scale opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) plot(opt) plot(opt, prob = 0.95) plot(opt, scale_dist = FALSE) plot(opt, scale_dist = TRUE, add_label = FALSE)## Using package data data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) ## Optimize scale opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) plot(opt) plot(opt, prob = 0.95) plot(opt, scale_dist = FALSE) plot(opt, scale_dist = TRUE, add_label = FALSE)
Plots the profiled log-likelihood or AICc across sigma values for each spatial covariate. The optimized sigma is marked with a vertical red line.
## S3 method for class 'sigma_profile' plot(x, ...)## S3 method for class 'sigma_profile' plot(x, ...)
x |
An object of class |
... |
Additional arguments (not currently used). |
A list of ggplot objects (one per covariate), returned
invisibly. Plots are printed as a side effect.
Print method for objects of class multiScaleR.
## S3 method for class 'multiScaleR' print(x, ...)## S3 method for class 'multiScaleR' print(x, ...)
x |
A |
... |
Ignored |
Invisibly returns the input multiScaleR object
Print method for objects of class multiScaleR_data.
## S3 method for class 'multiScaleR_data' print(x, ...)## S3 method for class 'multiScaleR_data' print(x, ...)
x |
A |
... |
Ignored |
Invisibly returns the input multiScaleR_data object
Print method for objects of class summary_multiScaleR.
## S3 method for class 'summary_multiScaleR' print(x, ...)## S3 method for class 'summary_multiScaleR' print(x, ...)
x |
A |
... |
Ignored |
Invisibly returns the input summary_multiScaleR object
Evaluates the model log-likelihood and AICc at a series of sigma values spanning the optimization range for each spatial covariate. This provides a diagnostic view of the likelihood surface and helps assess whether the optimized sigma is at a clear minimum or on a flat plateau.
profile_sigma( x, n_pts = 10, metric = c("AICc", "LL"), verbose = TRUE, n_cores = NULL, spacing = c("log", "linear"), sigma_values = NULL, sigma_range = NULL )profile_sigma( x, n_pts = 10, metric = c("AICc", "LL"), verbose = TRUE, n_cores = NULL, spacing = c("log", "linear"), sigma_values = NULL, sigma_range = NULL )
x |
A fitted |
n_pts |
Positive integer (>= 3). Number of sigma values to evaluate
for each covariate along the profile grid. More points give a smoother
profile at the cost of additional model refits. Default: |
metric |
Character. Metric to display on the y-axis of the profile.
One of |
verbose |
Logical. Print per-covariate progress messages. Default:
|
n_cores |
Optional positive integer. Number of CPU cores to use when
profiling covariates in parallel. Parallel profiling is applied across
covariates with a PSOCK cluster. Default: |
spacing |
Character. Spacing of the automatically generated sigma grid.
|
sigma_values |
Optional positive numeric vector of sigma values (in the
original projection units) to evaluate directly. When supplied,
|
sigma_range |
Optional positive numeric vector of length 2 giving the
minimum and maximum sigma values (in projection units) for the generated
profile grid. Defaults to the optimization range stored in |
For each spatial covariate, sigma is varied across a sequence of candidate
values, while all other sigma values are held at their optimized values. By
default this is a log-spaced sequence from the minimum to maximum distance
considered during optimization. Users can instead request linear spacing with
spacing = "linear" or supply exact values with sigma_values.
At each evaluation point the model is refit and the log-likelihood extracted.
AICc is computed from the log-likelihood, the number of regression parameters
(including sigma), and the number of observations.
Log-spacing concentrates evaluation points at smaller sigma values where the likelihood surface often changes more rapidly, and spaces them out at larger sigma values where the surface tends to be flatter.
Linear spacing can be useful when the profile needs equal resolution across a
specific sigma interval. User-supplied sigma_values are sorted and
duplicate values are removed before profiling to avoid redundant refits.
A list of class sigma_profile containing:
A data frame with columns variable, sigma,
LL, and AICc.
A named numeric vector of the optimized sigma for each covariate.
The metric used for profiling.
The profile grid type: "log", "linear", or
"custom".
The sigma values evaluated for each covariate.
plot.sigma_profile, multiScale_optim
data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Profile sigma prof <- profile_sigma(opt) plot(prof) ## More evaluation points prof <- profile_sigma(opt, n_pts = 20) plot(prof) ## Linearly spaced values between explicit limits prof <- profile_sigma(opt, n_pts = 8, spacing = "linear", sigma_range = c(25, 250)) plot(prof) ## User-supplied sigma values prof <- profile_sigma(opt, sigma_values = c(25, 50, 100, 150, 250)) plot(prof) ## Profile log-likelihood instead of AICc prof <- profile_sigma(opt, metric = "LL") plot(prof)data('pts') data('count_data') hab <- terra::rast(system.file('extdata', 'hab.tif', package = 'multiScaleR')) kernel_inputs <- kernel_prep(pts = pts, raster_stack = hab, max_D = 250, kernel = 'gaussian') mod <- glm(y ~ hab, family = poisson, data = count_data) opt <- multiScale_optim(fitted_mod = mod, kernel_inputs = kernel_inputs) ## Profile sigma prof <- profile_sigma(opt) plot(prof) ## More evaluation points prof <- profile_sigma(opt, n_pts = 20) plot(prof) ## Linearly spaced values between explicit limits prof <- profile_sigma(opt, n_pts = 8, spacing = "linear", sigma_range = c(25, 250)) plot(prof) ## User-supplied sigma values prof <- profile_sigma(opt, sigma_values = c(25, 50, 100, 150, 250)) plot(prof) ## Profile log-likelihood instead of AICc prof <- profile_sigma(opt, metric = "LL") plot(prof)
Example point file for optimizing scales of effect
data(pts)data(pts)
An sf class point object:
–> spatial location of points
Generates simulated response data with known kernel-weighted
landscape covariates at a controlled scale of effect. Useful for testing and
demonstrating multiScale_optim with data where the true
parameters are known.
sim_dat( alpha = 1, beta = NULL, kernel = c("gaussian", "exp", "expow", "fixed"), type = c("count", "count_nb", "occ", "gaussian"), StDev = 0.5, n_points = 50, min_D = NULL, raster_stack = NULL, sigma = NULL, shape = NULL, max_D = NULL, user_seed = NULL, ... )sim_dat( alpha = 1, beta = NULL, kernel = c("gaussian", "exp", "expow", "fixed"), type = c("count", "count_nb", "occ", "gaussian"), StDev = 0.5, n_points = 50, min_D = NULL, raster_stack = NULL, sigma = NULL, shape = NULL, max_D = NULL, user_seed = NULL, ... )
alpha |
Numeric scalar. Intercept term for the linear predictor.
Default: |
beta |
Numeric vector of slope coefficients, one per raster layer in
|
kernel |
Character. Kernel function used to weight raster values by
distance when generating the true covariate values. One of
|
type |
Character. Distribution of the simulated response variable. One of:
|
StDev |
Positive numeric. Dispersion parameter for |
n_points |
Positive integer or a |
min_D |
Positive numeric. Minimum inter-point spacing on the hexagonal
grid. If |
raster_stack |
A |
sigma |
Positive numeric vector. True kernel scale parameters, one per
raster layer. These are the values that |
shape |
Positive numeric vector. Shape parameters for the exponential
power kernel ( |
max_D |
Positive numeric. Maximum buffer radius for
|
user_seed |
Optional integer seed for reproducibility. Default:
|
... |
Additional arguments. Not currently used. |
Points are distributed on a hexagonal grid across the interior of the raster
extent (85% of the range in each direction to avoid edge effects), then
randomly subsampled to n_points. The min_D spacing controls
the grid resolution; if the grid produces fewer points than n_points,
min_D is reduced iteratively by 3% until enough points are generated.
Kernel-weighted covariate values are computed using
kernel_prep at the specified sigma (and shape)
values. These represent the "true" covariate values that the optimization
should recover.
A named list with three elements:
obsNumeric vector of length n_points containing the
simulated response values.
dfData frame with n_points rows and one column per
raster layer plus a y column. The raster columns contain the
true kernel-weighted mean values, scaled to zero mean and unit variance.
This data frame can be used to fit a model for
multiScale_optim.
ptsAn sf POINT object with n_points rows.
An obs column is appended containing the simulated response
values. Pass this to kernel_prep as the pts
argument.
rs <- sim_rast() rs <- terra::subset(rs, c(1,3)) s_dat <- sim_dat(alpha = 0.5, beta = c(0.75,-0.75), kernel = 'gaussian', sigma = c(75, 150), type = 'count', raster_stack = rs, max_D = 400) plot(s_dat$df$y ~ s_dat$df$bin1) plot(s_dat$df$y ~ s_dat$df$cont1)rs <- sim_rast() rs <- terra::subset(rs, c(1,3)) s_dat <- sim_dat(alpha = 0.5, beta = c(0.75,-0.75), kernel = 'gaussian', sigma = c(75, 150), type = 'count', raster_stack = rs, max_D = 400) plot(s_dat$df$y ~ s_dat$df$bin1) plot(s_dat$df$y ~ s_dat$df$cont1)
Generates simulated replicated detection/non-detection or count
data formatted for use with the unmarked package, with known
kernel-weighted landscape covariates at a controlled scale of effect.
sim_dat_unmarked( alpha = 1, beta = NULL, kernel = c("gaussian", "exp", "expow", "fixed"), type = c("count", "count_nb", "occ"), StDev = 0.5, n_points = 50, n_surv = 3, det = 0.5, min_D = NULL, raster_stack = NULL, sigma = NULL, shape = NULL, max_D = NULL, user_seed = NULL, ... )sim_dat_unmarked( alpha = 1, beta = NULL, kernel = c("gaussian", "exp", "expow", "fixed"), type = c("count", "count_nb", "occ"), StDev = 0.5, n_points = 50, n_surv = 3, det = 0.5, min_D = NULL, raster_stack = NULL, sigma = NULL, shape = NULL, max_D = NULL, user_seed = NULL, ... )
alpha |
Numeric scalar. Intercept for the abundance/occupancy linear
predictor. Default: |
beta |
Numeric vector of slope coefficients, one per raster layer in
|
kernel |
Character. Kernel function used to weight raster values by
distance. One of |
type |
Character. Response type to simulate. One of:
|
StDev |
Positive numeric. Dispersion (size) parameter for
|
n_points |
Positive integer. Number of spatial sample sites. Points are
placed on a hexagonal grid and randomly subsampled. Default: |
n_surv |
Positive integer. Number of repeated surveys per site, forming
the columns of the returned observation matrix. Default: |
det |
Numeric between 0 and 1. Per-survey detection probability applied
via binomial thinning of the true abundance/occupancy. Default: |
min_D |
Positive numeric. Minimum inter-point spacing on the hexagonal
grid. If |
raster_stack |
A |
sigma |
Positive numeric vector. True kernel scale parameters, one per raster layer. Must be in the same units as the raster projection. |
shape |
Positive numeric vector. Shape parameters for
|
max_D |
Positive numeric. Maximum buffer radius for
|
user_seed |
Optional integer seed for reproducibility. Default:
|
... |
Additional arguments. Not currently used. |
Sites are distributed on a hexagonal grid across the interior of the raster
extent (85% of the x/y range) and then randomly subsampled. The true
abundance or occupancy at each site is a function of the kernel-weighted
landscape covariates at the specified scale. Repeated surveys introduce
imperfect detection controlled by det.
Use the returned y matrix and df (as siteCovs) to
construct an unmarkedFrame, fit a model, and then pass both the
fitted model and fresh kernel_prep output to
multiScale_optim.
A named list with three elements:
yInteger matrix of dimensions n_points x n_surv
containing the simulated replicated observations. Pass as the y
argument when constructing an unmarkedFrame.
dfData frame with n_points rows. The y column
contains the simulated true abundance/occupancy (before detection
thinning). Remaining columns are the true kernel-weighted covariate
values, scaled to zero mean and unit variance.
ptsAn sf POINT object with n_points rows.
An obs column is appended containing the true
abundance/occupancy values. Pass to kernel_prep as
pts.
rs <- sim_rast(user_seed = 123) rs <- terra::subset(rs, c(1,3)) s_dat <- sim_dat_unmarked(alpha = 1, beta = c(0.75,-0.75), kernel = 'gaussian', sigma = c(75, 150), n_points = 75, n_surv = 5, det = 0.5, type = 'count', raster_stack = rs, max_D = 550, user_seed = 123) plot(s_dat$df$y ~ s_dat$df$bin1) plot(s_dat$df$y ~ s_dat$df$cont1) ## unmarked analysis library(unmarked) kernel_inputs <- kernel_prep(pts = s_dat$pts, raster_stack = rs, max_D = 550, kernel = 'gaus') umf <- unmarkedFramePCount(y = s_dat$y, siteCovs = kernel_inputs$kernel_dat) ## Base unmarked model mod0 <- pcount(~1 ~bin1 + cont1, data = umf, K = 100) ## `multiscale_optim` opt1 <- multiScale_optim(fitted_mod = mod0, kernel_inputs = kernel_inputs) summary(opt1)rs <- sim_rast(user_seed = 123) rs <- terra::subset(rs, c(1,3)) s_dat <- sim_dat_unmarked(alpha = 1, beta = c(0.75,-0.75), kernel = 'gaussian', sigma = c(75, 150), n_points = 75, n_surv = 5, det = 0.5, type = 'count', raster_stack = rs, max_D = 550, user_seed = 123) plot(s_dat$df$y ~ s_dat$df$bin1) plot(s_dat$df$y ~ s_dat$df$cont1) ## unmarked analysis library(unmarked) kernel_inputs <- kernel_prep(pts = s_dat$pts, raster_stack = rs, max_D = 550, kernel = 'gaus') umf <- unmarkedFramePCount(y = s_dat$y, siteCovs = kernel_inputs$kernel_dat) ## Base unmarked model mod0 <- pcount(~1 ~bin1 + cont1, data = umf, K = 100) ## `multiscale_optim` opt1 <- multiScale_optim(fitted_mod = mod0, kernel_inputs = kernel_inputs) summary(opt1)
Creates a SpatRaster stack of four simulated landscape surfaces for
use with sim_dat and sim_dat_unmarked. The stack
contains two binary (0/1) and two continuous (0–1) surfaces with differing
spatial autocorrelation ranges, allowing simulation of multiscale ecological
processes.
sim_rast( dim = 100, resolution = 10, autocorr_range1 = NULL, autocorr_range2 = NULL, sill = 10, plot = FALSE, user_seed = NULL, ... )sim_rast( dim = 100, resolution = 10, autocorr_range1 = NULL, autocorr_range2 = NULL, sill = 10, plot = FALSE, user_seed = NULL, ... )
dim |
Positive integer. Number of cells on each side of the square
raster. The output will be a |
resolution |
Positive numeric. Cell size (edge length) in map units.
The raster extent will be |
autocorr_range1 |
Optional positive numeric. Spatial autocorrelation
range (in map cells) for surfaces 1 and 3 ( |
autocorr_range2 |
Optional positive numeric. Autocorrelation range for
surfaces 2 and 4 ( |
sill |
Positive numeric. Partial sill (variance) of the Gaussian
random field used for all four surfaces. Default: |
plot |
Logical. If |
user_seed |
Optional integer seed for reproducibility. Different
transformations of |
... |
Additional arguments. Not currently used. |
Each surface is generated as a Gaussian random field using a fast Fourier transform (FFT) circulant embedding approach with an exponential covariance function. The underlying continuous fields are normalized to [0, 1] before thresholding (binary surfaces) or returning directly (continuous surfaces).
The two autocorrelation ranges allow simulation of covariates that operate at different spatial scales — a common scenario in landscape ecology where some resources are patchily distributed at fine scales and others vary broadly across the study area.
When user_seed is provided, independent but reproducible seeds are
derived for each surface as multiples of user_seed.
A SpatRaster with four named layers:
bin1Binary (0/1) surface with fine-scale autocorrelation
(autocorr_range1). Values of 1 where the underlying continuous
field exceeds its 55th percentile.
bin2Binary (0/1) surface with broad-scale autocorrelation
(autocorr_range2). Values of 1 where the underlying continuous
field falls below its 40th percentile.
cont1Continuous surface rescaled to [0, 1] with fine-scale
autocorrelation (75% of autocorr_range1).
cont2Continuous surface rescaled to [0, 1] with broad-scale
autocorrelation (125% of autocorr_range2).
All layers span the same spatial extent:
[0, dim * resolution] x [0, dim * resolution].
Summarizes output from multiScale_optim.
## S3 method for class 'multiScaleR' summary(object, profile = FALSE, ...)## S3 method for class 'multiScaleR' summary(object, profile = FALSE, ...)
object |
An object of class |
profile |
Logical. If |
... |
Optional arguments passed to the method (e.g., |
An object of class summary_multiScaleR. Confidence limits for ‘sigma' default to the package’s existing Wald-style limits. If profile = TRUE, profile likelihood is used when feasible; if profiling fails, the summary falls back to Wald-style limits.
Example point file for use with vignette document example
data(surv_pts)data(surv_pts)
An sf class point object:
–> 100 spatial point locations