confusius.validation
Data validation utilities for confusius.
Modules:
- iq – IQ data validation utilities.
- mask – Mask validation utilities.
- time_series – Time series validation utilities.
Functions:
- validate_iq – Validate that a DataArray contains valid IQ data.
- validate_labels – Validate that a label map matches data spatial dimensions and coordinates.
- validate_mask – Validate that a mask matches data spatial dimensions and coordinates.
- validate_time_series – Validate time series for signal processing operations.
validate_iq
validate_iq(
iq: DataArray, require_attrs: bool = True
) -> None
Validate that a DataArray contains valid IQ data.
This function performs validation of an IQ DataArray to ensure it meets all requirements for processing with confusius functions. Validation checks include:
- Dimensions: The IQ DataArray must have exactly 4 dimensions in the order (time, z, y, x).
- Coordinates: All dimensions must have corresponding coordinates.
- Data type: The data must be complex-valued (complex64 or complex128).
- Attributes (optional): If require_attrs is True, the DataArray must have the following attributes:
  - compound_sampling_frequency: Volume acquisition rate in Hz.
  - transmit_frequency: Ultrasound probe central frequency in Hz.
  - sound_velocity: Speed of sound in the imaged medium in m/s.
Parameters:
- iq (DataArray) – Input DataArray to validate. Must have dimensions (time, z, y, x) and the required structure and attributes.
- require_attrs (bool, default: True) – Whether to validate that all required attributes (compound_sampling_frequency, transmit_frequency, sound_velocity) are present in the DataArray attributes.
Raises:
- ValueError – If the DataArray does not have dimensions ("time", "z", "y", "x") or corresponding coordinates, or if required attributes are missing (when require_attrs=True).
- TypeError – If the IQ data is not complex-valued.
Examples:
Validate a properly formatted IQ DataArray:
>>> import numpy as np
>>> import xarray as xr
>>> from confusius.validation import validate_iq
>>> iq = xr.DataArray(
...     np.ones((10, 4, 6, 8), dtype=np.complex64),
...     dims=("time", "z", "y", "x"),
...     coords={
...         "time": np.arange(10),
...         "z": np.arange(4),
...         "y": np.arange(6),
...         "x": np.arange(8),
...     },
...     attrs={
...         "compound_sampling_frequency": 1000.0,
...         "transmit_frequency": 15e6,
...         "sound_velocity": 1540.0,
...     },
... )
>>> validate_iq(iq)
Skip attribute validation for intermediate processing:
>>> # DataArray without the acquisition attributes
>>> iq_no_attrs = xr.DataArray(
...     np.ones((10, 4, 6, 8), dtype=np.complex64),
...     dims=("time", "z", "y", "x"),
...     coords={
...         "time": np.arange(10),
...         "z": np.arange(4),
...         "y": np.arange(6),
...         "x": np.arange(8),
...     },
... )
>>> validate_iq(iq_no_attrs, require_attrs=False)
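The checks listed for validate_iq can be pictured with the minimal sketch below. It is not the confusius implementation: sketch_validate_iq, REQUIRED_DIMS, and REQUIRED_ATTRS are hypothetical names, and the order of the checks and the error messages are assumptions. The sketch also accepts any complex dtype, which is slightly broader than the documented complex64/complex128.

```python
import numpy as np
import xarray as xr

# Hypothetical constants mirroring the documented requirements.
REQUIRED_DIMS = ("time", "z", "y", "x")
REQUIRED_ATTRS = (
    "compound_sampling_frequency",
    "transmit_frequency",
    "sound_velocity",
)


def sketch_validate_iq(iq: xr.DataArray, require_attrs: bool = True) -> None:
    """Illustrative re-statement of the documented checks (not confusius code)."""
    # Dimensions: exactly (time, z, y, x), in that order.
    if iq.dims != REQUIRED_DIMS:
        raise ValueError(f"Expected dimensions {REQUIRED_DIMS}, got {iq.dims}.")
    # Coordinates: every dimension needs a matching coordinate.
    missing_coords = [dim for dim in REQUIRED_DIMS if dim not in iq.coords]
    if missing_coords:
        raise ValueError(f"Missing coordinates for dimensions: {missing_coords}.")
    # Data type: complex-valued data only.
    if not np.issubdtype(iq.dtype, np.complexfloating):
        raise TypeError(f"Expected complex-valued data, got {iq.dtype}.")
    # Attributes: only checked when require_attrs is True.
    if require_attrs:
        missing_attrs = [name for name in REQUIRED_ATTRS if name not in iq.attrs]
        if missing_attrs:
            raise ValueError(f"Missing required attributes: {missing_attrs}.")


shape = (10, 4, 6, 8)
real_valued = xr.DataArray(
    np.ones(shape, dtype=np.float32),
    dims=REQUIRED_DIMS,
    coords={dim: np.arange(size) for dim, size in zip(REQUIRED_DIMS, shape)},
)

# Real-valued data fails the dtype check with a TypeError.
try:
    sketch_validate_iq(real_valued, require_attrs=False)
except TypeError as error:
    dtype_error = str(error)
```

With the real function, the same situations are documented to raise ValueError (dimensions, coordinates, attributes) or TypeError (non-complex data).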
validate_labels
validate_labels(
labels: DataArray,
data: DataArray,
labels_name: str = "labels",
rtol: float = 1e-05,
atol: float = 1e-08,
) -> None
Validate that a label map matches data spatial dimensions and coordinates.
Parameters:
- labels (DataArray) – Label map to validate. Must have integer dtype and coordinates must match data. Accepts two formats:
  - Flat label map: Spatial dims only, e.g. (z, y, x). Background voxels labeled 0; each unique non-zero integer identifies a distinct, non-overlapping region. The regions coordinate of the output holds the integer label values.
  - Stacked mask format: Has a leading mask dimension followed by spatial dims, e.g. (mask, z, y, x). Each layer has values in {0, region_id} and regions may overlap. The region coordinate of the output holds the mask coordinate values (e.g., region label).
- data (DataArray) – Data array to validate labels against.
- labels_name (str, default: "labels") – Name of the labels parameter (used in error messages).
- rtol (float, default: 1e-5) – Relative tolerance for coordinate comparison.
- atol (float, default: 1e-8) – Absolute tolerance for coordinate comparison.
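The page does not say how rtol and atol are combined. The sketch below assumes the NumPy convention used by np.allclose, where two values match when |a - b| <= atol + rtol * |b|; the defaults here equal NumPy's defaults, which is suggestive but not confirmed. The coordinate arrays are made-up illustration data.

```python
import numpy as np

data_x = np.linspace(0.0, 1.0, 5)  # reference coordinates from the data
labels_x = data_x + 1e-9           # same grid, perturbed by float round-off
shifted_x = data_x + 1e-3          # a genuinely different grid

# Assumed rule (NumPy convention): |a - b| <= atol + rtol * |b|
matches = np.allclose(labels_x, data_x, rtol=1e-5, atol=1e-8)
mismatch = np.allclose(shifted_x, data_x, rtol=1e-5, atol=1e-8)

print(matches)   # True: the 1e-9 offset is within tolerance
print(mismatch)  # False: the 1e-3 offset is not
```

Under this assumption, atol matters most for coordinates near zero, where the relative term vanishes.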
Raises:
- TypeError – If labels is not an integer dtype DataArray.
- ValueError – If labels dimensions don't match data or if coordinates don't match.
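To make the two accepted label formats concrete, here is a small sketch that builds a flat label map and an equivalent stacked version with xarray. The array values and the mask coordinate names ("region_a", "region_b") are hypothetical, and the conversion shown is only an illustration, not a confusius utility.

```python
import numpy as np
import xarray as xr

# Flat label map with dims (z, y, x): 0 is background, 1 and 2 are regions.
flat = xr.DataArray(
    np.array([[[0, 1, 1], [0, 2, 2]]], dtype=np.int32),
    dims=("z", "y", "x"),
    coords={"z": [0], "y": [0, 1], "x": [0, 1, 2]},
)

# Unique non-zero integers identify the regions of the flat map.
region_ids = [int(value) for value in np.unique(flat.values) if value != 0]

# Stacked format with dims (mask, z, y, x): one layer per region, each layer
# holding only 0 and its own region id.
layers = np.stack([np.where(flat.values == rid, rid, 0) for rid in region_ids])
stacked = xr.DataArray(
    layers.astype(flat.dtype),
    dims=("mask", "z", "y", "x"),
    coords={
        "mask": ["region_a", "region_b"],  # hypothetical region names
        "z": flat.coords["z"].values,
        "y": flat.coords["y"].values,
        "x": flat.coords["x"].values,
    },
)
```

Because each region lives in its own layer, the stacked format can represent overlapping regions, which a flat map cannot.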
validate_mask
validate_mask(
mask: DataArray,
data: DataArray,
mask_name: str = "mask",
rtol: float = 1e-05,
atol: float = 1e-08,
) -> None
Validate that a mask matches data spatial dimensions and coordinates.
Parameters:
- mask (DataArray) – Mask to validate. Must have boolean dtype, or integer dtype with exactly one non-zero value (0 = background, one region id = foreground). The latter format is produced by Atlas.get_masks. Coordinates must match data.
- data (DataArray) – Data array to validate mask against.
- mask_name (str, default: "mask") – Name of the mask parameter (used in error messages).
- rtol (float, default: 1e-5) – Relative tolerance for coordinate comparison.
- atol (float, default: 1e-8) – Absolute tolerance for coordinate comparison.
Raises:
- TypeError – If mask is not a boolean or single-label integer DataArray.
- ValueError – If mask dimensions don't match data or if coordinates don't match.
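The two accepted mask formats can be illustrated with a short NumPy sketch. The helper is_valid_mask_dtype is hypothetical, and it reads "exactly one non-zero value" as "exactly one distinct non-zero value", which is an interpretation of the description above rather than confirmed library behavior.

```python
import numpy as np


def is_valid_mask_dtype(values: np.ndarray) -> bool:
    """Hypothetical check mirroring the documented mask formats."""
    # Boolean masks are accepted as-is.
    if values.dtype == np.bool_:
        return True
    # Integer masks must contain exactly one distinct non-zero value.
    if np.issubdtype(values.dtype, np.integer):
        nonzero_values = np.unique(values[values != 0])
        return nonzero_values.size == 1
    # Anything else (e.g. floats) is rejected.
    return False


boolean_mask = np.array([[True, False], [False, True]])
single_label = np.array([[0, 7], [7, 0]])        # 0 = background, 7 = region id
multi_label = np.array([[0, 1], [2, 0]])         # two region ids: a label map
float_mask = np.array([[0.0, 1.0], [1.0, 0.0]])  # wrong dtype

results = [
    is_valid_mask_dtype(mask)
    for mask in (boolean_mask, single_label, multi_label, float_mask)
]
print(results)  # [True, True, False, False]
```

An integer array with several region ids is a label map rather than a mask, which is the case validate_labels is documented to handle.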
validate_time_series
validate_time_series(
signals: DataArray,
operation_name: str,
check_time_chunks: bool = True,
) -> int
Validate time series for signal processing operations.
Performs common validation checks:
- Signals have a time dimension.
- Time dimension has more than 1 timepoint.
- Time dimension is not chunked for Dask arrays (optional).
Parameters:
- signals (DataArray) – Input signals to validate. Must have a time dimension.
- operation_name (str) – Name of the operation (used in error/warning messages).
- check_time_chunks (bool, default: True) – Whether to raise an error when the time dimension is chunked in a Dask array. Set to False for operations that can handle chunked time (e.g., confusius.signal.standardize).
Returns:
- int – Axis number for the time dimension.
Raises:
- ValueError – If signals has no time dimension, if the time dimension has only 1 timepoint, or if the time dimension is chunked in a Dask array (when check_time_chunks=True).
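The three checks and the returned axis number can be sketched as follows. This is an illustration under stated assumptions, not the confusius source: sketch_validate_time_series, the error messages, and the "detrend" operation name are hypothetical. It relies on the xarray calls DataArray.get_axis_num and DataArray.chunks.

```python
import numpy as np
import xarray as xr


def sketch_validate_time_series(
    signals: xr.DataArray,
    operation_name: str,
    check_time_chunks: bool = True,
) -> int:
    """Illustrative re-statement of the documented checks (not confusius code)."""
    if "time" not in signals.dims:
        raise ValueError(f"{operation_name} requires a 'time' dimension.")
    if signals.sizes["time"] < 2:
        raise ValueError(f"{operation_name} requires more than 1 timepoint.")
    time_axis = signals.get_axis_num("time")
    # .chunks is None for NumPy-backed arrays; for Dask-backed arrays it is a
    # tuple holding the chunk sizes along each axis.
    if check_time_chunks and signals.chunks is not None:
        if len(signals.chunks[time_axis]) > 1:
            raise ValueError(f"{operation_name} requires unchunked time.")
    return time_axis


signals = xr.DataArray(np.zeros((3, 100)), dims=("voxel", "time"))
time_axis = sketch_validate_time_series(signals, "detrend")

# The returned axis number can index the underlying array directly.
demeaned = signals.values - signals.values.mean(axis=time_axis, keepdims=True)
```

For Dask-backed data, signals.chunk({"time": -1}) rechunks the time dimension into a single chunk, which is the usual way to satisfy a check of this kind.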