confusius.validation

validation

Data validation utilities for confusius.

Modules:

  • iq

    IQ data validation utilities.

  • mask

    Mask validation utilities.

  • time_series

    Time series validation utilities.

Functions:

  • validate_iq

    Validate that a DataArray contains valid IQ data.

  • validate_labels

    Validate that a label map matches data spatial dimensions and coordinates.

  • validate_mask

    Validate that a mask matches data spatial dimensions and coordinates.

  • validate_time_series

    Validate time series for signal processing operations.

validate_iq

validate_iq(
    iq: DataArray, require_attrs: bool = True
) -> None

Validate that a DataArray contains valid IQ data.

This function checks that an IQ DataArray meets the requirements for processing with confusius functions. The checks are:

  1. Dimensions: The IQ DataArray must have exactly 4 dimensions in the order: (time, z, y, x).
  2. Coordinates: All dimensions must have corresponding coordinates.
  3. Data type: The data must be complex-valued (complex64 or complex128).
  4. Attributes (optional): If require_attrs is True, the DataArray must have the following attributes:

     • compound_sampling_frequency: Volume acquisition rate in Hz.
     • transmit_frequency: Ultrasound probe central frequency in Hz.
     • sound_velocity: Speed of sound in the imaged medium in m/s.

Parameters:

  • iq

    (DataArray) –

    Input DataArray to validate. Must have dimensions (time, z, y, x) and the required structure and attributes.

  • require_attrs

    (bool, default: True ) –

    Whether to validate that all required attributes (compound_sampling_frequency, transmit_frequency, sound_velocity) are present in the DataArray attributes.

Raises:

  • ValueError

    If the DataArray does not have dimensions ("time", "z", "y", "x") or corresponding coordinates, or if required attributes are missing (when require_attrs=True).

  • TypeError

    If the IQ data is not complex-valued.

Examples:

Validate a properly formatted IQ DataArray:

>>> import xarray as xr
>>> import numpy as np
>>> from confusius.validation import validate_iq
>>> iq = xr.DataArray(
...     np.ones((10, 4, 6, 8), dtype=np.complex64),
...     dims=("time", "z", "y", "x"),
...     coords={
...         "time": np.arange(10),
...         "z": np.arange(4),
...         "y": np.arange(6),
...         "x": np.arange(8),
...     },
...     attrs={
...         "compound_sampling_frequency": 1000.0,
...         "transmit_frequency": 15e6,
...         "sound_velocity": 1540.0,
...     },
... )
>>> validate_iq(iq)

Skip attribute validation for intermediate processing:

>>> # DataArray missing attributes
>>> iq_no_attrs = xr.DataArray(
...     np.ones((10, 4, 6, 8), dtype=np.complex64),
...     dims=("time", "z", "y", "x"),
...     coords={"time": np.arange(10), "z": np.arange(4),
...             "y": np.arange(6), "x": np.arange(8)},
... )
>>> validate_iq(iq_no_attrs, require_attrs=False)
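
The dimension, data type, and attribute checks listed above can be approximated outside the library when you want a list of problems rather than an exception. The following is a minimal sketch, not the library's implementation: the helper name iq_problems and its message strings are hypothetical, and the coordinate check is omitted for brevity.

```python
import numpy as np

REQUIRED_DIMS = ("time", "z", "y", "x")
REQUIRED_ATTRS = (
    "compound_sampling_frequency",
    "transmit_frequency",
    "sound_velocity",
)


def iq_problems(dims, dtype, attrs, require_attrs=True):
    """Hypothetical pre-check: collect problems instead of raising."""
    problems = []
    # Check 1: exactly (time, z, y, x), in that order.
    if tuple(dims) != REQUIRED_DIMS:
        problems.append(f"dims {tuple(dims)} != {REQUIRED_DIMS}")
    # Check 3: complex64 and complex128 are both complexfloating subtypes.
    if not np.issubdtype(np.dtype(dtype), np.complexfloating):
        problems.append(f"dtype {np.dtype(dtype)} is not complex")
    # Check 4: only applied when require_attrs is True.
    if require_attrs:
        missing = [name for name in REQUIRED_ATTRS if name not in attrs]
        if missing:
            problems.append(f"missing attrs: {missing}")
    return problems
```

With an xarray DataArray, the arguments would be iq.dims, iq.dtype, and iq.attrs.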

validate_labels

validate_labels(
    labels: DataArray,
    data: DataArray,
    labels_name: str = "labels",
    rtol: float = 1e-05,
    atol: float = 1e-08,
) -> None

Validate that a label map matches data spatial dimensions and coordinates.

Parameters:

  • labels

    (DataArray) –

    Label map to validate. Must have integer dtype and coordinates must match data. Accepts two formats:

    • Flat label map: Spatial dims only, e.g. (z, y, x). Background voxels labeled 0; each unique non-zero integer identifies a distinct, non-overlapping region. The regions coordinate of the output holds the integer label values.
    • Stacked mask format: Has a leading mask dimension followed by spatial dims, e.g. (mask, z, y, x). Each layer has values in {0, region_id} and regions may overlap. The region coordinate of the output holds the mask coordinate values (e.g., region label).
  • data

    (DataArray) –

    Data array to validate labels against.

  • labels_name

    (str, default: "labels" ) –

    Name of the labels parameter (used in error messages).

  • rtol

    (float, default: 1e-5 ) –

    Relative tolerance for coordinate comparison.

  • atol

    (float, default: 1e-8 ) –

    Absolute tolerance for coordinate comparison.
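
The rtol and atol defaults match those of numpy.isclose, so the sketch below assumes the same convention, |a - b| <= atol + rtol * |b|. That convention is an assumption about confusius, not something this page states, and the helper name coords_match is hypothetical.

```python
def coords_match(a, b, rtol=1e-5, atol=1e-8):
    """Hypothetical helper: element-wise |a - b| <= atol + rtol * |b|.

    Assumes the numpy.isclose convention; the library may compare differently.
    """
    if len(a) != len(b):
        return False
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


# Small floating-point drift in a coordinate is tolerated.
print(coords_match([0.0, 0.1, 0.2], [0.0, 0.1 + 1e-9, 0.2]))

# A shift of 1e-4 exceeds the default tolerances.
print(coords_match([0.0, 0.1, 0.2], [0.0, 0.1001, 0.2]))
```

Under this convention, atol dominates for coordinates near zero, where the relative term rtol * |b| vanishes.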

Raises:

  • TypeError

    If labels is not an integer dtype DataArray.

  • ValueError

    If labels dimensions don't match data or if coordinates don't match.
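
To make the two accepted label formats concrete, the sketch below builds a small flat label map with NumPy and converts it to the stacked layout, one layer per region with values in {0, region_id}. It only illustrates the array layouts described above; wrapping the arrays in DataArrays with matching coordinates is left out, and the variable names are illustrative.

```python
import numpy as np

# Flat label map over two spatial dims: 0 is background, 1 and 2 are regions.
flat = np.array(
    [
        [0, 1, 1],
        [2, 2, 0],
    ]
)

# Unique non-zero labels identify the regions.
region_ids = np.unique(flat[flat != 0])

# Stacked format: a leading "mask" axis, each layer holding {0, region_id}.
stacked = np.stack([np.where(flat == rid, rid, 0) for rid in region_ids])

print(stacked.shape)  # (number of regions, *spatial shape)
```

Because regions in a flat label map cannot overlap, summing the stacked layers along the leading axis recovers the flat map. The reverse is not true in general, since stacked layers may overlap.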

validate_mask

validate_mask(
    mask: DataArray,
    data: DataArray,
    mask_name: str = "mask",
    rtol: float = 1e-05,
    atol: float = 1e-08,
) -> None

Validate that a mask matches data spatial dimensions and coordinates.

Parameters:

  • mask

    (DataArray) –

    Mask to validate. Must have boolean dtype, or integer dtype with exactly one non-zero value (0 = background, one region id = foreground). The latter format is produced by Atlas.get_masks. Coordinates must match data.

  • data

    (DataArray) –

    Data array to validate mask against.

  • mask_name

    (str, default: "mask" ) –

    Name of the mask parameter (used in error messages).

  • rtol

    (float, default: 1e-5 ) –

    Relative tolerance for coordinate comparison.

  • atol

    (float, default: 1e-8 ) –

    Absolute tolerance for coordinate comparison.

Raises:

  • TypeError

    If mask is not a boolean or single-label integer DataArray.

  • ValueError

    If mask dimensions don't match data or if coordinates don't match.
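
The dtype rule for mask (boolean, or integer with exactly one non-zero value) can be sketched with NumPy as follows. This is an illustration of the rule as worded above, not the library's code, and the helper name has_valid_mask_values is hypothetical.

```python
import numpy as np


def has_valid_mask_values(mask: np.ndarray) -> bool:
    """Hypothetical check: boolean, or integer with exactly one non-zero value."""
    if mask.dtype == np.bool_:
        return True
    if np.issubdtype(mask.dtype, np.integer):
        # Distinct non-zero values; a single-label mask has exactly one.
        nonzero_values = np.unique(mask[mask != 0])
        # Literal reading of the rule: an all-zero integer mask is rejected
        # here. Whether the library does the same is not stated on this page.
        return nonzero_values.size == 1
    return False


print(has_valid_mask_values(np.array([[True, False]])))    # boolean mask
print(has_valid_mask_values(np.array([[0, 7], [7, 0]])))   # single region id 7
print(has_valid_mask_values(np.array([[0, 1], [2, 0]])))   # two labels: a label map
```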

validate_time_series

validate_time_series(
    signals: DataArray,
    operation_name: str,
    check_time_chunks: bool = True,
) -> int

Validate time series for signal processing operations.

Performs common validation checks:

  1. Signals have a time dimension.
  2. Time dimension has more than 1 timepoint.
  3. Time dimension is not chunked for Dask arrays (optional).

Parameters:

  • signals

    (DataArray) –

    Input signals to validate. Must have a time dimension.

  • operation_name

    (str) –

    Name of the operation (used in error/warning messages).

  • check_time_chunks

    (bool, default: True ) –

    Whether to raise an error when time dimension is chunked in a Dask array. Set to False for operations that can handle chunked time (e.g., confusius.signal.standardize).

Returns:

  • int

    Axis number for the time dimension.

Raises:

  • ValueError

    If signals has no time dimension, if the time dimension has only 1 timepoint, or if the time dimension is chunked in a Dask array (when check_time_chunks=True).
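
Checks 1 and 2 and the integer return value can be sketched without any dependencies by working on the dims and shape tuples directly. The helper name time_axis_or_raise, the operation name "detrend", and the error messages are hypothetical, and the optional Dask chunking check is not reproduced.

```python
def time_axis_or_raise(dims, shape, operation_name):
    """Hypothetical stand-in for checks 1 and 2; returns the time axis index."""
    if "time" not in dims:
        raise ValueError(
            f"{operation_name} requires a 'time' dimension, got {tuple(dims)}."
        )
    axis = list(dims).index("time")
    if shape[axis] < 2:
        raise ValueError(
            f"{operation_name} requires more than 1 timepoint, got {shape[axis]}."
        )
    return axis


# The returned integer is the position of "time" among the dimensions.
print(time_axis_or_raise(("time", "z", "y", "x"), (10, 4, 6, 8), "detrend"))
print(time_axis_or_raise(("z", "time"), (4, 10), "detrend"))
```

With an xarray DataArray, the equivalent inputs are signals.dims and signals.shape. The returned axis is typically what you would pass as the axis argument of a NumPy or SciPy routine.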