`confusius.extract`¶

extract ¶

Signal extraction from fUSI data.

Modules:

labels –

Extraction of region-aggregated signals using integer label maps.
mask –

Extraction of signals using boolean masks.
reconstruction –

Reconstruction of fUSI DataArrays from N-D signals using masks.

Functions:

extract_with_labels –

Extract region-aggregated signals from fUSI data using an integer label map.
extract_with_mask –

Extract signals from fUSI data using a binary mask.
unmask –

Reconstruct a fUSI DataArray from N-D signals using a mask.

extract_with_labels ¶

extract_with_labels(
    data: DataArray,
    labels: DataArray,
    reduction: Literal[
        "mean", "sum", "median", "min", "max", "var", "std"
    ] = "mean",
) -> DataArray

Extract region-aggregated signals from fUSI data using an integer label map.

For each unique non-zero label in labels, applies reduction across all voxels belonging to that region. The spatial dimensions are collapsed into a single region dimension.

Parameters:

data ¶
(DataArray) –

Input array whose spatial dimensions match those of labels. Can have any number of non-spatial dimensions (e.g., time, pose).
labels ¶
(DataArray) –
Integer label map in one of two formats:
- Flat label map: Spatial dims only, e.g. (z, y, x). Background voxels labeled 0; each unique non-zero integer identifies a distinct, non-overlapping region. The region coordinate of the output holds the integer label values.
- Stacked mask format: Has a leading mask dimension followed by spatial dims, e.g. (mask, z, y, x). Each layer has values in {0, region_id} and regions may overlap. The region coordinate of the output holds the mask coordinate values (e.g., region label).
reduction ¶
((mean, sum, median, min, max, var, std), default: "mean" ) –

Aggregation function applied across voxels in each region.
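As a rough illustration of what each reduction name means, the sketch below maps the names from the signature to the NumPy functions of the same name and applies them over the voxel axis. The helper `reduce_region` is hypothetical and is not part of confusius; the library's own conventions (for example the degrees of freedom used by "var" and "std") may differ from NumPy's defaults.

```python
import numpy as np

# Names taken from the signature above; each maps to the NumPy function of the same name.
REDUCTIONS = ("mean", "sum", "median", "min", "max", "var", "std")


def reduce_region(values, reduction="mean"):
    """Hypothetical helper: reduce an array of shape (..., n_voxels) over its last axis."""
    if reduction not in REDUCTIONS:
        raise ValueError(f"reduction must be one of {REDUCTIONS}, got {reduction!r}")
    # NumPy defaults apply here (e.g. ddof=0 for "var" and "std").
    return getattr(np, reduction)(values, axis=-1)


region_values = np.array([[1.0, 2.0, 6.0], [0.0, 4.0, 8.0]])  # (time, n_voxels)
mean_signal = reduce_region(region_values)             # one value per time point
median_signal = reduce_region(region_values, "median")
```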

Returns:

DataArray –
Array with spatial dimensions replaced by a region dimension. All non-spatial dimensions are preserved.

For example (flat label map):
- (time, z, y, x) → (time, region)
- (time, pose, z, y, x) → (time, pose, region)
- (z, y, x) → (region,)
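The shape changes listed above can be reproduced with plain NumPy. This is only a conceptual sketch of a flat label map reduced with "mean"; it is not how confusius implements the function, and the array sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((100, 4, 5, 6))  # (time, z, y, x)
labels = np.zeros((4, 5, 6), dtype=int)     # 0 = background
labels[0] = 1                               # region 1: first z-slice
labels[1] = 2                               # region 2: second z-slice

# Unique non-zero labels become the coordinate values of the new region axis.
region_ids = np.unique(labels[labels != 0])

# Indexing the three spatial axes with a 3D boolean array flattens them,
# so data[:, labels == r] has shape (time, n_voxels_in_region).
signals = np.stack(
    [data[:, labels == r].mean(axis=1) for r in region_ids], axis=-1
)

print(signals.shape)  # (100, 2), i.e. (time, region)
```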

Raises:

ValueError –

If labels dimensions don't match data's spatial dimensions, if coordinates don't match, if reduction is not a valid option, or if labels contains no non-zero values.
TypeError –

If labels is not integer dtype.

Notes

Uses flox for efficient, lazy groupby reductions on Dask-backed arrays. Data can be chunked along any dimension without restriction.

Examples:

>>> import xarray as xr
>>> import numpy as np
>>> from confusius.extract import extract_with_labels
>>>
>>> # 3D+t data: (time, z, y, x)
>>> data = xr.DataArray(
...     np.random.randn(100, 10, 20, 30),
...     dims=["time", "z", "y", "x"],
... )
>>> labels = xr.DataArray(
...     np.zeros((10, 20, 30), dtype=int),
...     dims=["z", "y", "x"],
... )
>>> labels[0, :, :] = 1  # Region 1: first z-slice.
>>> labels[1, :, :] = 2  # Region 2: second z-slice.
>>> signals = extract_with_labels(data, labels)
>>> signals.dims
('time', 'region')
>>> signals.coords["region"].values
array([1, 2])
>>>
>>> # Stacked mask format from Atlas.get_masks (assumes an existing Atlas instance, atlas_fusi).
>>> mask = atlas_fusi.get_masks(["VISp", "AUDp"])
>>> signals = extract_with_labels(data, mask)
>>> signals.coords["region"].values
array(['VISp', 'AUDp'], dtype=object)
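The stacked mask format differs from a flat label map mainly in that regions may overlap. The NumPy sketch below uses two made-up layers that share one z-slice to show that each layer is reduced independently, so shared voxels contribute to both regions. The layer contents and region ids are hypothetical, and this is not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((50, 4, 5, 6))  # (time, z, y, x)

# Hypothetical stacked masks: (mask, z, y, x), each layer with values in {0, region_id}.
stacked = np.zeros((2, 4, 5, 6), dtype=int)
stacked[0, 0:3] = 7  # layer 0 covers z = 0..2
stacked[1, 2:4] = 9  # layer 1 covers z = 2..3 and overlaps layer 0 at z = 2

# Each layer is reduced on its own, so overlapping voxels are counted in both regions.
signals = np.stack(
    [data[:, layer != 0].mean(axis=1) for layer in stacked], axis=-1
)

# Number of voxels that belong to both layers.
overlap = int(np.logical_and(stacked[0] != 0, stacked[1] != 0).sum())
```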

extract_with_mask ¶

extract_with_mask(
    data: DataArray, mask: DataArray
) -> DataArray

Extract signals from fUSI data using a binary mask.

This function flattens the spatial dimensions specified by the mask into a single space dimension, while preserving all other dimensions (e.g., time, pose).

Parameters:

data ¶
(DataArray) –

Input array whose spatial dimensions match those of the mask. Can have any number of non-spatial dimensions (e.g., time, pose).
mask ¶
(DataArray) –

Mask defining which voxels to extract. Its dimensions define the spatial dimensions that will be flattened. Must have boolean dtype, or integer dtype with exactly one non-zero value (0 = background, one region id = foreground). The latter format is produced by Atlas.get_masks. Coordinates must match data.

Returns:

DataArray –
Array with spatial dimensions flattened into a space dimension. All non-spatial dimensions are preserved. The space dimension has a MultiIndex storing spatial coordinates.

For example:
- (time, z, y, x) → (time, space)
- (time, pose, z, y, x) → (time, pose, space)
- (z, y, x) → (space,)

For simple round-trip reconstruction, use .unstack("space"), which rebuilds the spatial dimensions over the smallest bounding box containing the masked voxels. To reconstruct the full mask shape, use confusius.extract.unmask.
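To make the flattening and the bounding-box remark concrete, here is a minimal NumPy sketch with a small, made-up mask. It only mirrors the idea (one column per masked voxel, and a bounding box smaller than the full mask); it does not use confusius or xarray.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((20, 4, 5, 6))  # (time, z, y, x)
mask = np.zeros((4, 5, 6), dtype=bool)
mask[1:3, 2:4, 0:5] = True                 # a small block of foreground voxels

# Flattening: one column per masked voxel, leading dimensions untouched.
signals = data[:, mask]                    # (time, space)

# Smallest bounding box containing the masked voxels, per spatial axis.
idx = np.argwhere(mask)                    # (n_voxels, 3) voxel indices
bbox_shape = tuple(int(n) for n in idx.max(axis=0) - idx.min(axis=0) + 1)

print(signals.shape)  # (20, 20)
print(bbox_shape)     # (2, 2, 5), smaller than the mask shape (4, 5, 6)
```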

Raises:

ValueError –

If mask dimensions don't match data's spatial dimensions, or if data has fewer than 2 spatial dimensions.
TypeError –

If mask is neither boolean dtype nor an integer dtype in the single-region format described above.
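The accepted mask formats (boolean, or integer with exactly one non-zero value) could be normalised along the following lines. The helper `to_boolean_mask` is hypothetical, and the choice of exception for a multi-valued integer mask is an assumption; the library may validate masks differently.

```python
import numpy as np


def to_boolean_mask(mask):
    """Hypothetical helper: accept a boolean mask, or an integer mask with one non-zero value."""
    mask = np.asarray(mask)
    if mask.dtype == bool:
        return mask
    if not np.issubdtype(mask.dtype, np.integer):
        raise TypeError(f"mask must be boolean or integer, got {mask.dtype}")
    region_ids = np.unique(mask[mask != 0])
    if region_ids.size != 1:
        # Assumption: a ValueError is raised here; the source does not say which error applies.
        raise ValueError(
            f"integer mask must contain exactly one non-zero value, found {region_ids.size}"
        )
    return mask != 0  # 0 = background, the single region id = foreground


single_region = np.array([[0, 5], [5, 0]])
boolean_mask = to_boolean_mask(single_region)
```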

Examples:

>>> import xarray as xr
>>> import numpy as np
>>> from confusius.extract import extract_with_mask
>>>
>>> # 3D+t data: (time, z, y, x)
>>> data = xr.DataArray(
...     np.random.randn(100, 10, 20, 30),
...     dims=["time", "z", "y", "x"],
... )
>>> mask = xr.DataArray(
...     np.random.rand(10, 20, 30) > 0.5,
...     dims=["z", "y", "x"],
... )
>>> signals = extract_with_mask(data, mask)
>>> signals.dims
('time', 'space')
>>>
>>> # 3D+t data with extra dim: (time, pose, z, y, x)
>>> pose_data = xr.DataArray(
...     np.random.randn(100, 5, 10, 20, 30),
...     dims=["time", "pose", "z", "y", "x"],
... )
>>> pose_signals = extract_with_mask(pose_data, mask)
>>> pose_signals.dims
('time', 'pose', 'space')

unmask ¶

unmask(
    signals: ndarray | DataArray,
    mask: DataArray,
    new_dims: list[str] | None = None,
    new_dims_coords: dict[str, ndarray] | None = None,
    attrs: dict | None = None,
    fill_value: float = 0.0,
) -> DataArray

Reconstruct a fUSI DataArray from N-D signals using a mask.

Parameters:

signals ¶
(ndarray or DataArray) –
Array with shape (..., space) where ... can be any number of dimensions. The last dimension must correspond to masked voxels.
- If signals is a DataArray, it must have a space dimension as the last dimension. All other dimensions and their coordinates are preserved.
- If signals is a NumPy array, you can specify names and coordinates for the leading dimensions using new_dims and new_dims_coords. If not provided, dimensions are named ["dim_0", "dim_1", ...] with integer coordinates.
mask ¶
(DataArray) –

Boolean mask used for the original extraction. Provides spatial dimensions and coordinates for reconstruction. Must have the same spatial dimensions and coordinates as the original data.
new_dims ¶
(list of str, default: None ) –

Names for leading dimensions when signals is a NumPy array. Must match the number of leading dimensions (ndim - 1). If not provided, uses ["dim_0", "dim_1", ...]. Ignored if signals is a DataArray.
new_dims_coords ¶
(dict[str, ndarray], default: None ) –

Coordinates for leading dimensions when signals is a NumPy array. Keys must match dimension names in new_dims. If not provided, uses integer indices for all dimensions. Ignored if signals is a DataArray.
attrs ¶
(dict, default: None ) –

Attributes to attach to the output DataArray.
fill_value ¶
(float, default: 0.0 ) –

Value to fill in non-masked voxels.

Returns:

DataArray –

Reconstructed DataArray with shape (..., z, y, x) where spatial coordinates come from the mask.
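Conceptually, the reconstruction scatters the last axis of signals back into the mask's foreground voxels and fills everything else with fill_value. The sketch below shows that idea in plain NumPy; `unmask_sketch` is a hypothetical stand-in that ignores dimension names, coordinates, and attributes, all of which the real function handles.

```python
import numpy as np


def unmask_sketch(signals, mask, fill_value=0.0):
    """Hypothetical NumPy stand-in: scatter (..., space) back to (..., *mask.shape)."""
    signals = np.asarray(signals)
    n_voxels = int(mask.sum())
    if signals.shape[-1] != n_voxels:
        raise ValueError(
            f"last axis has {signals.shape[-1]} entries, mask selects {n_voxels}"
        )
    out = np.full(signals.shape[:-1] + mask.shape, fill_value, dtype=float)
    out[..., mask] = signals  # the boolean mask consumes the trailing spatial axes
    return out


mask = np.zeros((2, 3, 4), dtype=bool)
mask[0, 1, :] = True                       # 4 foreground voxels
components = np.arange(8.0).reshape(2, 4)  # (component, space)
volume = unmask_sketch(components, mask, fill_value=np.nan)

print(volume.shape)  # (2, 2, 3, 4), i.e. (component, z, y, x)
```

Because boolean indexing visits voxels in the same order for reading and writing, signals extracted with `data[..., mask]` round-trip through this sketch unchanged.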

Raises:

ValueError –

If signals shape doesn't match mask, or if new_dims/new_dims_coords are inconsistent with signals shape.

Examples:

>>> import xarray as xr
>>> import numpy as np
>>> from confusius.extract import extract_with_mask, unmask
>>> from sklearn.decomposition import PCA
>>>
>>> # Load data and mask
>>> data = xr.open_zarr("recording.zarr")["power_doppler"]
>>> mask = xr.open_zarr("brain_mask.zarr")["mask"]
>>>
>>> # Extract signals
>>> signals = extract_with_mask(data, mask)
>>>
>>> # Apply PCA
>>> pca = PCA(n_components=5)
>>> time_courses = pca.fit_transform(signals.values)  # (time, 5)
>>> spatial_maps = pca.components_  # (5, n_voxels)
>>>
>>> # Unmask - 2D case
>>> spatial_pca = unmask(
...     spatial_maps,  # (5, n_voxels)
...     mask,
...     new_dims=["component"],
... )
>>> spatial_pca.dims
('component', 'z', 'y', 'x')
>>>
>>> # Unmask - 3D case with custom coords
>>> n_voxels = int(mask.sum())  # Number of masked voxels.
>>> pose_data = np.random.randn(5, 3, n_voxels)  # (component, pose, space)
>>> spatial_pose = unmask(
...     pose_data,
...     mask,
...     new_dims=["component", "pose"],
...     new_dims_coords={"component": [1, 2, 3, 4, 5], "pose": [0, 1, 2]},
... )
>>> spatial_pose.dims
('component', 'pose', 'z', 'y', 'x')
