xarray_beam.Key

class xarray_beam.Key(offsets=None, vars=None, indices=None)

Key for keeping track of chunks of a distributed Dataset.

Key objects in Xarray-Beam include two components:

  • offsets: an immutable dict indicating integer offsets (counted in array elements) from the origin along each dimension for this chunk.

  • vars: either a frozenset or None, indicating the subset of Dataset variables included in this chunk. The default value of None means that all variables are included.

Alternatively, indices may be specified instead of offsets. This is a newer data model that is not yet fully supported:

  • indices: an immutable dict indicating integer chunk indices from the origin along each dimension for this chunk.

offsets and indices are mutually exclusive: only one of them may be used for any given Key. For example, if there are chunks of size 100 along the ‘x’ dimension, then offsets={'x': 400} would correspond to indices={'x': 4}.
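
The relationship between offsets and indices can be sketched in plain Python. This is only an illustration: the helper names offsets_to_indices and indices_to_offsets are hypothetical and not part of Xarray-Beam, and the sketch assumes every chunk along a dimension has the same size.

```python
def offsets_to_indices(offsets, chunk_sizes):
    """Convert element offsets to chunk indices (hypothetical helper).

    Assumes uniform chunk sizes along each dimension.
    """
    indices = {}
    for dim, offset in offsets.items():
        size = chunk_sizes[dim]
        if offset % size != 0:
            raise ValueError(
                f"offset {offset} along {dim!r} is not a multiple of chunk size {size}"
            )
        indices[dim] = offset // size
    return indices


def indices_to_offsets(indices, chunk_sizes):
    """Convert chunk indices back to element offsets (hypothetical helper)."""
    return {dim: index * chunk_sizes[dim] for dim, index in indices.items()}


# Chunks of size 100 along 'x', as in the example above.
example_indices = offsets_to_indices({'x': 400}, {'x': 100})
example_offsets = indices_to_offsets({'x': 4}, {'x': 100})
```

With chunks of size 100, an offset of 400 maps to index 4, and the reverse conversion recovers the offset.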

Key objects are “deterministically encoded” by Beam, which makes them suitable for use as keys in Beam pipelines, e.g., with beam.GroupByKey. They are also immutable and hashable, which makes them usable as keys in Python dictionaries.
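
Why immutability matters for dictionary keys can be shown without Xarray-Beam. The sketch below uses only built-in types as a stand-in for a Key; it does not reflect how Key is actually implemented.

```python
# Plain-Python illustration; xarray_beam is not used here.
offsets = {'x': 10}

# A mutable dict cannot be hashed, so it cannot serve as a dictionary key.
try:
    hash(offsets)
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

# An immutable stand-in built from frozensets is hashable, and two
# separately constructed but equal values hash and compare the same.
frozen_a = (frozenset({'x': 10}.items()), frozenset({'foo'}))
frozen_b = (frozenset({'x': 10}.items()), frozenset({'foo'}))

chunks_by_key = {frozen_a: 'chunk data'}
found = chunks_by_key[frozen_b]  # lookup succeeds with an equal key
```

Equal keys built independently must find the same entry; this is the same property that grouping by key relies on.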

Example usage:

>>> key = xarray_beam.Key(offsets={'x': 10}, vars={'foo'})

>>> key
Key(offsets={'x': 10}, vars={'foo'})

>>> key.offsets
immutabledict({'x': 10})

>>> key.vars
frozenset({'foo'})

To replace some offsets:

>>> key.with_offsets(y=0)  # insert
Key(offsets={'x': 10, 'y': 0}, vars={'foo'})

>>> key.with_offsets(x=20)  # override
Key(offsets={'x': 20}, vars={'foo'})

>>> key.with_offsets(x=None)  # remove
Key(offsets={}, vars={'foo'})
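
The insert, override, and remove behavior shown above can be modeled on a plain dict. This is a sketch of the documented semantics only, not the library's implementation; apply_offset_updates is a hypothetical name, and the sketch silently ignores removal of a dimension that is not present, which Xarray-Beam itself may handle differently.

```python
def apply_offset_updates(current, **updates):
    """Return a new dict with updates applied (hypothetical helper).

    A value of None drops the dimension; any other value inserts or overrides it.
    """
    result = dict(current)  # copy so the input is never mutated
    for dim, value in updates.items():
        if value is None:
            result.pop(dim, None)
        else:
            result[dim] = value
    return result


base = {'x': 10}
inserted = apply_offset_updates(base, y=0)      # insert
overridden = apply_offset_updates(base, x=20)   # override
removed = apply_offset_updates(base, x=None)    # remove
```

Each call returns a new mapping and leaves base unchanged, mirroring the fact that Key objects are immutable.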

To entirely replace offsets or variables:

>>> key.replace(offsets={'y': 0})
Key(offsets={'y': 0}, vars={'foo'})

>>> key.replace(vars=None)
Key(offsets={'x': 10})

You can use indices instead of offsets to refer to chunks by index:

>>> key = xarray_beam.Key(indices={'x': 4}, vars={'bar'})

>>> key
Key(indices={'x': 4}, vars={'bar'})

>>> key.with_indices(x=5)
Key(indices={'x': 5}, vars={'bar'})

Parameters:
  • offsets (Mapping[str, int] | None)

  • vars (Set[str] | None)

  • indices (Mapping[str, int] | None)

__init__(offsets=None, vars=None, indices=None)
Parameters:
  • offsets (Mapping[str, int] | None)

  • vars (Set[str] | None)

  • indices (Mapping[str, int] | None)

Methods

__init__([offsets, vars, indices])

replace([offsets, vars, indices])

Replace one or more components of this Key with new values.

with_indices(**indices)

Replace some indices with new values.

with_offsets(**offsets)

Replace some offsets with new values.