xarray_beam.Dataset
- class xarray_beam.Dataset(template, chunks, split_vars, ptransform)
Experimental high-level representation of an Xarray-Beam dataset.
- Parameters:
template (xarray.Dataset)
chunks (Mapping[str, int])
split_vars (bool)
ptransform (beam.PTransform | beam.PCollection | _LazyPCollection)
- __init__(template, chunks, split_vars, ptransform)
Low level interface for creating a new Dataset, without validation.
Unless you’re really sure you don’t need validation, prefer using
xarray_beam.Dataset.from_ptransform.
- Parameters:
template (Dataset) – xarray.Dataset describing the structure of this dataset, typically as produced by xarray_beam.make_template().
chunks (Mapping[str, int]) – mapping from dimension names to chunk sizes. For normalization, use xarray_beam.normalize_chunks().
split_vars (bool) – whether variables are split between separate elements in the ptransform, or all stored in the same element.
ptransform (PTransform | PCollection | _LazyPCollection) – Beam collection of (xbeam.Key, xarray.Dataset) tuples with this dataset’s data.
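To make the roles of chunks and split_vars more concrete, the following is a minimal sketch in plain Python of how a dataset might be divided into elements. It does not call xarray_beam: the helper names chunk_offsets and element_keys are hypothetical, and each key is modelled as a plain (offsets, variable names) tuple standing in for xbeam.Key, on the assumption that a key records the chunk's starting offset along each dimension and the variables it holds.

```python
import itertools


def chunk_offsets(sizes, chunks):
    """List the starting offsets of every chunk (hypothetical helper)."""
    dims = list(chunks)
    # Along each dimension, chunks start at 0, chunk, 2 * chunk, ...
    starts = [range(0, sizes[dim], chunks[dim]) for dim in dims]
    return [dict(zip(dims, combo)) for combo in itertools.product(*starts)]


def element_keys(sizes, chunks, variables, split_vars):
    """Build one stand-in key per element (hypothetical helper).

    Each key is an (offsets, variable names) tuple. This is an assumed
    stand-in for xbeam.Key, not the library's own class.
    """
    keys = []
    for offsets in chunk_offsets(sizes, chunks):
        if split_vars:
            # One element per variable per chunk.
            keys.extend((offsets, frozenset([name])) for name in variables)
        else:
            # All variables stored together in one element per chunk.
            keys.append((offsets, frozenset(variables)))
    return keys


sizes = {"time": 10, "x": 4}
chunks = {"time": 4, "x": 4}
variables = ["temperature", "pressure"]

together = element_keys(sizes, chunks, variables, split_vars=False)
separate = element_keys(sizes, chunks, variables, split_vars=True)
```

Under these assumptions the 10-element time dimension is cut at offsets 0, 4 and 8, so together holds three keys and separate holds six, one per variable per chunk.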
Methods
__init__(template, chunks, split_vars, ...) – Low level interface for creating a new Dataset, without validation.
Collect a dataset in memory by writing it to a temp file.
consolidate_variables(*[, label]) – Consolidate variables in this Dataset into a single chunk.
from_ptransform(ptransform, *, template, chunks) – Create an xarray_beam.Dataset from a Beam PTransform.
from_xarray(source, chunks, *[, split_vars, ...]) – Create an xarray_beam.Dataset from an xarray.Dataset.
from_zarr(path, *[, chunks, split_vars, ...]) – Create an xarray_beam.Dataset from a Zarr store.
head(*[, label]) – Return a Dataset with the first N elements of each dimension.
map_blocks(func, *[, template, chunks, label]) – Map a function over the chunks of this dataset.
mean([dim, skipna, dtype, label]) – Compute the mean of this Dataset using Beam combiners.
pipe(func, *args, **kwargs) – Apply a function to this dataset with method-chaining syntax.
rechunk(chunks[, split_vars, min_mem, ...]) – Rechunk this Dataset.
split_variables(*[, label]) – Split variables in this Dataset into separate chunks.
tail(*[, label]) – Return a Dataset with the last N elements of each dimension.
to_zarr(path, *[, zarr_chunks_per_shard, ...]) – Write this dataset to a Zarr file.
transpose(*args[, label])
Attributes
bytes_per_chunk – Estimate of the number of bytes per dataset chunk.
chunk_count – Count the number of chunks in this dataset.
chunks – Dictionary mapping from dimension names to chunk sizes.
itemsize – Total size of dtype itemsizes in a PTransform element, in bytes.
ptransform – Beam PTransform of (xbeam.Key, xarray.Dataset) with this dataset's data.
sizes – Size of each dimension on this dataset.
split_vars – Whether variables are split between separate elements in the ptransform.
template – Template describing the structure of this dataset.
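The size-related attributes can be related to one another with some simple arithmetic. The sketch below is an assumption about how such estimates could be computed from sizes, chunks and itemsize, not the library's implementation. It assumes every variable spans every dimension and that all variables are stored in the same element; the function names are hypothetical.

```python
import math


def estimate_chunk_count(sizes, chunks):
    """Count chunks, assuming each dimension is cut into pieces of at most chunks[dim]."""
    # -(-a // b) is integer ceiling division, so a partial final chunk is counted.
    return math.prod(-(-sizes[dim] // chunks[dim]) for dim in chunks)


def estimate_bytes_per_chunk(chunks, itemsize):
    """Bytes in one full chunk: elements per chunk times bytes per element."""
    return itemsize * math.prod(chunks.values())


sizes = {"time": 365, "lat": 180, "lon": 360}
chunks = {"time": 100, "lat": 180, "lon": 360}

# Assumed example: one float32 variable (4 bytes) plus one float64 variable (8 bytes).
itemsize = 4 + 8

n_chunks = estimate_chunk_count(sizes, chunks)
chunk_bytes = estimate_bytes_per_chunk(chunks, itemsize)
```

With these inputs the time dimension is split into four chunks (the last holding only 65 steps), so n_chunks is 4, and a full chunk holds 100 × 180 × 360 elements at 12 bytes each, or 77,760,000 bytes. The final partial chunk is smaller than this estimate.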