
xarray Datasets from CASA Tables.



Constructs xarray Datasets from CASA Tables via python-casacore. The Variables contained in the Dataset are dask arrays backed by deferred calls to pyrap.tables.table.getcol.

Supports writing Variables back to the respective column in the Table.

The intention behind this package is to support the Measurement Set as a data source and sink for the purposes of writing parallel, distributed Radio Astronomy algorithms.
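The idea of Variables backed by deferred getcol calls can be illustrated with a minimal sketch. It does not use dask or python-casacore: FakeTable, deferred_column and compute are hypothetical names invented for this illustration, and the only thing borrowed from the real API is the getcol(columnname, startrow, nrow) calling pattern. dask-ms builds dask arrays rather than plain lists of callables, so treat this as a picture of the deferral, not of the implementation.

```python
import functools


class FakeTable:
    """Hypothetical stand-in for a CASA table; counts how often getcol runs."""

    def __init__(self, columns):
        self._columns = columns
        self.reads = 0

    def getcol(self, name, startrow=0, nrow=-1):
        # Mimics the (columnname, startrow, nrow) pattern; nrow=-1 means "to the end"
        self.reads += 1
        data = self._columns[name]
        stop = len(data) if nrow == -1 else startrow + nrow
        return data[startrow:stop]


def deferred_column(table, name, nrows, chunk_rows):
    """Return one zero-argument callable per row chunk; nothing is read yet."""
    return [
        functools.partial(table.getcol, name, start, min(chunk_rows, nrows - start))
        for start in range(0, nrows, chunk_rows)
    ]


def compute(chunks):
    """Run the deferred reads and concatenate their results."""
    out = []
    for read in chunks:
        out.extend(read())
    return out


table = FakeTable({"TIME": [10.0, 10.0, 20.0, 20.0, 30.0]})
chunks = deferred_column(table, "TIME", nrows=5, chunk_rows=2)
reads_before = table.reads     # still 0: building the chunks read nothing
time_values = compute(chunks)  # the chunked getcol calls happen here
reads_after = table.reads
```

Because each chunk is an independent read of a row range, a scheduler is free to execute the reads in parallel or on different workers, which is what makes the Measurement Set usable as a source for distributed algorithms.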

Installation

To install with xarray support:

$ pip install dask-ms[xarray]

Without xarray, similar but reduced Dataset functionality is replicated in dask-ms itself. Expert users may wish to use this option to reduce Python package dependencies.

$ pip install dask-ms

Example Usage

  import dask.array as da
  from daskms import xds_from_table, xds_to_table

  # Create xarray datasets from Measurement Set "WSRT.MS"
  ds = xds_from_table("WSRT.MS")
  # Set the FLAG Variable on the first Dataset to its inverse
  ds[0]['FLAG'] = (ds[0].FLAG.dims, da.logical_not(ds[0].FLAG))
  # Write the FLAG column back to the Measurement Set
  xds_to_table(ds, "WSRT.MS", "FLAG").compute()

  print(ds)

[<xarray.Dataset>
 Dimensions:         (chan: 64, corr: 4, row: 6552, uvw: 3)
 Coordinates:
     ROWID           (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
 Dimensions without coordinates: chan, corr, row, uvw
 Data variables:
     IMAGING_WEIGHT  (row, chan) float32 dask.array<shape=(6552, 64), chunksize=(6552, 64)>
     ANTENNA1        (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     STATE_ID        (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     EXPOSURE        (row) float64 dask.array<shape=(6552,), chunksize=(6552,)>
     MODEL_DATA      (row, chan, corr) complex64 dask.array<shape=(6552, 64, 4), chunksize=(6552, 64, 4)>
     FLAG_ROW        (row) bool dask.array<shape=(6552,), chunksize=(6552,)>
     CORRECTED_DATA  (row, chan, corr) complex64 dask.array<shape=(6552, 64, 4), chunksize=(6552, 64, 4)>
     PROCESSOR_ID    (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     WEIGHT          (row, corr) float32 dask.array<shape=(6552, 4), chunksize=(6552, 4)>
     FLAG            (row, chan, corr) bool dask.array<shape=(6552, 64, 4), chunksize=(6552, 64, 4)>
     TIME            (row) float64 dask.array<shape=(6552,), chunksize=(6552,)>
     SIGMA           (row, corr) float32 dask.array<shape=(6552, 4), chunksize=(6552, 4)>
     SCAN_NUMBER     (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     INTERVAL        (row) float64 dask.array<shape=(6552,), chunksize=(6552,)>
     OBSERVATION_ID  (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     TIME_CENTROID   (row) float64 dask.array<shape=(6552,), chunksize=(6552,)>
     ARRAY_ID        (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     ANTENNA2        (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     DATA            (row, chan, corr) complex64 dask.array<shape=(6552, 64, 4), chunksize=(6552, 64, 4)>
     FEED1           (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     FEED2           (row) int32 dask.array<shape=(6552,), chunksize=(6552,)>
     UVW             (row, uvw) float64 dask.array<shape=(6552, 3), chunksize=(6552, 3)>
 Attributes:
     FIELD_ID:      0
     DATA_DESC_ID:  0]

Documentation

https://dask-ms.readthedocs.io.

Limitations

  1. Many Measurement Set columns are defined as variably shaped, but the actual data has a fixed shape. dask-ms infers the shape of the data from the first row, and this shape must be consistent with that of all other rows. This may be an issue where, for example, multiple Spectral Windows with differing numbers of channels are present in the Measurement Set.

    dask-ms works around this by partitioning the Measurement Set into multiple datasets. The first row's shape is used to infer the shape of each partition. Thus, in the case of multiple Spectral Windows, the Measurement Set can be partitioned by DATA_DESC_ID to create a dataset for each Spectral Window.
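The partitioning workaround can be sketched in plain Python. Everything below is hypothetical illustration: make_row, partition_rows and infer_partition_shape are not dask-ms functions, the rows are made-up data, and the explicit consistency check is added only to make the failure mode visible. In dask-ms itself, grouping is understood to be requested through the group_cols argument of xds_from_table; verify this against the documentation for the installed version.

```python
from collections import defaultdict


def make_row(ddid, nchan, ncorr=4):
    """Hypothetical row: DATA is a (chan, corr) nested list of complex zeros."""
    return {"DATA_DESC_ID": ddid, "DATA": [[0j] * ncorr for _ in range(nchan)]}


def shape_of(data):
    return (len(data), len(data[0]))


def partition_rows(rows, group_col):
    """Map each distinct value of group_col to the row ids holding it."""
    groups = defaultdict(list)
    for rowid, row in enumerate(rows):
        groups[row[group_col]].append(rowid)
    return dict(groups)


def infer_partition_shape(rows, rowids, column):
    """Infer (row, chan, corr) from the first row and check the others agree."""
    first = shape_of(rows[rowids[0]][column])
    for rowid in rowids[1:]:
        if shape_of(rows[rowid][column]) != first:
            raise ValueError(f"row {rowid} does not match first-row shape {first}")
    return (len(rowids),) + first


# Two Spectral Windows with 64 and 32 channels, interleaved in one table
rows = [make_row(0, 64), make_row(1, 32), make_row(0, 64), make_row(1, 32)]

partitions = partition_rows(rows, "DATA_DESC_ID")
shapes = {
    ddid: infer_partition_shape(rows, rowids, "DATA")
    for ddid, rowids in partitions.items()
}
```

Inferring a single shape over all four rows fails because the second row has 32 channels, whereas each DATA_DESC_ID partition is internally consistent and gets its own shape.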
