Python library for fast access to seismic data using TileDB

TileDB-Segy

TileDB-Segy is a small, MIT-licensed Python library for easy interaction with seismic data, powered by TileDB. It combines an intuitive, segyio-like API with a powerful storage engine.

Feature summary

Available features

  • Converting from SEG-Y and Seismic Unix formatted seismic data to TileDB arrays.
  • Simple and powerful read-only API, closely modeled after segyio.
  • 100% unit test coverage.
  • Fully type-annotated.

Currently missing features

  • API for write operations.
  • Converting back to SEG-Y.
  • TileDB configuration and performance tuning.
  • Comprehensive documentation.
  • Real-world usage.

Installation

TileDB-Segy can be installed:

  • from PyPI with pip:

    pip install tiledb-segy
    
  • from source by cloning the Git repository:

    git clone https://github.com/TileDB-Inc/TileDB-Segy.git
    cd TileDB-Segy
    pip install .
    

    You may run the test suite with:

    python setup.py test
    

Converting from SEG-Y

TileDB-Segy comes with a command-line interface (CLI) called segy2tiledb for converting SEG-Y and Seismic Unix formatted files to TileDB arrays. At minimum it takes an input file and creates a directory next to it, in the same parent directory, with the same base name and the extension .tsgy:

$ segy2tiledb a123.sgy
$ du -sh a123.*
73M a123.sgy
55M a123.tsgy
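
The default output location in the example above can be sketched with pathlib. The helper name default_output_path is hypothetical, and the rule it encodes (keep the parent directory and base name, swap the extension for .tsgy) is an assumption inferred from the example, not taken from the tool's source:

```python
from pathlib import Path


def default_output_path(input_path):
    """Return the assumed default output directory for an input file.

    Hypothetical helper: mirrors the naming seen in the example above.
    """
    # Keep the parent directory and base name; replace the extension with .tsgy.
    return Path(input_path).with_suffix(".tsgy")


print(default_output_path("surveys/a123.sgy"))
```

Passing an explicit output argument to segy2tiledb bypasses this default naming.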

To see the full list of options run:

$ segy2tiledb -h
usage: segy2tiledb [-h] [-o] [-g {auto,structured,unstructured}] [--su]
                   [--iline ILINE] [--xline XLINE]
                   [--endian {big,msb,little,lsb}] [-s TILE_SIZE]
                   [--consolidation-buffersize CONSOLIDATION_BUFFERSIZE]
                   input [output]

Convert a SEG-Y file to tiledb-segy format

positional arguments:
  input                 Input SEG-Y file path
  output                Output directory path (default: None)

optional arguments:
  -h, --help            show this help message and exit
  -o, --overwrite       Overwrite the output directory if it already exists (default: False)
  -g {auto,structured,unstructured}, --geometry {auto,structured,unstructured}
                        Output geometry:
                        - auto: same as the input SEG-Y.
                        - structured: same as `auto` but abort if a geometry cannot be inferred.
                        - unstructured: opt out on building geometry information.
                         (default: auto)

segyio options:
  --su                  Open a seismic unix file instead of SEG-Y (default: False)
  --iline ILINE         Inline number field in the trace headers (default: 189)
  --xline XLINE         Crossline number field in the trace headers (default: 193)
  --endian {big,msb,little,lsb}
                        File endianness, big/msb (default) or little/lsb (default: big)

tiledb options:
  -s TILE_SIZE, --tile-size TILE_SIZE
                        Tile size in bytes.
                        Larger tile size improves disk access time at the cost of higher memory (default: 4000000)
  --consolidation-buffersize CONSOLIDATION_BUFFERSIZE
                        The size in bytes of the attribute buffers used during consolidation (default: 5000000)
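
When the conversion is driven from a Python script, the options listed above can be assembled into an argument list. This is a minimal sketch: build_segy2tiledb_command is a hypothetical helper, not part of TileDB-Segy, and it uses only flags that appear in the help text above:

```python
import shlex


def build_segy2tiledb_command(input_path, output_path=None, overwrite=False,
                              geometry="auto", seismic_unix=False, tile_size=None):
    """Assemble an argument list for the segy2tiledb CLI (hypothetical helper)."""
    if geometry not in ("auto", "structured", "unstructured"):
        raise ValueError(f"unsupported geometry: {geometry!r}")
    cmd = ["segy2tiledb"]
    if overwrite:
        cmd.append("--overwrite")
    if geometry != "auto":  # "auto" is the documented default, so omit it
        cmd += ["--geometry", geometry]
    if seismic_unix:
        cmd.append("--su")
    if tile_size is not None:
        cmd += ["--tile-size", str(tile_size)]
    # Positional arguments come last: input, then the optional output directory.
    cmd.append(input_path)
    if output_path is not None:
        cmd.append(output_path)
    return cmd


cmd = build_segy2tiledb_command("a123.sgy", "a123.tsgy",
                                overwrite=True, geometry="structured")
print(shlex.join(cmd))
# prints: segy2tiledb --overwrite --geometry structured a123.sgy a123.tsgy
```

With the package installed, the resulting list can be handed to subprocess.run(cmd, check=True) to perform the conversion.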

API

TileDB-Segy generally follows the segyio API; you may consult its documentation to learn about the public attributes (ilines, xlines, offsets, samples) and addressing modes (trace, header, attributes, iline, xline, fast, slow, depth_slice, gather, text, bin).

You can find usage examples in the following Jupyter notebooks:

Differences from segyio

  • Addressing modes that return a generator of numpy arrays in segyio return a single numpy array of higher dimension in tiledb-segy. For example, in a SEG-Y with 50 ilines, 20 xlines, 100 samples, and 3 offsets:

    • f.iline[0:5]:
      • segyio returns a generator that yields 5 2D numpy arrays of (20, 100) shape
      • tiledb-segy returns a 3D numpy array of (5, 20, 100) shape
    • f.iline[0:5, :]:
      • segyio returns a generator that yields 15 2D numpy arrays of (20, 100) shape
      • tiledb-segy returns a 4D numpy array of (5, 3, 20, 100) shape
  • The mappings returned by bin, header and attributes(name) have string keys instead of segyio.TraceField enums or integers.

  • tiledb.segy.open(dir_path), the segyio.open(file_path) equivalent, does not take any optional parameters (e.g. strict or ignore_geometry).

  • Unstructured and structured SEG-Y are represented as instances of two different classes, tiledb.segy.Segy and tiledb.segy.StructuredSegy respectively.

    • StructuredSegy extends Segy, so the whole unstructured API is inherited by the structured.
    • All attributes and addressing modes specific to structured files (e.g. ilines or gather) are available only on StructuredSegy. In contrast, segyio returns None or raises an exception if these properties are accessed on unstructured files.
    • segyio.tools.dt is exposed as the Segy.dt(fallback=4000.0) method.
    • segyio.tools.cube is exposed as the StructuredSegy.cube() method.
    • There is no unstructured attribute; use not isinstance(f, StructuredSegy) instead.
  • There is no tracecount attribute; use len(trace) instead.

  • There is no ext_headers attribute; use len(text[1:]) instead.
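
The first difference in the list above (generator of 2D arrays versus one higher-dimensional array) can be illustrated with plain numpy. This sketch does not call segyio or tiledb-segy: the functions segyio_style and tiledb_segy_style and the in-memory layout of data are assumptions that only mimic the shapes described above:

```python
import numpy as np

# Synthetic stand-in for a survey: 50 ilines, 3 offsets, 20 xlines, 100 samples.
N_ILINES, N_OFFSETS, N_XLINES, N_SAMPLES = 50, 3, 20, 100
data = np.zeros((N_ILINES, N_OFFSETS, N_XLINES, N_SAMPLES), dtype=np.float32)


def segyio_style(iline_slice, offset_slice):
    """Mimic segyio: lazily yield one (xlines, samples) array per line and offset."""
    for i in range(*iline_slice.indices(N_ILINES)):
        for o in range(*offset_slice.indices(N_OFFSETS)):
            yield data[i, o]


def tiledb_segy_style(iline_slice, offset_slice=None):
    """Mimic tiledb-segy: return a single array covering the whole selection."""
    if offset_slice is None:
        # Like f.iline[0:5]: first offset only, so the offset axis is dropped.
        return data[iline_slice, 0]
    return data[iline_slice, offset_slice]


first_offset = list(segyio_style(slice(0, 5), slice(0, 1)))
all_offsets = list(segyio_style(slice(0, 5), slice(None)))

print(len(first_offset), first_offset[0].shape)             # 5 (20, 100)
print(tiledb_segy_style(slice(0, 5)).shape)                 # (5, 20, 100)
print(len(all_offsets), all_offsets[0].shape)               # 15 (20, 100)
print(tiledb_segy_style(slice(0, 5), slice(None)).shape)    # (5, 3, 20, 100)
```

Code ported from segyio that iterates over the generator keeps working on the leading axis of the returned array, while code that needs the whole block no longer has to stack the pieces itself.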
