Python library for loading and working with sound datasets.

Project description

soundata

Python library for downloading, loading, and working with sound datasets. The API documentation is available online.
Inspired by and based on mirdata. Source code: https://github.com/soundata/soundata

This library provides tools for working with common sound datasets, including tools for:

  • Downloading datasets to a common location and format
  • Validating that the files for a dataset are all present
  • Loading annotation files to a common format
  • Parsing clip-level metadata for detailed evaluations

See the documentation for the list of currently supported datasets.

Installation

To install, simply run:

pip install soundata

Quick example

import soundata

dataset = soundata.initialize('urbansound8k')
dataset.download()  # download the dataset
dataset.validate()  # validate that all the expected files are there

example_clip = dataset.choice_clip()  # choose a random example clip
print(example_clip)  # see the available data

See the documentation for more examples and the API reference.
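The clip-level metadata mentioned earlier can drive per-split evaluations. The following is a minimal sketch, assuming that `dataset.load_clips()` returns a mapping from clip IDs to clip objects and that UrbanSound8K clips expose a `fold` attribute. `group_clip_ids_by` is a hypothetical helper for illustration, not part of soundata, and the stand-in clips exist only so the sketch runs without downloading anything.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_clip_ids_by(clips, attribute):
    """Group clip IDs by the value of a clip-level metadata attribute.

    `clips` is any mapping of clip_id -> clip object. Clips that lack
    the attribute are grouped under the key None.
    """
    groups = defaultdict(list)
    for clip_id, clip in clips.items():
        groups[getattr(clip, attribute, None)].append(clip_id)
    return dict(groups)


# Stand-in clips (hypothetical data, not taken from any real dataset).
demo_clips = {
    "a": SimpleNamespace(fold=1),
    "b": SimpleNamespace(fold=2),
    "c": SimpleNamespace(fold=1),
}
demo_groups = group_clip_ids_by(demo_clips, "fold")
print(demo_groups)
```

With a downloaded dataset, the same helper could be called as `group_clip_ids_by(dataset.load_clips(), "fold")`, assuming the loader in use exposes that attribute.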

Citing

@misc{fuentes_salamon2021soundata,
      title={Soundata: A Python library for reproducible use of audio datasets}, 
      author={Magdalena Fuentes and Justin Salamon and Pablo Zinemanas and Martín Rocamora and 
      Genís Plaja and Irán R. Román and Marius Miron and Xavier Serra and Juan Pablo Bello},
      year={2021},
      eprint={2109.12690},
      archivePrefix={arXiv},
      primaryClass={cs.SD}
}

When working with datasets, please cite the version of soundata that you are using and also cite the dataset itself. Each dataset loader provides its reference through the cite() method.
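To report the soundata version alongside a citation, the installed version can be read from package metadata. This is a minimal sketch using only the standard library; `installed_version` and `citation_note` are hypothetical helper names, not part of soundata.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def citation_note(package="soundata"):
    """Build a short note stating which version of `package` is in use."""
    found = installed_version(package)
    if found is None:
        return f"{package} is not installed"
    return f"{package} version {found}"


print(citation_note())
```

The dataset's own reference comes from the loader's `cite()` method, for example `soundata.initialize('urbansound8k').cite()`.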

Contributing a new dataset loader

We welcome and encourage contributions to this library, especially new dataset loaders. Please see the contributing guide for guidelines.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

soundata-0.1.0.tar.gz (5.3 MB)

Uploaded Source

Built Distribution

soundata-0.1.0-py3-none-any.whl (5.5 MB)

Uploaded Python 3

File details

Details for the file soundata-0.1.0.tar.gz.

File metadata

  • Download URL: soundata-0.1.0.tar.gz
  • Size: 5.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.10.0

File hashes

Hashes for soundata-0.1.0.tar.gz
Algorithm Hash digest
SHA256 ae0744f942552b49d40ebcf91e2c471d80af1c31a933af6e66d27b83f476f4d6
MD5 67ed27714ae635bda91983e533e6d735
BLAKE2b-256 2881e84fbb524aa484d5a2696109668a37f7ae6fb75f0df5d3681ca5881d128b

File details

Details for the file soundata-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: soundata-0.1.0-py3-none-any.whl
  • Size: 5.5 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.10.0

File hashes

Hashes for soundata-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e9f9866d29c24c93ef7c9a92401633f22d57660f6de888257e6055a5aa9e6621
MD5 7ff1fc6d672555d8b1aa9fae8fc9bf60
BLAKE2b-256 3c2795c14af2dd8cb4c1e426eb74653f3089711d980781f07b8c02ee9d1d547c
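A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using the standard library; the helper names `sha256_of` and `matches_published` are hypothetical, and the digests are copied from the tables above.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file hash tables above.
PUBLISHED_SHA256 = {
    "soundata-0.1.0.tar.gz":
        "ae0744f942552b49d40ebcf91e2c471d80af1c31a933af6e66d27b83f476f4d6",
    "soundata-0.1.0-py3-none-any.whl":
        "e9f9866d29c24c93ef7c9a92401633f22d57660f6de888257e6055a5aa9e6621",
}


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """Return True only if the file name is known and its digest matches."""
    expected = PUBLISHED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published("soundata-0.1.0.tar.gz")` would be expected to return True for an intact copy of the source distribution in the current directory.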
