audformat

Python implementation of audformat

These details have not been verified by PyPI

Project links

Development Status
- 5 - Production/Stable
Intended Audience
- Developers
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
Topic
- Scientific/Engineering

Project description

Specification and reference implementation of audformat.

audformat stores media data, such as audio or video, together with corresponding annotations in a pre-defined way. This makes it easy to combine or replace databases in machine learning projects.

An audformat database is a folder that contains media files together with a header YAML file and one or several files storing the annotations. The database is represented as an audformat.Database object and can be loaded with audformat.Database.load() or written to disk with audformat.Database.save().

Have a look at the installation and usage instructions and the format specifications as a starting point.

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Version 0.15.1 (2022-08-11)

Added: audformat.Scheme.uses_table to indicate if the scheme uses a misc table to store its labels
Added: usage example to docstring of audfromat.utils.to_segmented_index()
Changed: forbid nesting of misc tables as scheme labels
Fixed: support for pd.Index and pd.Series in audformat.utils.to_filewise_index()
Fixed: description of audformat.Schemes.labels in API documentation

Version 0.15.0 (2022-08-05)

Added: audformat.MiscTable which can store data not associated with media files
Added: store scheme labels in a misc table
Added: dictionary audformat.Database.misc_tables holding misc tables of a database
Added: audformat.utils.difference() for finding index entries that are only part of a single index for a given sequence of indices
Added: audformat.utils.is_index_alike() for checking if a sequence of indices has the same number of levels, level names, and matching dtypes
Added: audformat.define.DataType.OBJECT
Added: audformat.utils.set_index_dtypes() to change dtypes of an index
Added: audformat.testing.add_misc_table()
Added: audformat.Database.__iter__ iterates through all (misc) tables, e.g. a user can do list(db) to get a list of all (misc) tables
Changed: audformat.Database.update() can now join schemes with different labels
Changed: audformat.utils.union(), audformat.utils.intersect(), and audformat.utils.concat() now support any kind of index
Changed: audformat.utils.intersect() no longer removes segments from a segmented index that are contained in a filewise index
Changed: require pandas>=1.4.1
Changed: use pandas dtype 'string' instead of 'object' for storing audformat dtype 'str' entries
Changed: use a misc table to store the 'speaker' scheme labels in the emodb example in the documentation
Changed: audformat.utils.join_labels() raises ValueError if labels are of different dtype
Fixed: ensure column IDs are different from index level names
Fixed: make sure audformat.Column.set() converts data to dtype of scheme before checking if values are in min-max-range of scheme
Fixed: links to pandas API in the documentation
Fixed: include methods to_dict(), from_dict(), dump(), and attributes description, meta in the documentation for the classes audformat.Column, audformat.Database, audformat.Media, audformat.Rater, audformat.Scheme, audformat.Split, audformat.Table
Fixed: type hint of argument dtype in the documentation of audformat.Scheme
Removed: support for Python 3.7

Version 0.14.3 (2022-06-01)

Added: audformat.utils.map_country()
Changed: improve speed of audformat.Table.drop_files() for segmented tables

Version 0.14.2 (2022-04-29)

Added: audformat.utils.index_has_overlap()
Added: audformat.utils.iter_index_by_file()
Changed: store categories with integers as int64 instead of Int64
Changed: require audeer>=1.18.0
Changed: support pandas>=1.4.0

Version 0.14.1 (2022-03-03)

Added: audformat.utils.map_file_path()

Version 0.14.0 (2022-02-24)

Changed: ensure audformat.testing.create_database() uses Unix path separators
Changed: don’t allow \ path entries in a portable database
Changed: mark deprecated root argument of audformat.testing.create_audio_files() to be removed in version 1.0.0

Version 0.13.3 (2022-02-07)

Fixed: conversion of pickle protocol 5 files to pickle protocol 4 in cache

Version 0.13.2 (2022-01-27)

Fixed: reintroduce sorting the output of audformat.Database.files and audformat.Database.segments

Version 0.13.1 (2022-01-26)

Fixed: changelog for 0.13.0

Version 0.13.0 (2022-01-26)

Changed: audformat.utils.union() no longer sorts levels
Changed: audformat.Table.save() forces pickle format 4
Changed: clean up test requirements
Changed: require pandas < 1.4.0

Version 0.12.4 (2022-01-12)

Changed: the API documentation on the language argument of audformat.Database is more verbose now
Changed: the difference between audformat.define.DataType.TIME and audformat.define.DataType.DATE is now discussed in the API documentation
Fixed: saving a not loaded table to CSV when a PKL file is present
Fixed: pandas deprecation warnings

Version 0.12.3 (2022-01-03)

Removed: Python 3.6 support

Version 0.12.2 (2021-11-18)

Added: audformat.assert_no_duplicates()
Changed: audformat.assert_index() no longer checks for duplicates

Version 0.12.1 (2021-11-17)

Added: audformat.utils.hash()
Added: audformat.utils.expand_file_path()
Added: audformat.utils.replace_file_extension()
Changed: use yaml.CLoader for faster header reading

Version 0.12.0 (2021-11-10)

Added: as_segmented, allow_nat, root, num_workers arguments to audformat.Table.get()
Added: as_segmented, allow_nat, root, num_workers arguments to audformat.Column.get()
Added: files_duration argument to audformat.utils.to_segmented_index()
Added: audformat.Database.files_duration()
Changed: changed default value of load_data argument in audformat.Database.load() to False
Changed: speed up audformat.Database.files and audformat.Database.segments
Fixed: re-add support for pandas>=1.3

Version 0.11.6 (2021-08-20)

Added: support for Python 3.9
Fixed: speed up audformat.utils.union()
Fixed: audformat.Column.set() with pd.Series and np.array for a scheme with fixed labels and containing NaN values

Version 0.11.5 (2021-08-09)

Removed: duration scheme and column from conventions and emodb example

Version 0.11.4 (2021-08-05)

Added: custom BadKeyError when key is not found
Changed: limit to pandas <1.3 until it works again for newer pandas versions
Changed: remove the <1.0.0 limit for audiofile as a stable release is available and the API has not changed

Version 0.11.3 (2021-06-10)

Added: audformat.utils.duration
Fixed: description of audformat.Database.is_portable in documentation

Version 0.11.2 (2021-05-12)

Added: audformat.utils.join_schemes

Version 0.11.1 (2021-05-11)

Added: Database.is_portable
Added: copy_media argument to Database.update()
Changed: remove root argument from testing.create_audio_files() and instead use Database.root
Fixed: utils.concat() converts to nullable dtype
Fixed: utils.concat() returns DataFrame if input contains at least one DataFrame

Version 0.11.0 (2021-05-06)

Note: tables stored from this version upwards cannot be loaded with older versions

Added: Database.root
Added: utils.join_labels()
Added: Scheme.replace_labels()
Changed: set dependency to pandas>=1.1.5
Changed: do not compress pickled table files

Version 0.10.2 (2021-04-22)

Changed: allow_nat argument to utils.to_segmented_index()

Version 0.10.1 (2021-03-31)

Fixed: audformat.assert_index() checks for correct dtypes

Version 0.10.0 (2021-03-18)

Added: audformat.Database.update()
Added: audformat.Table.update()
Added: overwrite argument to audformat.utils.concat()
Changed: result of audformat.Table.__add__() is no longer assigned to a audformat.Database

Version 0.9.8 (2021-02-23)

Added: audformat.Database.license
Added: audformat.Database.license_url
Added: audformat.Database.author
Added: audformat.Database.organization
Added: audformat.utils.intersect() for index objects
Added: audformat.utils.union() for index objects
Changed: Database.load() raises error if table file missing
Changed: forbid duplicates in audformat conform indices
Fixed: audformat.Table.__add__() returned wrong values for some index combinations

Version 0.9.7 (2021-02-01)

Added: update_other_formats argument to audformat.Table.save() to make sure existing files in other formats are updated as well
Changed: use round_trip argument when loading CSV files to ensure dataframes are equal after storing and loading again

Version 0.9.6 (2021-01-28)

Fixed: implemented audformat.Database.__eq__ and return True for identical databases

Version 0.9.5 (2021-01-14)

Changed: use nullable Pandas’ type 'boolean' for bool schemes
Fixed: Scheme.draw() generates boolean values if scheme is bool

Version 0.9.4 (2021-01-11)

Changed: add arguments num_workers and verbose to audformat.Database.load()

Version 0.9.3 (2021-01-07)

Fixed: avoid sphinx syntax in CHANGELOG

Version 0.9.2 (2021-01-07)

Changed: add arguments num_workers and verbose to audformat.Database.drop_files(), audformat.Database.map_files(), audformat.Database.pick_files(), audformat.Database.save()
Changed: audformat.segmented_index() support int and float, which will be interpreted as seconds
Fixed: audformat.utils.to_segmented_index() returns correct index type for NaT

Version 0.9.1 (2020-12-21)

Fixed: add column name to HTML Series output in docs
Fixed: removed mentioning of NotConformToUnifiedFormat error and RedundantArgumentError error
Fixed: add missing errors to docstring of audformat.Table.set() and audformat.Column.set()

Version 0.9.0 (2020-12-18)

Added: initial release public release

Project details

These details have not been verified by PyPI

Project links

Development Status
- 5 - Production/Stable
Intended Audience
- Developers
- Science/Research
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
Topic
- Scientific/Engineering

Release history Release notifications | RSS feed

1.3.1

Sep 16, 2024

1.3.0

Jul 18, 2024

1.2.0

Jun 25, 2024

1.1.4

May 15, 2024

1.1.3

Apr 26, 2024

1.1.2

Feb 2, 2024

1.1.1

Jan 25, 2024

1.1.0

Nov 30, 2023

1.0.3

Oct 11, 2023

1.0.2

Oct 9, 2023

1.0.1

May 8, 2023

1.0.0

Apr 27, 2023

0.16.1

Mar 29, 2023

0.16.0

Jan 12, 2023

0.15.4

Nov 1, 2022

0.15.3

Sep 19, 2022

0.15.2

Aug 17, 2022

This version

0.15.1

Aug 11, 2022

0.15.0

Aug 5, 2022

0.14.3

Jun 1, 2022

0.14.2

May 2, 2022

0.14.1

Mar 3, 2022

0.14.0

Feb 24, 2022

0.13.3

Feb 7, 2022

0.13.2

Jan 27, 2022

0.13.1

Jan 26, 2022

0.13.0

Jan 26, 2022

0.12.4

Jan 14, 2022

0.12.3

Jan 3, 2022

0.12.2

Nov 18, 2021

0.12.1

Nov 17, 2021

0.12.0

Nov 10, 2021

0.11.6

Aug 20, 2021

0.11.5

Aug 9, 2021

0.11.4

Aug 5, 2021

0.11.3

Jun 10, 2021

0.11.2

May 12, 2021

0.11.1

May 11, 2021

0.11.0

May 6, 2021

0.10.2

Apr 22, 2021

0.10.1

Mar 31, 2021

0.10.0

Mar 18, 2021

0.9.8

Feb 23, 2021

0.9.7

Feb 1, 2021

0.9.6

Jan 28, 2021

0.9.5

Jan 14, 2021

0.9.4

Jan 11, 2021

0.9.3

Jan 7, 2021

0.9.1

Dec 21, 2020

0.9.0

Dec 18, 2020

0.9.0rc3 pre-release

Dec 18, 2020

0.9.0rc2 pre-release

Dec 18, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

audformat-0.15.1.tar.gz (104.5 kB view details)

Uploaded Aug 11, 2022 Source

Built Distribution

audformat-0.15.1-py3-none-any.whl (59.3 kB view details)

Uploaded Aug 11, 2022 Python 3

File details

Details for the file audformat-0.15.1.tar.gz.

File metadata

Download URL: audformat-0.15.1.tar.gz
Upload date: Aug 11, 2022
Size: 104.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.8.13

File hashes

Hashes for audformat-0.15.1.tar.gz
Algorithm	Hash digest
SHA256	`b3119bdc2e18a054025f9bffae64b3a3d75c4063d29102aa98d877272dad752d`
MD5	`6cd6afb3d7db880dff53fd51bb662b01`
BLAKE2b-256	`84b97ba2d49a566526a97faa50457cd8e8d587e602e61c64d81fa532c9fe80d7`

See more details on using hashes here.

File details

Details for the file audformat-0.15.1-py3-none-any.whl.

File metadata

Download URL: audformat-0.15.1-py3-none-any.whl
Upload date: Aug 11, 2022
Size: 59.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.1 CPython/3.8.13

File hashes

Hashes for audformat-0.15.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4344d5b9e39425c1022a63c5c37abcc340ef013b87b6b97859bf38c893370f9f`
MD5	`0880a77e95ba4172dd83a129624c7ec4`
BLAKE2b-256	`874413091302d947d58630812050c8e71ffbdd4f74caa4bd22e08f56f84bf8e7`

See more details on using hashes here.

audformat 0.15.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Changelog

Version 0.15.1 (2022-08-11)

Version 0.15.0 (2022-08-05)

Version 0.14.3 (2022-06-01)

Version 0.14.2 (2022-04-29)

Version 0.14.1 (2022-03-03)

Version 0.14.0 (2022-02-24)

Version 0.13.3 (2022-02-07)

Version 0.13.2 (2022-01-27)

Version 0.13.1 (2022-01-26)

Version 0.13.0 (2022-01-26)

Version 0.12.4 (2022-01-12)

Version 0.12.3 (2022-01-03)

Version 0.12.2 (2021-11-18)

Version 0.12.1 (2021-11-17)

Version 0.12.0 (2021-11-10)

Version 0.11.6 (2021-08-20)

Version 0.11.5 (2021-08-09)

Version 0.11.4 (2021-08-05)

Version 0.11.3 (2021-06-10)

Version 0.11.2 (2021-05-12)

Version 0.11.1 (2021-05-11)

Version 0.11.0 (2021-05-06)

Version 0.10.2 (2021-04-22)

Version 0.10.1 (2021-03-31)

Version 0.10.0 (2021-03-18)

Version 0.9.8 (2021-02-23)

Version 0.9.7 (2021-02-01)

Version 0.9.6 (2021-01-28)

Version 0.9.5 (2021-01-14)

Version 0.9.4 (2021-01-11)

Version 0.9.3 (2021-01-07)

Version 0.9.2 (2021-01-07)

Version 0.9.1 (2020-12-21)

Version 0.9.0 (2020-12-18)

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes