Frictionless is a data framework

Development Status
- 4 - Beta
Environment
- Console
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
Topic
- Software Development :: Libraries :: Python Modules

Project description

frictionless-py

Frictionless is a framework to describe, extract, validate, and transform tabular data. It supports many data sources and formats and provides integrations with popular platforms. The framework is powered by the lightweight yet comprehensive Frictionless Data Specifications.

Features

Describe your data: You can infer, edit, and save metadata for your data tables. It's a first step toward ensuring data quality and usability. Frictionless metadata includes general information about your data, such as a textual description, as well as field types and other tabular data details.
Extract your data: You can read your data using a unified tabular interface. Data quality and consistency are guaranteed by a schema. Frictionless supports various file protocols, such as HTTP, FTP, and S3, and data formats, such as CSV, XLS, JSON, SQL, and others.
Validate your data: You can validate data tables, resources, and datasets. Frictionless generates a unified validation report and supports many options to customize the validation process.
Transform your data: You can clean, reshape, and transfer your data tables and datasets. Frictionless provides a pipeline capability and a lower-level interface to work with the data.

powerful command-line interface
low memory consumption for data of any size
reasonable performance on big data
support for compressed files
custom checks and formats
fully pluggable architecture
an included API server
more than 1000 tests

Installation

Versioning follows the SemVer Standard

$ pip install frictionless
$ pip install frictionless[sql]  # to install a core plugin

By default, the framework comes with support for the CSV, Excel, and JSON formats. Use the second command above to add support for SQL, Pandas, HTML, and others. Usually, you don't need to think about it in advance: when a plugin is missing, frictionless shows a helpful error with installation instructions.

Usage

The framework can be used:

as a Python library
as a command-line interface
as a RESTful API server

For instance, the three calls below all do the same thing:

from frictionless import extract

extract('data/table.csv')
# CLI: $ frictionless extract data/table.csv
# API: [POST] /extract {"source": "data/table.csv"}

These interfaces are kept as close as possible to each other in naming and in the way you interact with them. It's usually straightforward to translate, for example, Python code into a command-line call. Frictionless provides code completion for Python and the command line, which should help you get useful hints in real time.

Arguments follow this naming rule:

for Python interfaces, they are lowercased, e.g. missing_values
within dictionaries or JSON objects they are camel-cased, e.g. missingValues
in a command line they use dashes, e.g. --missing-values
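The naming rule above can be illustrated with a small standard-library sketch. The helper names `to_camel_case` and `to_cli_flag` are hypothetical and are not part of frictionless; they only show how one Python argument name maps to the other two spellings.

```python
def to_camel_case(name):
    """missing_values -> missingValues (the spelling used in dictionaries and JSON)."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_cli_flag(name):
    """missing_values -> --missing-values (the spelling used on the command line)."""
    return "--" + name.replace("_", "-")


python_name = "missing_values"
print(to_camel_case(python_name))  # missingValues
print(to_cli_flag(python_name))    # --missing-values
```

Both helpers assume the input is a lowercase, underscore-separated Python argument name.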

To get documentation for a command-line interface just use the --help flag:

$ frictionless --help
$ frictionless describe --help
$ frictionless extract --help
$ frictionless validate --help
$ frictionless transform --help

Example

All the examples use the data folder from this repository

We will take a very dirty data file:

$ cat data/invalid.csv
id,name,,name
1,english
1,english

2,german,1,2,3

First of all, let's infer the metadata. We can save and edit it to provide useful information about the table:

$ frictionless describe data/invalid.csv
[metadata] data/invalid.csv

bytes: 50
compression: 'no'
compressionPath: ''
dialect: {}
encoding: utf-8
format: csv
hash: 8c73c3d9d59088dcb2508e0b348bf8a8
hashing: md5
name: invalid
path: data/invalid.csv
profile: tabular-data-resource
rows: 4
schema:
  fields:
    - name: id
      type: integer
    - name: name
      type: string
    - name: field3
      type: integer
    - name: name2
      type: integer
scheme: file
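In the inferred schema, the blank third header becomes field3 and the repeated name header becomes name2. The sketch below reproduces that naming for this one header row using plain Python. It is a guess at the rule based only on the output above, not the framework's actual implementation, and `name_fields` is a hypothetical helper.

```python
def name_fields(header):
    """Give every column a non-empty name, numbering blanks and repeats."""
    names = []
    seen = {}  # how many times each non-blank label has appeared so far
    for position, label in enumerate(header, start=1):
        label = label.strip()
        if not label:
            # Blank header: fall back to the 1-based column position.
            name = f"field{position}"
        else:
            seen[label] = seen.get(label, 0) + 1
            count = seen[label]
            # Second and later occurrences get the occurrence number appended.
            name = label if count == 1 else f"{label}{count}"
        names.append(name)
    return names


print(name_fields(["id", "name", "", "name"]))
# ['id', 'name', 'field3', 'name2']
```

This sketch does not guard against collisions, for example a file that already has a column called name2.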

Secondly, we can extract the cleaned data. It conforms to the schema inferred above: for example, the dimensions are fixed, and bad cells are omitted:

$ frictionless extract data/invalid.csv
[data] data/invalid.csv

  id  name       field3    name2
----  -------  --------  -------
   1  english
   1  english

   2  german          1        2
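To make "the dimensions are fixed" concrete, here is a minimal sketch using only the standard csv module. It pads short rows with None and drops cells beyond the header width, which matches the shape of the output above. It is an illustration under those assumptions and not how frictionless works internally; `fit_rows` is a hypothetical name, and no type casting is attempted.

```python
import csv
import io

# The contents of data/invalid.csv as shown earlier.
RAW = "id,name,,name\n1,english\n1,english\n\n2,german,1,2,3\n"


def fit_rows(text):
    """Yield data rows padded or truncated to the width of the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    width = len(header)
    for cells in reader:
        cells = cells[:width]                    # drop extra cells
        cells += [None] * (width - len(cells))   # fill missing cells
        yield cells


rows = list(fit_rows(RAW))
for row in rows:
    print(row)
```

Every yielded row has exactly four cells, including the completely blank row, which becomes four None values.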

Last but not least, let's get a validation report. This report will help us to fix all these errors as comprehensive information is provided for every tabular problem:

$ frictionless validate data/invalid.csv
[invalid] data/invalid.csv

row    field    code              message
-----  -------  ----------------  ------------------------------------------------------------------------------------------------
-      3        blank-header      Header in field at position "3" is blank
-      4        duplicate-header  Header "name" in field at position "4" is duplicated to header in another field: at position "2"
2      3        missing-cell      Row at position "2" has a missing cell in field "field3" at position "3"
2      4        missing-cell      Row at position "2" has a missing cell in field "name2" at position "4"
3      3        missing-cell      Row at position "3" has a missing cell in field "field3" at position "3"
3      4        missing-cell      Row at position "3" has a missing cell in field "name2" at position "4"
4      -        blank-row         Row at position "4" is completely blank
5      5        extra-cell        Row at position "5" has an extra value in field at position "5"
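The structural checks in this report can be approximated in a few lines of standard-library Python. The sketch below is not the frictionless validator; it only mirrors the five error codes that appear in the report above, and `check_table` is a hypothetical helper. Row numbers count the header as row 1, and field positions are 1-based, as in the report.

```python
import csv
import io

# The contents of data/invalid.csv as shown earlier.
RAW = "id,name,,name\n1,english\n1,english\n\n2,german,1,2,3\n"


def check_table(text):
    """Return a list of (row, field, code) tuples; None stands for '-'."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    errors = []

    # Header checks: blank labels and labels already used by an earlier field.
    seen = set()
    for position, label in enumerate(header, start=1):
        if not label:
            errors.append((None, position, "blank-header"))
        elif label in seen:
            errors.append((None, position, "duplicate-header"))
        seen.add(label)

    # Row checks: compare each row's length with the header width.
    width = len(header)
    for row_number, cells in enumerate(reader, start=2):
        if not any(cells):
            errors.append((row_number, None, "blank-row"))
            continue
        for position in range(len(cells) + 1, width + 1):
            errors.append((row_number, position, "missing-cell"))
        for position in range(width + 1, len(cells) + 1):
            errors.append((row_number, position, "extra-cell"))
    return errors


errors = check_table(RAW)
for row, field, code in errors:
    print(row or "-", field or "-", code)
```

For the sample file this yields the same eight row, field, and code combinations as the report, in the same order.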

Now having all this information:

we can clean up the table to ensure the data quality
we can use the metadata to describe and share the dataset
we can include the validation into our workflow to guarantee validity
and much more: don't hesitate and read the documentation below!

Documentation

This readme gives a high-level overview of the framework. Detailed documentation is available, and here is a table of contents:



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

frictionless-0.4.4.tar.gz (116.6 kB view hashes)

Uploaded Jul 29, 2020 Source

Built Distribution

frictionless-0.4.4-py2.py3-none-any.whl (157.2 kB view hashes)

Uploaded Jul 29, 2020 Python 2 Python 3

Hashes for frictionless-0.4.4.tar.gz
Algorithm	Hash digest
SHA256	`54200971ff6f97a4ff909da7412f128615f52eda3182cc561f98a07c8335fb82`
MD5	`f03c505b84c8c15f445fb0467d2e9e59`
BLAKE2b-256	`0f1d0592d382ec9efad9796ee605b980510dc8a50d293d3a5305acef7489633e`

Hashes for frictionless-0.4.4-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`c1c90b96800d25937acf33d217ee23022e2ea3a6808a964e5bae2aecd7f2d992`
MD5	`1176415910c309e18b06b55b24d9a610`
BLAKE2b-256	`e930016b9404acb76b3bea2297910e3b52640bdc583ab93e7ee381edad248c5f`