dataframer

dataframer tries to load any file into a pandas DataFrame, with a minimum of configuration and a focus on bioinformatics.

Examples

Typically, you’ll read a file from disk (open('my-file.txt', 'rb')), but a byte stream is simpler here.

>>> from io import BytesIO
>>> from dataframer import dataframer
>>> from pandas import set_option

>>> set_option('display.max_columns', None)

>>> data = b'a,b,c,z\n1,2,3,foo\n4,5,6,bar'  # avoid shadowing the built-in `bytes`
>>> stream = BytesIO(data)

By default, the first column becomes the index, and any later column containing non-numeric values is dropped.

>>> df_info = dataframer.parse(stream)
>>> df_info.data_frame
   b  c
a      
1  2  3
4  5  6
>>> df_info.label_map is None
True
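
To make the default filtering rule concrete, here is a minimal standard-library sketch of the idea. It is not dataframer's implementation: the helper names `is_number` and `numeric_columns` are hypothetical, and treating a column as numeric when every value parses with `float()` is an assumption made for illustration.

```python
import csv
from io import StringIO


def is_number(text):
    """Return True if the text parses as a float (assumed definition of 'numeric')."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def numeric_columns(csv_text):
    """Keep column 0 plus every later column whose values are all numeric."""
    rows = list(csv.reader(StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    keep = [0] + [
        i for i in range(1, len(header))
        if all(is_number(row[i]) for row in body)
    ]
    # Rebuild every row (header included) from the kept column positions.
    return [[row[i] for i in keep] for row in rows]


result = numeric_columns('a,b,c,z\n1,2,3,foo\n4,5,6,bar')
print(result)
```

With the sample data, column `z` holds `foo` and `bar`, so only `a`, `b`, and `c` survive, which matches the DataFrame shown above.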

Alternatively, they can be preserved in place...

>>> df_info = dataframer.parse(stream, keep_strings=True)
>>> df_info.data_frame
   b  c    z
a           
1  2  3  foo
4  5  6  bar
>>> df_info.label_map is None
True

... or they can be used to compose more meaningful row labels.

>>> df_info = dataframer.parse(stream, relabel=True)
>>> df_info.data_frame
   b  c
a      
1  2  3
4  5  6
>>> df_info.label_map
{1: 'foo / 1', 4: 'bar / 4'}
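
One plausible use of the label map is to rename the index of the numeric DataFrame. The sketch below uses only pandas and rebuilds the frame and the map by hand from the output shown above, so it runs without dataframer; the variable names are illustrative.

```python
import pandas as pd

# Hand-built copies of the values shown above (not produced by dataframer here).
data_frame = pd.DataFrame(
    {'b': [2, 5], 'c': [3, 6]},
    index=pd.Index([1, 4], name='a'),
)
label_map = {1: 'foo / 1', 4: 'bar / 4'}

# DataFrame.rename maps each index value through the dict and
# leaves the numeric data untouched.
labeled = data_frame.rename(index=label_map)
print(labeled)
```

Keeping the labels in a separate map means the DataFrame itself stays purely numeric, and the readable labels can be applied only where they are needed, such as for display.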

Finally, the first column can also be treated as data.

>>> df_info = dataframer.parse(stream, col_zero_index=False)
>>> df_info.data_frame
   a  b  c
0  1  2  3
1  4  5  6
>>> df_info.label_map is None
True

Release process

In your branch, update VERSION.txt using semantic versioning. When the PR is merged, the successful Docker build will push a new version to PyPI.
