pandas-streaming: streaming dataframes with a subset of the pandas API

Project description

MIT License

pandas-streaming processes files with pandas that are too big to hold in memory yet too small for parallelization to bring a significant gain. The module replicates a subset of the pandas API and implements additional functionality for machine learning.

from pandas_streaming.df import StreamingDataFrame
sdf = StreamingDataFrame.read_csv("filename", sep="\t", encoding="utf-8")

for df in sdf:
    # process this chunk of data
    # df is a dataframe
    print(df)
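
A common use of the loop above is an aggregate computed one chunk at a time. The sketch below keeps a running sum and a row count to obtain a column mean without loading everything at once. It uses plain pandas `read_csv` with `chunksize` on an in-memory buffer so that it runs standalone. The column name `x` and the data are made up, and it assumes the same loop body applies to the chunks yielded by a `StreamingDataFrame`.

```python
import io
import pandas

# Hypothetical data standing in for a large tab-separated file.
csv_text = "x\ty\n1\ta\n2\tb\n3\tc\n4\td\n5\te\n"

total = 0.0
count = 0
# With chunksize set, read_csv returns an iterator of dataframes.
for chunk in pandas.read_csv(io.StringIO(csv_text), sep="\t", chunksize=2):
    total += chunk["x"].sum()
    count += len(chunk)

mean_x = total / count
print(mean_x)
```

Only the running totals stay in memory between iterations, so the peak memory use depends on the chunk size rather than on the file size.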

The module can also stream an existing dataframe.

import pandas
from pandas_streaming.df import StreamingDataFrame

# build a small in-memory dataframe
df = pandas.DataFrame([dict(cf=0, cint=0, cstr="0"),
                       dict(cf=1, cint=1, cstr="1"),
                       dict(cf=3, cint=3, cstr="3")])

# wrap it so that it can be consumed chunk by chunk
sdf = StreamingDataFrame.read_df(df)

for chunk in sdf:
    # process this chunk of data
    # chunk is a dataframe
    print(chunk)
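
Conceptually, streaming an existing dataframe means walking over it in consecutive row slices. The sketch below illustrates that idea in plain pandas. The helper `iter_chunks` and its `chunksize` argument are hypothetical names introduced here, not part of pandas-streaming, and the library's own implementation may differ.

```python
import pandas


def iter_chunks(df, chunksize):
    """Yield consecutive row slices of df holding at most chunksize rows."""
    for start in range(0, len(df), chunksize):
        yield df.iloc[start:start + chunksize]


data = pandas.DataFrame({"cf": [0, 1, 3],
                         "cint": [0, 1, 3],
                         "cstr": ["0", "1", "3"]})

# Three rows cut into slices of two rows: one full chunk, one partial chunk.
sizes = [len(chunk) for chunk in iter_chunks(data, 2)]
print(sizes)
```

Concatenating the slices in order gives back the original dataframe, which is a convenient sanity check when writing chunk-wise code.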

It also contains helpers to split datasets into train and test sets under non-standard constraints.
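
The text does not say which constraints are meant. As one possible illustration, the sketch below splits a dataframe so that all rows sharing a group key end up on the same side. This is an assumption chosen for the example: `split_by_group`, the `user` column, and the data are hypothetical, and the code is written in plain pandas rather than with the module's own helpers.

```python
import random
import pandas


def split_by_group(df, group_col, test_ratio=0.5, seed=0):
    """Split df so that rows sharing a value of group_col stay together."""
    groups = sorted(df[group_col].unique())
    rng = random.Random(seed)
    rng.shuffle(groups)
    # Send at least one whole group to the test side.
    n_test = max(1, int(len(groups) * test_ratio))
    in_test = df[group_col].isin(groups[:n_test])
    return df[~in_test], df[in_test]


data = pandas.DataFrame({"user": ["a", "a", "b", "b", "c", "d"],
                         "value": [1, 2, 3, 4, 5, 6]})
train, test = split_by_group(data, "user")
print(sorted(train["user"].unique()), sorted(test["user"].unique()))
```

Because the split is decided per group rather than per row, the actual share of rows on the test side can deviate from `test_ratio` when groups have uneven sizes.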

Download files


Source Distribution

pandas-streaming-0.5.0.tar.gz (34.1 kB)

Uploaded Source

Built Distribution

pandas_streaming-0.5.0-py3-none-any.whl (36.6 kB)

Uploaded Python 3

File details

Details for the file pandas-streaming-0.5.0.tar.gz.

File metadata

  • Download URL: pandas-streaming-0.5.0.tar.gz
  • Upload date:
  • Size: 34.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for pandas-streaming-0.5.0.tar.gz
Algorithm Hash digest
SHA256 5693cd930d0b833aef5d2aa7873528a8fbe60b2f4575fe65499a2a05fc57381f
MD5 b0428843b387193bd50e7b5f40eacfbe
BLAKE2b-256 21f328a70d24df490849b5c4c93deacb3fb6674e928834a63f86edb05e071e5b


File details

Details for the file pandas_streaming-0.5.0-py3-none-any.whl.

File hashes

Hashes for pandas_streaming-0.5.0-py3-none-any.whl
Algorithm Hash digest
SHA256 a6ded7b7cc8f87a45e63c581bdc796fd37981182dbf3229b74e80b20385c5ba6
MD5 ea4f7fb97a23cfd455bfe3a8e0703a0a
BLAKE2b-256 0ae2fd3184612f13a4acbc1daf661a544118806a1b640b7561ba18a7928f243c
