Powerful data structures for data analysis and statistics

Project description

pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. It is already well on its way toward this goal.

pandas is well suited for many different kinds of data:

  • Tabular data with heterogeneously-typed columns, as in an SQL table or Excel spreadsheet

  • Ordered and unordered (not necessarily fixed-frequency) time series data

  • Arbitrary matrix data (homogeneously typed or heterogeneous) with row and column labels

  • Any other form of observational / statistical data sets; the data need not be labeled at all to be placed into a pandas data structure

The two primary data structures of pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides and much more. pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other third-party libraries.
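As a rough illustration of the two structures, the sketch below builds a small Series and a small DataFrame. It assumes a recent pandas release (the 0.4.1 API described on this page may differ in details), and the labels, column names, and values are made up for the example.

```python
import pandas as pd

# A Series is a 1-dimensional array of values with a label for each entry.
prices = pd.Series([101.5, 102.0, 99.8], index=["mon", "tue", "wed"])

# A DataFrame is a 2-dimensional table; each column can hold a different type.
trades = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "AAA"],  # strings
        "qty": [10, 5, 7],                # integers
        "price": [101.5, 20.0, 99.8],     # floats
    }
)

# Values are looked up by label rather than by position.
tuesday_price = prices["tue"]

# Each DataFrame column is itself a Series.
total_qty = trades["qty"].sum()
```

Here `prices`, `trades`, `tuesday_price`, and `total_qty` are illustrative names only.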

Here are just a few of the things that pandas does well:

  • Easy handling of missing data (represented as NaN) in floating point as well as non-floating point data

  • Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects

  • Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations

  • Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data

  • Easy conversion of ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects

  • Intelligent label-based slicing, fancy indexing, and subsetting of large data sets

  • Intuitive merging and joining of data sets

  • Flexible reshaping and pivoting of data sets

  • Hierarchical labeling of axes (possible to have multiple labels per tick)

  • Robust IO tools for loading data from flat files (CSV and delimited), Excel files, databases, and saving / loading data from the ultrafast HDF5 format

  • Time series-specific functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging, etc.
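A few of the features listed above can be sketched in a handful of lines: label alignment, missing-data handling, group by, and date shifting. This is a minimal example that assumes a recent pandas release rather than the 0.4.1 API; all data and variable names are invented for illustration.

```python
import pandas as pd

a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"])
b = pd.Series([10.0, 20.0, 30.0], index=["y", "z", "w"])

# Automatic alignment: values are matched by label, not by position.
# Labels found in only one operand ("x" and "w") produce NaN in the result.
total = a + b

# Missing data can be filled with a default or dropped.
filled = total.fillna(0.0)
present = total.dropna()

# Split-apply-combine: split rows by "group", sum "value" within each group.
df = pd.DataFrame({"group": ["a", "b", "a", "b"], "value": [1, 2, 3, 4]})
sums = df.groupby("group")["value"].sum()

# Time series: generate a daily date range and lag the series by one step.
dates = pd.date_range("2011-01-01", periods=4, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=dates)
lagged = ts.shift(1)  # the first entry becomes NaN
```

The alignment step is the one most likely to surprise newcomers: `a + b` does not add element 0 to element 0, it adds the entries that share a label.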

Many of these features address shortcomings frequently experienced in other languages and scientific research environments. For data scientists, working with data is typically divided into multiple stages: munging and cleaning data, analyzing / modeling it, then organizing the results of the analysis into a form suitable for plotting or tabular display. pandas is intended to support all of these stages.

Note

Windows binaries are built against NumPy 1.6.1.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pandas-0.4.1.tar.gz (1.8 MB)

Uploaded Source

Built Distributions

pandas-0.4.1.win-amd64-py2.7.exe (687.9 kB)

Uploaded Source

pandas-0.4.1.win-amd64-py2.6.exe (687.8 kB)

Uploaded Source

pandas-0.4.1.win32-py2.7.exe (610.3 kB)

Uploaded Source

pandas-0.4.1.win32-py2.6.exe (609.9 kB)

Uploaded Source

pandas-0.4.1.win32-py2.5.exe (475.4 kB)

Uploaded Source

File details

Details for the file pandas-0.4.1.tar.gz.

File metadata

  • Download URL: pandas-0.4.1.tar.gz
  • Upload date:
  • Size: 1.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for pandas-0.4.1.tar.gz
Algorithm Hash digest
SHA256 7208839764454466d07a994358944b55df527839be80db9c89ff28b95f2ef5a1
MD5 8ca309bfc7dcebe4721de144255cecad
BLAKE2b-256 77b2c77fff54030a80dfc0ec8cdca9d5a87820d96d949b205da4b8eb2ac6dcfa

See more details on using hashes here.
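One way to use the published digests is to recompute the SHA256 of a downloaded file and compare it with the value listed above. The sketch below uses only the standard library; `sha256_of` and `matches_published_hash` are hypothetical helper names, and the local file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest listed above for pandas-0.4.1.tar.gz.
EXPECTED_SHA256 = "7208839764454466d07a994358944b55df527839be80db9c89ff28b95f2ef5a1"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 equals the expected digest."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_published_hash("pandas-0.4.1.tar.gz")` would return `True` only if the file in the current directory is byte-for-byte identical to the one these hashes were computed from.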

File details

Details for the file pandas-0.4.1.win-amd64-py2.7.exe.

File hashes

Hashes for pandas-0.4.1.win-amd64-py2.7.exe
Algorithm Hash digest
SHA256 5000722f1d2b7dfc84435525a4a313727c9ed9dba3d1de1164e64e7492780017
MD5 66ed576a1c391028bc3a5cfcfab3969a
BLAKE2b-256 9fedc5a93ffea5943d7e9ccdf25d844ea8577f10e804610eb7b7723af5932938

File details

Details for the file pandas-0.4.1.win-amd64-py2.6.exe.

File hashes

Hashes for pandas-0.4.1.win-amd64-py2.6.exe
Algorithm Hash digest
SHA256 67284d9c31fc1a05fd4a8845d671d4a41e666c185db693104309cb11426c2269
MD5 090ef4228e236ac2341c9d2e3dca7154
BLAKE2b-256 1ece4b431ab30e6f8654e0eb5d7b33f71aef49a19f2e9882884b4cb0ba06fd2b

File details

Details for the file pandas-0.4.1.win32-py2.7.exe.

File hashes

Hashes for pandas-0.4.1.win32-py2.7.exe
Algorithm Hash digest
SHA256 f089533189251e06239bdd2b1922148cf9efa560f3f310055eb6d01245e456bc
MD5 18fdac088b5c58bdadefa8ff51f8f722
BLAKE2b-256 59af5c4edd42e8f8f662a6e1de883a051ea8c87c96524135899ab8741143b2ac

File details

Details for the file pandas-0.4.1.win32-py2.6.exe.

File hashes

Hashes for pandas-0.4.1.win32-py2.6.exe
Algorithm Hash digest
SHA256 4753f77f720c2e5560716ce3ad000c4247fbbcec5d5a5afc9547a9f55657666c
MD5 3ad619f225e3f82bf0627b91f1eaa2cf
BLAKE2b-256 6b303bca7113dac0fe2909e5c85706731194c8393567685040c15fccdbd40e29

File details

Details for the file pandas-0.4.1.win32-py2.5.exe.

File hashes

Hashes for pandas-0.4.1.win32-py2.5.exe
Algorithm Hash digest
SHA256 5bd9f6199aba71db877fb2f1accb86ff0e08bf8016795344b5f7962303ef4854
MD5 097c1c9e416d2c06e50a610f6ffc0b19
BLAKE2b-256 aa8f24af06c50a26442f4a6cdead21fd0d044e5c846d793da5042093cf1f7087
