Statistical computations and models for use with SciPy
What it is
Statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics and estimation and inference for statistical models.
Main Features
linear regression models: Generalized least squares (including weighted least squares and least squares with autoregressive errors), ordinary least squares.
glm: Generalized linear models with support for all of the one-parameter exponential family distributions.
discrete: regression with discrete dependent variables, including Logit, Probit, MNLogit, Poisson, based on maximum likelihood estimators
rlm: Robust linear models with support for several M-estimators.
tsa: models for time series analysis
- univariate time series analysis: AR, ARIMA
- vector autoregressive models, VAR and structural VAR
- descriptive statistics and process models for time series analysis
nonparametric: (Univariate) kernel density estimators
datasets: Datasets to be distributed and used for examples and in testing.
stats: a wide range of statistical tests
- diagnostics and specification tests
- goodness-of-fit and normality tests
- functions for multiple testing
- various additional statistical tests
iolib:
- tools for reading Stata .dta files into numpy arrays
- printing table output to ascii, latex, and html
miscellaneous models
sandbox: statsmodels contains a sandbox folder with code in various stages of development and testing which is not considered “production ready”. This covers, among others, Mixed (repeated measures) Models, GARCH models, generalized method of moments (GMM) estimators, kernel regression, various extensions to scipy.stats.distributions, panel data models, generalized additive models and information theoretic measures.
Where to get it
The master branch on GitHub has the most up-to-date code
Source downloads of release tags are available on GitHub
Binaries and source distributions are available from PyPI
Installation from sources
See INSTALL.txt for requirements or see the documentation
License
Modified BSD (3-clause)
Documentation
The official documentation is hosted on SourceForge
Windows Help
We are providing a Windows htmlhelp file (statsmodels.chm) that is now separately distributed, available at http://sourceforge.net/projects/statsmodels/files/statsmodels-0.4.3/statsmodelsdoc.zip/download
It can be copied or moved to the installation directory of statsmodels (site-packages\statsmodels in a typical installation), and can then be opened from the python interpreter:
>>> import statsmodels.api as sm
>>> sm.open_help()
Discussion and Development
Discussions take place on our mailing list.
We are very interested in feedback about usability and suggestions for improvements.
Bug Reports
Bug reports can be submitted to the issue tracker at
Release History
0.4.3
The only change compared to 0.4.2 is for compatibility with python 3.2.3 (changed behavior of 2to3).
0.4.2
This is a bug-fix release that affects mainly Big-Endian machines.
Bug Fixes
discrete_model.MNLogit: fix summary method
examples in documentation: correct file path
tsa.filters.hp_filter: don’t use umfpack on Big-Endian machine (scipy bug)
the remaining fixes are in the test suite, either precision problems on some machines or incorrect testing on Big-Endian machines.
0.4.1
This is a backwards compatible (according to our test suite) release with bug fixes and code cleanup.
Bug Fixes
build and distribution fixes
lowess correct distance calculation
genmod correction CDFlink derivative
adfuller _autolag correct calculation of optimal lag
het_arch, het_lm : fix autolag and store options
GLSAR: incorrect whitening for lag>1
Other Changes
add lowess and other functions to api and documentation
rename lowess module (old import path will be removed at next release)
new robust sandwich covariance estimators, moved out of sandbox
compatibility with pandas 0.8
new plots in statsmodels.graphics
- ABLine plot
- interaction plot
0.4.0
Main Changes and Additions
Added pandas dependency.
Cython source is built automatically if cython and compiler are present
Support use of dates in timeseries models
Improved plots
- Violin plots
- Bean Plots
- QQ Plots
Added lowess function
Support for pandas Series and DataFrame objects. Results instances return pandas objects if the models are fit using pandas objects.
Full Python 3 compatibility
Fix bugs in genfromdta. Convert Stata .dta format to structured array preserving all types. Conversion is much faster now.
Improved documentation
Models and results are pickleable via save/load, optionally saving the model data.
Kernel Density Estimation now uses Cython and is considerably faster.
Diagnostics for outlier and influence statistics in OLS
Added El Nino Sea Surface Temperatures dataset
Numerous bug fixes
Internal code refactoring
Improved documentation including examples as part of HTML
Changes that break backwards compatibility
Deprecated scikits namespace. The recommended import is now:
import statsmodels.api as sm
The signature of the model.predict methods is now (params, exog, …); previously the methods assumed that the model had been fit and omitted the params argument.
For consistency with other multi-equation models, the parameters of MNLogit are now transposed.
tools.tools.ECDF -> distributions.ECDF
tools.tools.monotone_fn_inverter -> distributions.monotone_fn_inverter
tools.tools.StepFunction -> distributions.StepFunction
0.3.1
Removed academic-only WFS dataset.
Fix easy_install issue on Windows.
0.3.0
Changes that break backwards compatibility
Added api.py for importing. So the new convention for importing is:
import statsmodels.api as sm
Importing from modules directly now avoids unnecessary imports and increases the import speed if a library or user only needs specific functions.
sandbox/output.py -> iolib/table.py
lib/io.py -> iolib/foreign.py (Now contains Stata .dta format reader)
family -> families
families.links.inverse -> families.links.inverse_power
Datasets’ Load class is now load function.
regression.py -> regression/linear_model.py
discretemod.py -> discrete/discrete_model.py
rlm.py -> robust/robust_linear_model.py
glm.py -> genmod/generalized_linear_model.py
model.py -> base/model.py
t() method -> tvalues attribute (t() still exists but raises a warning)
Main changes and additions
Numerous bugfixes.
Time Series Analysis model (tsa)
Vector Autoregression Models VAR (tsa.VAR)
Autoregressive Models AR (tsa.AR)
Autoregressive Moving Average Models ARMA (tsa.ARMA); optionally uses Cython for Kalman filtering. To enable it, run setup.py install with the option --with-cython
Baxter-King band-pass filter (tsa.filters.bkfilter)
Hodrick-Prescott filter (tsa.filters.hpfilter)
Christiano-Fitzgerald filter (tsa.filters.cffilter)
Improved maximum likelihood framework uses all available scipy.optimize solvers
Refactor of the datasets sub-package.
Added more datasets for examples.
Removed RPy dependency for running the test suite.
Refactored the test suite.
Refactored codebase/directory structure.
Support for offset and exposure in GLM.
Removed data_weights argument to GLM.fit for Binomial models.
New statistical tests, especially diagnostic and specification tests
Multiple test correction
Generalized Method of Moments (GMM) framework in sandbox
Improved documentation
and other additions
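As an illustration of the time series filters listed above, the sketch below applies the Hodrick-Prescott filter to a purely linear series. It assumes an installed release in which `hpfilter` can be imported from `statsmodels.tsa.filters.hp_filter` and returns the cycle and trend components in that order; the series and the smoothing value are made up.

```python
import numpy as np
from statsmodels.tsa.filters.hp_filter import hpfilter

# Illustrative series: a straight line with no cyclical component.
series = 2.0 + 0.3 * np.arange(40.0)

# lamb is the smoothing parameter; 1600 is used here only as an example value.
cycle, trend = hpfilter(series, lamb=1600)

# The filter penalizes second differences of the trend, which are zero for a
# straight line, so the trend should match the input and the cycle should vanish.
max_cycle = float(np.max(np.abs(cycle)))
print(max_cycle)
```

A linear input is a convenient sanity check because the expected decomposition is known in advance.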
0.2.0
Main changes
renames for more consistency
- RLM.fitted_values -> RLM.fittedvalues
- GLMResults.resid_dev -> GLMResults.resid_deviance
GLMResults, RegressionResults: lazy calculations, convert attributes to properties with _cache
fix tests to run without rpy
expanded examples in examples directory
add PyDTA to lib.io – functions for reading Stata .dta binary files and converting them to numpy arrays
made tools.categorical much more robust
add_constant now takes a prepend argument
fix GLS to work with only a one column design
New
add four new datasets
A dataset from the American National Election Studies (1996)
Grunfeld (1950) investment data
Spector and Mazzeo (1980) program effectiveness data
A US macroeconomic dataset
add four new Maximum Likelihood Estimators for models with a discrete dependent variable, with examples
Logit
Probit
MNLogit (multinomial logit)
Poisson
Sandbox
add qqplot in sandbox.graphics
add sandbox.tsa (time series analysis) and sandbox.regression (anova)
add principal component analysis in sandbox.tools
add Seemingly Unrelated Regression (SUR) and Two-Stage Least Squares for systems of equations in sandbox.sysreg.Sem2SLS
add restricted least squares (RLS)
0.1.0b1
initial release
Download files
Source Distributions
Built Distributions
File details
Details for the file statsmodels-0.4.3.zip.
File metadata
- Download URL: statsmodels-0.4.3.zip
- Upload date:
- Size: 4.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | c5de6e55d0341269f4b0f385e70d613822dba64d71fc9610c8ed7102bdc5afb8
MD5 | 97f7e4c1b9870d6f783359f6d1774437
BLAKE2b-256 | cb3d2c95d055582795d178aa3b88259d99a3c1b2d5ef1c2b707c8445e1be5b21
File details
Details for the file statsmodels-0.4.3.tar.gz.
File metadata
- Download URL: statsmodels-0.4.3.tar.gz
- Upload date:
- Size: 4.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 504a4f6ccb657c1fab21c6cea5bc53a698bebf72f226dbf0f13374f7f371a7d4
MD5 | eee727c2fa4e3d884f1baaae7ae3d58c
BLAKE2b-256 | c6f2fdbcb500d078165757496e590f395ef610772c98869566d767554b1deb08
File details
Details for the file statsmodels-0.4.3.win-amd64-py3.2.exe.
File metadata
- Download URL: statsmodels-0.4.3.win-amd64-py3.2.exe
- Upload date:
- Size: 3.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 304cf3e07c248bd8fc189a69c7ad885b2fdeb364e80c26be6a99b17fc5c1ac0e
MD5 | a4498816016e714e76a5f3cf9d780ea7
BLAKE2b-256 | 509c75a7892ce1a96477a2170ebfe5e5f295633c54a1087ec266be44902bd634
File details
Details for the file statsmodels-0.4.3.win-amd64-py2.7.exe.
File metadata
- Download URL: statsmodels-0.4.3.win-amd64-py2.7.exe
- Upload date:
- Size: 3.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | ae1df587344587edd6dde31cab1c4d2cfeed1a01706e712284e97c5c12e0f925
MD5 | 5b6acf2309868aabbc33712f97a2fa74
BLAKE2b-256 | 34ed228b443d9456c7085c4eee71afc304417b60d80b4b60735d9dd5d3de70ca
File details
Details for the file statsmodels-0.4.3.win-amd64-py2.6.exe.
File metadata
- Download URL: statsmodels-0.4.3.win-amd64-py2.6.exe
- Upload date:
- Size: 3.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | bf7d68b9adebc437531e52541f169f843d9899504eb47b30af4d615745324123
MD5 | 1c88ce6c7446ce25176b5876d373178c
BLAKE2b-256 | 180eb34dbd571f908b2fa1041845e1f16bf61771a5b4a7f65e720b3e4052ecd4
File details
Details for the file statsmodels-0.4.3.win32-py3.2.exe.
File metadata
- Download URL: statsmodels-0.4.3.win32-py3.2.exe
- Upload date:
- Size: 3.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | f86e06edd36b7039aa28358dc1982a1fe3699d844a2777389ce834e93cdf068e
MD5 | 6f9b2d02df506269cffdeb1b22d7a62f
BLAKE2b-256 | 6f51e48abad852d062fa76fdf594fe9e41a95fe168d0243e6f5b152eadc19063
File details
Details for the file statsmodels-0.4.3.win32-py2.7.exe.
File metadata
- Download URL: statsmodels-0.4.3.win32-py2.7.exe
- Upload date:
- Size: 3.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | e3010eece53871a9a9ce32b276dc6f8a114b1cff2d22aa8f49d6aa2518181146
MD5 | b4ba37e77ca6106b2b48286cbf4221e2
BLAKE2b-256 | 0fce9c23107623309bc7d13a8c45f1dd09074ae21111f03bf5190f8c6551a33f
File details
Details for the file statsmodels-0.4.3.win32-py2.6.exe.
File metadata
- Download URL: statsmodels-0.4.3.win32-py2.6.exe
- Upload date:
- Size: 3.5 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 573a3ab5cd3e80d07521b6ba7dcc85ca92f0cc3506da8a923804f0d946abed22
MD5 | de33b0d7fc2d3c1072680fe0672786f2
BLAKE2b-256 | f13de2fbe80b5170941cb4399ec4a882397e7ef77cfcc5c58a82173a296eca34