
Large-scale sparse linear classification, regression and ranking in Python

Project description


lightning

lightning is a library for large-scale linear classification, regression and ranking in Python.

Highlights:

  • follows the scikit-learn API conventions

  • natively supports both dense and sparse data representations

  • computationally demanding parts implemented in Cython

Solvers supported:

  • primal coordinate descent

  • dual coordinate descent (SDCA, Prox-SDCA)

  • SGD, AdaGrad, SAG, SAGA, SVRG

  • FISTA
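To give a feel for what a primal coordinate descent solver does, here is a minimal pure-Python sketch for the lasso objective 0.5 * ||y - Xw||^2 + alpha * ||w||_1. It is an illustration only and not lightning's Cython implementation; the function names, the list-based data layout and the toy data are all assumptions made for this example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=50):
    """Minimize 0.5 * ||y - Xw||^2 + alpha * ||w||_1 one coordinate at a time.

    X is a list of rows and y a list of targets (plain Python lists).
    """
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    residual = list(y)  # equals y - Xw, since w starts at zero
    # Squared norm of each column, used to scale each coordinate update.
    col_sq_norms = [sum(X[i][j] ** 2 for i in range(n_samples))
                    for j in range(n_features)]

    for _ in range(n_sweeps):
        for j in range(n_features):
            if col_sq_norms[j] == 0.0:
                continue  # an all-zero column never moves away from zero
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j])
                      for i in range(n_samples))
            new_wj = soft_threshold(rho, alpha) / col_sq_norms[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual in sync so the next coordinate sees it.
                for i in range(n_samples):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Toy problem with orthonormal columns: the solution is a soft-thresholded y.
X_toy = [[1.0, 0.0], [0.0, 1.0]]
y_toy = [3.0, 0.5]
w_toy = lasso_coordinate_descent(X_toy, y_toy, alpha=1.0)
print(w_toy)  # [2.0, 0.0]
```

The second coefficient ends at exactly zero, which is the sparsity-inducing behaviour that makes this family of solvers attractive for high-dimensional data.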

Example

The following example learns a multiclass classifier with a group lasso penalty on the News20 dataset (cf. Blondel et al. 2013):

from sklearn.datasets import fetch_20newsgroups_vectorized
from lightning.classification import CDClassifier

# Load News20 dataset from scikit-learn.
bunch = fetch_20newsgroups_vectorized(subset="all")
X = bunch.data
y = bunch.target

# Set classifier options.
clf = CDClassifier(penalty="l1/l2",
                   loss="squared_hinge",
                   multiclass=True,
                   max_iter=20,
                   alpha=1e-4,
                   C=1.0 / X.shape[0],
                   tol=1e-3)

# Train the model.
clf.fit(X, y)

# Accuracy
print(clf.score(X, y))

# Percentage of selected features
print(clf.n_nonzero(percentage=True))
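The group lasso ("l1/l2") penalty is what makes the percentage of selected features meaningful: it shrinks coefficients in groups, so a group is either kept or zeroed as a whole. The sketch below shows the underlying group soft-thresholding step in pure Python. It assumes that each group is the set of per-class coefficients of one feature; the function name, the toy coefficients and the threshold are hypothetical, and this is not lightning's implementation.

```python
import math


def group_soft_threshold(group, threshold):
    """Shrink a whole coefficient group toward zero according to its L2 norm."""
    norm = math.sqrt(sum(c * c for c in group))
    if norm <= threshold:
        # The entire group is removed, so the feature is unused by every class.
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [scale * c for c in group]


# Hypothetical coefficients: one row per feature, one column per class.
coef = [
    [3.0, 4.0],   # L2 norm 5.0: kept, but shrunk
    [0.3, -0.4],  # L2 norm about 0.5: removed entirely
]
shrunk = [group_soft_threshold(row, threshold=1.0) for row in coef]

# Fraction of features with at least one non-zero coefficient.
n_selected = sum(1 for row in shrunk if any(c != 0.0 for c in row))
print(shrunk)
print(n_selected / len(shrunk))
```

With these toy values only the first feature survives, so half of the features count as selected.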

Dependencies

lightning requires Python >= 3.6, setuptools, joblib, NumPy >= 1.12, SciPy >= 0.19 and scikit-learn >= 0.19. Building from source also requires Cython and a working C/C++ compiler. Running the tests additionally requires pytest.
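One way to check an environment against these minimum versions is sketched below. It relies on `importlib.metadata`, which is only in the standard library from Python 3.8 onward, and it uses a deliberately simple version parser that compares leading integer components only. The helper names and the choice of PyPI distribution names as dictionary keys are assumptions made for this example.

```python
from importlib import metadata

# Minimum versions as listed above; keys are assumed PyPI distribution names.
MINIMUMS = {"numpy": "1.12", "scipy": "0.19", "scikit-learn": "0.19"}


def version_tuple(version):
    """Turn '1.24.3' or '1.5.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first component with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def check_minimums(minimums):
    """Return {name: (installed_version_or_None, ok)} for each distribution."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[name] = (installed, ok)
    return report


for name, (installed, ok) in check_minimums(MINIMUMS).items():
    print(name, installed or "not installed", "OK" if ok else "needs attention")
```

For anything beyond a quick sanity check, a full version parser is the safer choice, since this sketch ignores pre-release and local version suffixes.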

Installation

Precompiled binaries for the stable version of lightning are available for the main platforms and can be installed using pip:

pip install sklearn-contrib-lightning

or conda:

conda install -c conda-forge sklearn-contrib-lightning

The development version of lightning can be installed from its git repository. This requires git, a working C++ compiler, Cython and the NumPy development headers. To install the development version, run:

git clone https://github.com/scikit-learn-contrib/lightning.git
cd lightning
python setup.py install

Documentation

http://contrib.scikit-learn.org/lightning/

On GitHub

https://github.com/scikit-learn-contrib/lightning

Citing

If you use this software, please cite it. Here is a BibTeX snippet that you can use:

@misc{lightning_2016,
  author       = {Blondel, Mathieu and
                  Pedregosa, Fabian},
  title        = {{Lightning: large-scale linear classification,
                 regression and ranking in Python}},
  year         = 2016,
  doi          = {10.5281/zenodo.200504},
  url          = {https://doi.org/10.5281/zenodo.200504}
}

Other citation formats are available in the project's Zenodo entry.

Authors

  • Mathieu Blondel

  • Manoj Kumar

  • Arnaud Rachez

  • Fabian Pedregosa

  • Nikita Titov


Download files


Source Distribution

sklearn-contrib-lightning-0.6.1.dev0.tar.gz (71.3 kB view details)

Uploaded Source

File details

Details for the file sklearn-contrib-lightning-0.6.1.dev0.tar.gz.

File metadata

  • Download URL: sklearn-contrib-lightning-0.6.1.dev0.tar.gz
  • Upload date:
  • Size: 71.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0.post20200714 requests-toolbelt/0.9.1 tqdm/4.47.0 CPython/3.8.3

File hashes

Hashes for sklearn-contrib-lightning-0.6.1.dev0.tar.gz
Algorithm    Hash digest
SHA256       054f190558de372db6658b9b39ccbb7f346a03bd8bc729d6a6b6256410f6c186
MD5          f932cdb3aadcf89871fc8759eaf3262a
BLAKE2b-256  878f7aabac8006d6278fb7e09c3af2231a170704295a22853dbdedb2c21aad05

See more details on using hashes here.
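A downloaded archive can be checked against the SHA256 digest listed above with the standard library alone. The following is a minimal sketch; the helper names are hypothetical, and the path passed in is assumed to be wherever the archive was saved.

```python
import hashlib

# SHA256 digest published above for sklearn-contrib-lightning-0.6.1.dev0.tar.gz.
EXPECTED_SHA256 = "054f190558de372db6658b9b39ccbb7f346a03bd8bc729d6a6b6256410f6c186"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Return True when the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected_hex.lower()
```

Assuming the archive sits in the current directory, `matches_expected("sklearn-contrib-lightning-0.6.1.dev0.tar.gz", EXPECTED_SHA256)` should return `True` for an intact download. pip can perform the same check itself when a requirements file pins the digest with `--hash=sha256:...`.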
