
SciKit-Learn Laboratory makes it easier to run machine learning experiments with scikit-learn.

Project description


This Python package provides command-line utilities to make it easier to run machine learning experiments with scikit-learn. One of the primary goals of our project is to make it so that you can run scikit-learn experiments without actually needing to write any code other than what you used to generate/extract the features.

Command-line Interface

The main utility we provide is called run_experiment, and it makes it easy to run a series of learners on datasets specified in a configuration file like this:

[General]
experiment_name = Titanic_Evaluate_Tuned
# valid tasks: cross_validate, evaluate, predict, train
task = evaluate

[Input]
# these directories could also be absolute paths
# (and must be if you're not running things in local mode)
train_directory = train
test_directory = dev
# Can specify multiple sets of feature files that are merged together automatically
# (even across formats)
featuresets = [["family.ndj", "misc.csv", "socioeconomic.arff", "vitals.csv"]]
# List of scikit-learn learners to use
learners = ["RandomForestClassifier", "DecisionTreeClassifier", "SVC", "MultinomialNB"]
# Column in CSV containing labels to predict
label_col = Survived
# Column in CSV containing instance IDs (if any)
id_col = PassengerId

[Tuning]
# Should we tune parameters of all learners by searching provided parameter grids?
grid_search = true
# Function to maximize when performing grid search
objective = accuracy

[Output]
# again, these can/should be absolute paths
log = output
results = output
predictions = output
models = output
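The configuration file is standard INI syntax, so before handing it to run_experiment you can sanity-check it with Python's built-in configparser; note that list-valued options such as featuresets and learners are written as JSON lists. This is only an illustration of the file format (SKLL does its own parsing), and the inlined config below just repeats the example above:

```python
# Sketch: inspect a SKLL-style config with the stdlib only.
# The sections and options mirror the example configuration above.
import configparser
import json

config = configparser.ConfigParser()
config.read_string("""
[General]
experiment_name = Titanic_Evaluate_Tuned
task = evaluate

[Input]
train_directory = train
test_directory = dev
featuresets = [["family.ndj", "misc.csv", "socioeconomic.arff", "vitals.csv"]]
learners = ["RandomForestClassifier", "DecisionTreeClassifier", "SVC", "MultinomialNB"]
label_col = Survived
id_col = PassengerId

[Tuning]
grid_search = true
objective = accuracy
""")

# List-valued options use JSON list syntax, so json.loads recovers them.
learners = json.loads(config["Input"]["learners"])
featuresets = json.loads(config["Input"]["featuresets"])
print(learners[0])          # RandomForestClassifier
print(len(featuresets[0]))  # 4 feature files merged into one set
```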

For more information about getting started with run_experiment, please check out our tutorial, or our config file specs.

We also provide utilities for common tasks such as converting between feature file formats (skll_convert), filtering and joining feature files (filter_features, join_features), and generating predictions from trained models (generate_predictions).

Python API

If you just want to avoid writing a lot of boilerplate learning code, you can also use our simple Python API. The main way you’ll want to use the API is through the Learner and Reader classes. For more details on our API, see the documentation.
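A minimal sketch of that workflow, assuming SKLL is installed (pip install skll): the Learner and Reader classes are the ones named above, while the file paths and the label column are hypothetical examples, not part of the library. The import is guarded so the sketch still loads without SKLL present.

```python
# Hedged sketch of the SKLL Python API; paths and label_col are examples.
try:
    from skll.data import Reader
    from skll.learner import Learner
    HAVE_SKLL = True
except ImportError:  # keep the sketch importable without SKLL installed
    HAVE_SKLL = False

def evaluate_random_forest(train_path, test_path):
    """Train a RandomForestClassifier on one feature file and evaluate it."""
    # Reader.for_path chooses the appropriate reader from the file extension
    train_set = Reader.for_path(train_path, label_col="Survived").read()
    test_set = Reader.for_path(test_path, label_col="Survived").read()
    learner = Learner("RandomForestClassifier")
    learner.train(train_set)
    # evaluate() reports the confusion matrix, accuracy, and per-label scores
    return learner.evaluate(test_set)
```

For example, `evaluate_random_forest("train/family.ndj", "dev/family.ndj")` would mirror one train/test pair from the configuration example above.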

While our API can be broadly useful, it should be noted that the command-line utilities are intended as the primary way of using SKLL. The API is just a nice side-effect of our developing the utilities.

A Note on Pronunciation


SciKit-Learn Laboratory (SKLL) is pronounced “skull”: that’s where the learning happens.

Requirements

Talks

  • Simpler Machine Learning with SKLL 1.0, Dan Blanchard, PyData NYC 2014 (video | slides)

  • Simpler Machine Learning with SKLL, Dan Blanchard, PyData NYC 2013 (video | slides)

Books

SKLL is featured in Data Science at the Command Line by Jeroen Janssens.

Changelog

See GitHub releases.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

skll-1.1.0.tar.gz (2.7 MB)

Uploaded: Source

Built Distribution

skll-1.1.0-py2.py3-none-any.whl (76.0 kB)

Uploaded: Python 2, Python 3

File details

Details for the file skll-1.1.0.tar.gz.

File metadata

  • Download URL: skll-1.1.0.tar.gz
  • Upload date:
  • Size: 2.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for skll-1.1.0.tar.gz:

  • SHA256: 9afb07a3bffbbdde693da74fb11c21a6c7f656416111ee8d912b3d75e70b9e9b
  • MD5: 764963f22ac62a9f109ef685511af6b4
  • BLAKE2b-256: e67e4d00f648233835b2f1cd091cc16bd600754d579c745d80298b6f4c13b25a

See more details on using hashes here.
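One common use of these digests is verifying a download locally. A small stdlib sketch: after downloading skll-1.1.0.tar.gz, compare its SHA256 against the value published above (the helper name sha256_of is ours, not part of any library):

```python
# Sketch: compute a file's SHA256 digest with the stdlib, streaming in
# chunks so large archives don't need to fit in memory.
import hashlib

def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Usage: compare against the published digest for the sdist, e.g.
# sha256_of("skll-1.1.0.tar.gz") == "9afb07a3bffbbdde693da74fb11c21a6c7f656416111ee8d912b3d75e70b9e9b"
```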

File details

Details for the file skll-1.1.0-py2.py3-none-any.whl.

File metadata

File hashes

Hashes for skll-1.1.0-py2.py3-none-any.whl:

  • SHA256: f7e7b67bbd10925c8a8ef63c97afd25fd2bccb6a3db999ef1717f52d137cf9fa
  • MD5: 0243fd91adc6ff7f679d9412a6c22fe1
  • BLAKE2b-256: 480995c6dae730d9a1e05a4b06f88f4561abc317d55f2ab65cb672e6866e8620

See more details on using hashes here.
