|Build Status| |PyPi version| |Supported Python versions| |License| |Documentation Status|
Project description
Osprey
======
|Build Status| |PyPi version| |Supported Python versions| |License|
|Documentation Status|
osprey is an easy-to-use tool for hyperparameter optimization for
machine learning algorithms in python using scikit-learn (or using
scikit-learn compatible APIs).
Each osprey experiment combines an dataset, an estimator, a search space
(and engine), cross validation and asynchronous serialization for
distributed parallel optimization of model hyperparameters.
.. raw:: html
<p align="center">
Full documentation
.. raw:: html
</p>
Example (with `mixtape <https://github.com/rmcgibbo/mixtape>`__ models/datasets)
--------------------------------------------------------------------------------
::
$ cat config.yaml
estimator:
eval_scope: mixtape
eval: |
Pipeline([
('featurizer', DihedralFeaturizer(types=['phi', 'psi'])),
('cluster', MiniBatchKMeans()),
('msm', MarkovStateModel(n_timescales=5, verbose=False)),
])
search_space:
cluster__n_clusters:
min: 10
max: 100
type: int
featurizer__types:
choices:
- ['phi', 'psi']
- ['phi', 'psi', 'chi1']
type: enum
cv: 5
dataset_loader:
name: mdtraj
params:
trajectories: ~/local/msmbuilder/Tutorial/XTC/*/*.xtc
topology: ~/local/msmbuilder/Tutorial/native.pdb
stride: 1
trials:
uri: sqlite:///osprey-trials.db
Then run ``osprey worker``. You can run multiple parallel instances of
``osprey worker`` simultaniously on a cluster too.
::
$ osprey worker config.yaml
======================================================================
= osprey is a tool for machine learning hyperparameter optimization. =
======================================================================
osprey version: 0.2_10_g18392d9_dirty-py2.7.egg
time: October 27, 2014 10:44 PM
hostname: dn0a230538.sunet
cwd: /private/var/folders/yb/vpt17lxs67vf02qpvgvjrc5m0000gn/T/tmpDgBwlU
pid: 99407
Loading config file: config.yaml...
Loading trials database: sqlite:///osprey-trials.db (table = "trials")...
Loading dataset...
100 elements without labels
Instantiated estimator:
Pipeline(steps=[('featurizer', DihedralFeaturizer(sincos=True, types=['phi', 'psi'])), ('tica', tICA(gamma=0.05, lag_time=1, n_components=4, weighted_transform=False)), ('cluster', MiniBatchKMeans(batch_size=100, compute_labels=True, init='k-means++',
init_size=None, max_iter=100, max_no_improvement=...toff=1, lag_time=1, n_timescales=5, prior_counts=0,
reversible_type='mle', verbose=False))])
Hyperparameter search space:
featurizer__types (enum) choices = (['phi', 'psi'], ['phi', 'psi', 'chi1'])
cluster__n_clusters (int) 10 <= x <= 100
----------------------------------------------------------------------
Beginning iteration 1 / 1
----------------------------------------------------------------------
History contains: 0 trials
Choosing next hyperparameters with random...
{'cluster__n_clusters': 20, 'featurizer__types': ['phi', 'psi']}
Fitting 5 folds for each of 1 candidates, totalling 5 fits
[Parallel(n_jobs=1)]: Done 1 jobs | elapsed: 0.3s
[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 1.8s finished
---------------------------------
Success! Model score = 4.080646
(best score so far = 4.080646)
---------------------------------
1/1 models fit successfully.
time: October 27, 2014 10:44 PM
elapsed: 4 seconds.
osprey worker exiting.
You can dump the database to JSON or CSV with ``osprey dump``.
Installation
------------
::
# grab the latest version from github
$ pip install git+git://github.com/pandegroup/osprey.git
::
# or clone the repo yourself and run `setup.py`
$ git clone https://github.com/pandegroup/osprey.git
$ cd osprey && python setup.py install
Dependencies
------------
- ``six``
- ``pyyaml``
- ``numpy``
- ``scikit-learn``
- ``sqlalchemy``
- ``hyperopt`` (recommended, required for ``engine=hyperopt_tpe``)
- ``scipy`` (optional, for testing)
- ``nose`` (optional, for testing)
On python2.6, the ``argparse`` and ``importlib`` backports are also
required
.. |Build Status| image:: https://travis-ci.org/pandegroup/osprey.svg?branch=master
:target: https://travis-ci.org/pandegroup/osprey
.. |PyPi version| image:: https://pypip.in/v/osprey/badge.png
:target: https://pypi-hypernode.com/pypi/osprey/
.. |Supported Python versions| image:: https://pypip.in/py_versions/osprey/badge.svg
:target: https://pypi-hypernode.com/pypi/osprey/
.. |License| image:: https://pypip.in/license/osprey/badge.svg
:target: https://pypi-hypernode.com/pypi/osprey/
.. |Documentation Status| image:: https://readthedocs.org/projects/osprey/badge/?version=latest
:target: http://osprey.rtfd.org
======
|Build Status| |PyPi version| |Supported Python versions| |License|
|Documentation Status|
osprey is an easy-to-use tool for hyperparameter optimization for
machine learning algorithms in python using scikit-learn (or using
scikit-learn compatible APIs).
Each osprey experiment combines an dataset, an estimator, a search space
(and engine), cross validation and asynchronous serialization for
distributed parallel optimization of model hyperparameters.
.. raw:: html
<p align="center">
Full documentation
.. raw:: html
</p>
Example (with `mixtape <https://github.com/rmcgibbo/mixtape>`__ models/datasets)
--------------------------------------------------------------------------------
::
$ cat config.yaml
estimator:
eval_scope: mixtape
eval: |
Pipeline([
('featurizer', DihedralFeaturizer(types=['phi', 'psi'])),
('cluster', MiniBatchKMeans()),
('msm', MarkovStateModel(n_timescales=5, verbose=False)),
])
search_space:
cluster__n_clusters:
min: 10
max: 100
type: int
featurizer__types:
choices:
- ['phi', 'psi']
- ['phi', 'psi', 'chi1']
type: enum
cv: 5
dataset_loader:
name: mdtraj
params:
trajectories: ~/local/msmbuilder/Tutorial/XTC/*/*.xtc
topology: ~/local/msmbuilder/Tutorial/native.pdb
stride: 1
trials:
uri: sqlite:///osprey-trials.db
Then run ``osprey worker``. You can run multiple parallel instances of
``osprey worker`` simultaniously on a cluster too.
::
$ osprey worker config.yaml
======================================================================
= osprey is a tool for machine learning hyperparameter optimization. =
======================================================================
osprey version: 0.2_10_g18392d9_dirty-py2.7.egg
time: October 27, 2014 10:44 PM
hostname: dn0a230538.sunet
cwd: /private/var/folders/yb/vpt17lxs67vf02qpvgvjrc5m0000gn/T/tmpDgBwlU
pid: 99407
Loading config file: config.yaml...
Loading trials database: sqlite:///osprey-trials.db (table = "trials")...
Loading dataset...
100 elements without labels
Instantiated estimator:
Pipeline(steps=[('featurizer', DihedralFeaturizer(sincos=True, types=['phi', 'psi'])), ('tica', tICA(gamma=0.05, lag_time=1, n_components=4, weighted_transform=False)), ('cluster', MiniBatchKMeans(batch_size=100, compute_labels=True, init='k-means++',
init_size=None, max_iter=100, max_no_improvement=...toff=1, lag_time=1, n_timescales=5, prior_counts=0,
reversible_type='mle', verbose=False))])
Hyperparameter search space:
featurizer__types (enum) choices = (['phi', 'psi'], ['phi', 'psi', 'chi1'])
cluster__n_clusters (int) 10 <= x <= 100
----------------------------------------------------------------------
Beginning iteration 1 / 1
----------------------------------------------------------------------
History contains: 0 trials
Choosing next hyperparameters with random...
{'cluster__n_clusters': 20, 'featurizer__types': ['phi', 'psi']}
Fitting 5 folds for each of 1 candidates, totalling 5 fits
[Parallel(n_jobs=1)]: Done 1 jobs | elapsed: 0.3s
[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 1.8s finished
---------------------------------
Success! Model score = 4.080646
(best score so far = 4.080646)
---------------------------------
1/1 models fit successfully.
time: October 27, 2014 10:44 PM
elapsed: 4 seconds.
osprey worker exiting.
You can dump the database to JSON or CSV with ``osprey dump``.
Installation
------------
::
# grab the latest version from github
$ pip install git+git://github.com/pandegroup/osprey.git
::
# or clone the repo yourself and run `setup.py`
$ git clone https://github.com/pandegroup/osprey.git
$ cd osprey && python setup.py install
Dependencies
------------
- ``six``
- ``pyyaml``
- ``numpy``
- ``scikit-learn``
- ``sqlalchemy``
- ``hyperopt`` (recommended, required for ``engine=hyperopt_tpe``)
- ``scipy`` (optional, for testing)
- ``nose`` (optional, for testing)
On python2.6, the ``argparse`` and ``importlib`` backports are also
required
.. |Build Status| image:: https://travis-ci.org/pandegroup/osprey.svg?branch=master
:target: https://travis-ci.org/pandegroup/osprey
.. |PyPi version| image:: https://pypip.in/v/osprey/badge.png
:target: https://pypi-hypernode.com/pypi/osprey/
.. |Supported Python versions| image:: https://pypip.in/py_versions/osprey/badge.svg
:target: https://pypi-hypernode.com/pypi/osprey/
.. |License| image:: https://pypip.in/license/osprey/badge.svg
:target: https://pypi-hypernode.com/pypi/osprey/
.. |Documentation Status| image:: https://readthedocs.org/projects/osprey/badge/?version=latest
:target: http://osprey.rtfd.org
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
osprey-0.4.tar.gz
(42.3 kB
view hashes)