jubakit

Jubatus Toolkit

jubakit: Jubatus Toolkit

jubakit is a Python module that provides easy access to Jubatus features. It can be used together with scikit-learn, which gives you features such as cross-validation and model evaluation. See the Jubakit Documentation for details.

jubakit currently supports the Classifier, Regression, Anomaly, Recommender, NearestNeighbor, Clustering, and Weight engines.

Install

pip install jubakit

Requirements

Python 2.7, 3.3, 3.4 or 3.5.
Jubatus needs to be installed.
scikit-learn is optional, but it is needed for some features such as K-fold cross-validation.
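
The requirements above can be checked before running any jubakit code. The following is a minimal sketch that uses only the standard library. The helper name `check_environment` is hypothetical and not part of jubakit, the module name `sklearn` for scikit-learn is an assumption of this sketch, and `importlib.util.find_spec` needs Python 3.4 or later. It only reports whether modules are importable; it does not detect whether the Jubatus server itself is installed.

```python
import importlib.util
import sys

# Python versions listed in the Requirements section.
LISTED_PYTHON_VERSIONS = {(2, 7), (3, 3), (3, 4), (3, 5)}


def check_environment(optional_modules=("sklearn",)):
    """Hypothetical helper (not part of jubakit): report what is available."""
    version = sys.version_info[:2]
    report = {
        "python": version,
        "python_listed": version in LISTED_PYTHON_VERSIONS,
    }
    for name in ("jubakit",) + tuple(optional_modules):
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print("{0}: {1}".format(key, value))
```

A `False` entry for `sklearn` would mean that the examples marked as requiring scikit-learn cannot run in that environment.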

Quick Start

The following example shows how to train a classifier and classify data using a CSV dataset.

from jubakit.classifier import Classifier, Schema, Dataset, Config
from jubakit.loader.csv import CSVLoader

# Load a CSV file.
loader = CSVLoader('iris.csv')

# Define types for each column in the CSV file.
schema = Schema({
  'Species': Schema.LABEL,
}, Schema.NUMBER)

# Get the shuffled dataset.
dataset = Dataset(loader, schema).shuffle()

# Run the classifier service (`jubaclassifier` process).
classifier = Classifier.run(Config())

# Train the classifier.
for _ in classifier.train(dataset): pass

# Classify using the trained classifier.
for (idx, label, result) in classifier.classify(dataset):
  print("true label: {0}, estimated label: {1}".format(label, result[0][0]))
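
The loop above prints one line per record. To summarize the outcome as a single number, the same `(idx, label, result)` tuples can be folded into an accuracy score. The sketch below assumes, as the `result[0][0]` indexing in the example suggests, that each `result` is a list of `(label, score)` pairs with the best-scoring label first. The `accuracy` helper and the sample tuples are hypothetical and exist only for illustration, so no Jubatus server is needed to run it.

```python
def accuracy(classified):
    """Fraction of records whose top-ranked label equals the true label.

    `classified` is any iterable of (idx, true_label, result) tuples, where
    result is assumed to be a list of (label, score) pairs, best first.
    """
    total = 0
    correct = 0
    for (_idx, label, result) in classified:
        total += 1
        # An empty result means no label was estimated; count it as a miss.
        if result and result[0][0] == label:
            correct += 1
    return float(correct) / total if total else 0.0


# Hypothetical stand-in for the output of classifier.classify(dataset).
sample = [
    (0, "setosa", [("setosa", 0.9), ("virginica", 0.1)]),
    (1, "virginica", [("versicolor", 0.6), ("virginica", 0.4)]),
    (2, "versicolor", [("versicolor", 0.7), ("setosa", 0.3)]),
    (3, "setosa", [("setosa", 0.8), ("versicolor", 0.2)]),
]

print("accuracy: {0:.2f}".format(accuracy(sample)))  # 3 of 4 correct
```

With a running classifier, `accuracy(classifier.classify(dataset))` would be the equivalent call, under the same assumption about the result layout.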

Examples by Topics

See the example directory for working examples.

Example	Topics	Requires scikit-learn
classifier_csv.py	Handling CSV file and numeric features
classifier_shogun.py	Handling CSV file and string features
classifier_digits.py	Handling toy dataset (digits)	✓
classifier_libsvm.py	Handling LIBSVM file	✓
classifier_kfold.py	K-fold cross validation and metrics	✓
classifier_parameter.py	Finding the best hyperparameters	✓
classifier_hyperopt_tuning.py	Finding the best hyperparameters using hyperopt	✓
classifier_bulk.py	Bulk Train-Test Classifier
classifier_twitter.py	Handling Twitter Streams
classifier_model_extract.py	Extract contents of Classifier model file
classifier_sklearn_wrapper.py	Classification using scikit-learn wrapper	✓
classifier_sklearn_grid_search.py	Grid Search example using scikit-learn wrapper	✓
regression_boston.py	Regression with toy dataset (boston)	✓
regression_csv.py	Regression with CSV file
regression_sklearn_wrapper.py	Regression using scikit-learn wrapper	✓
anomaly_auc.py	Anomaly detection and metrics
recommender_npb.py	Recommend similar items
nearest_neighbor_aaai.py	Search neighbor items
clustering_2d.py	Clustering 2-dimensional dataset
weight_shogun.py	Tracing fv_converter behavior using Weight
weight_model_extract.py	Extract contents of Weight model file

License

MIT License

Release history

Version	Release date
0.6.2	Jan 28, 2019
0.6.1	Oct 29, 2018
0.6.0 (this version)	Aug 27, 2018
0.5.5	Apr 23, 2018
0.5.4	Feb 26, 2018
0.5.3	Dec 18, 2017
0.5.2	Oct 30, 2017
0.5.1	Aug 28, 2017
0.5.0	Apr 24, 2017
0.4.2	Feb 27, 2017
0.4.1	Dec 26, 2016
0.4.0	Oct 31, 2016
0.3.0	Aug 29, 2016
0.2.2	Jul 25, 2016
0.2.1	Jun 27, 2016
0.2.0	May 30, 2016
0.1.0	Apr 25, 2016

Download files

Source Distribution

jubakit-0.6.0.tar.gz (55.7 kB)

Uploaded Aug 27, 2018 Source

File details

Details for the file jubakit-0.6.0.tar.gz.

File metadata

Download URL: jubakit-0.6.0.tar.gz
Upload date: Aug 27, 2018
Size: 55.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: Python-urllib/3.6

File hashes

Hashes for jubakit-0.6.0.tar.gz
Algorithm	Hash digest
SHA256	`0ca6e440806913a67dcd6c9e2f47f7fe580e83fc52348244910361513480f625`
MD5	`e6d0519485e17698d046a441f880201d`
BLAKE2b-256	`7ce388b7af32ce3a8a07dc45ce71729844a25acf82e8c3d7d291ece8225cc9d4`
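
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib` module. The function names `sha256_of_file` and `sha256_matches` are hypothetical helpers for illustration, not part of jubakit or of any PyPI tooling.

```python
import hashlib

# SHA256 digest for jubakit-0.6.0.tar.gz, copied from the table above.
EXPECTED_SHA256 = "0ca6e440806913a67dcd6c9e2f47f7fe580e83fc52348244910361513480f625"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_matches(path, expected=EXPECTED_SHA256):
    """Hypothetical helper: True if the file's digest equals `expected`."""
    return sha256_of_file(path) == expected.lower()
```

Assuming the archive was saved in the current directory under its original name, `sha256_matches("jubakit-0.6.0.tar.gz")` should return `True` for an intact download.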