Implicit
========
Fast Python Collaborative Filtering for Implicit Datasets.
----
This project provides a fast Python implementation of the algorithm described in the paper '`Collaborative Filtering for Implicit Feedback Datasets
<http://yifanhu.net/PUB/cf.pdf>`_'.
To install ::

    pip install implicit

Basic usage ::

    import implicit
    user_factors, item_factors = implicit.alternating_least_squares(data, factors=50)

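
The usage line above does not say what ``data`` looks like. A minimal sketch, assuming it is a SciPy sparse matrix of interaction counts with one row per user and one column per item (the SciPy requirement suggests this, but the orientation and the sample values here are assumptions):

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical play-count triples: (user index, item index, number of plays).
users = np.array([0, 0, 1, 2, 2])
items = np.array([0, 1, 1, 0, 2])
plays = np.array([3.0, 1.0, 5.0, 2.0, 4.0])

# Rows are users and columns are items (an assumed orientation). Every
# user/item pair without an observed interaction is left out of the matrix.
data = coo_matrix((plays, (users, items)), shape=(3, 3)).tocsr()
```

A matrix built this way would then stand in for ``data`` in the call shown above.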
Requirements
------------
This library requires SciPy version 0.16 or later.
Why Use This?
-------------
This library came about because I was looking for an efficient Python
implementation of this algorithm for a blog post I am writing.
The other `pure Python implementation
<https://github.com/MrChrisJohnson/implicit-mf>`_ was much too slow on the
dataset I'm interested in. This package factorizes the last.fm
dataset in about 10 minutes (50 factors, 15 iterations, 2015
MacBook Pro), whereas I estimate that the implicit-mf package would take around
250 days to do the same computation.
The core of this package is written in Cython, leveraging OpenMP to
parallelize computation. Linear algebra is done using the BLAS and LAPACK
libraries distributed with SciPy. There is also a pure Python
implementation for reference.
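
To give a feel for the computation being parallelized, the alternating least squares update from the paper can be sketched in plain NumPy. This is an illustrative sketch only, not this package's code: the names ``als_sketch`` and ``least_squares``, the dense input matrix, and the ``alpha`` confidence scaling (confidence = 1 + alpha * count, as in the paper) are all assumptions made for the example.

```python
import numpy as np


def least_squares(R, X, Y, alpha, regularization):
    """Recompute every row of X in place while holding Y fixed."""
    n_factors = Y.shape[1]
    YtY = Y.T.dot(Y)  # shared by all rows, so computed once
    for u in range(R.shape[0]):
        r = R[u]
        nz = np.nonzero(r)[0]   # columns with an observed interaction
        Ynz = Y[nz]
        w = alpha * r[nz]       # confidence minus 1 for observed entries
        # A = Y^T C_u Y + lambda * I, built as YtY plus a sparse correction
        A = YtY + (Ynz.T * w).dot(Ynz) + regularization * np.eye(n_factors)
        # b = Y^T C_u p_u, where the preference p is 1 for observed entries
        b = Ynz.T.dot(1.0 + w)
        X[u] = np.linalg.solve(A, b)


def als_sketch(R, factors=2, alpha=40.0, regularization=0.01,
               iterations=15, seed=0):
    """Factorize a dense count matrix R (users x items) into X and Y."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    X = rng.random((n_users, factors)) * 0.01
    Y = rng.random((n_items, factors)) * 0.01
    for _ in range(iterations):
        least_squares(R, X, Y, alpha, regularization)    # update users
        least_squares(R.T, Y, X, alpha, regularization)  # update items
    return X, Y
```

Each row update is an independent small linear solve, which is why the work parallelizes well across users and items and why a fast LAPACK solver matters.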
This library has been tested with Python 2.7 and 3.5. Running 'tox' runs
the unit tests on both versions and verifies that all Python files pass flake8.
TODO
----
This is still a work in progress. Things immediately on the horizon:

- Example application
- Sphinx autodoc
- Test on Linux and verify that OpenMP support actually works
- Benchmark

Released under the MIT License