Modern decision trees in Python

scikit-tree

scikit-tree is a scikit-learn compatible API for building state-of-the-art decision trees. These include unsupervised trees, oblique trees, uncertainty trees, quantile trees and causal trees.

Tree models have withstood the test of time and are widely used in modern data science and machine learning applications. They perform especially well when samples are limited, and they are flexible learners that can be applied to a wide variety of settings, such as tabular data, images, time series, genomics, EEG data and more.
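
Because the estimators follow the scikit-learn API, fitting one should look like fitting any scikit-learn model. The sketch below is a minimal illustration under stated assumptions: it assumes `ObliqueRandomForestClassifier` is importable from the top-level `sktree` package and accepts `n_estimators` and `random_state` like scikit-learn forests, so check the documentation for the exact names in your version. The helper name `fit_oblique_forest` and the synthetic data are hypothetical and exist only for this example.

```python
def fit_oblique_forest(n_samples=200, seed=0):
    """Fit an oblique forest on synthetic data.

    Returns the held-out accuracy, or None if scikit-learn or
    scikit-tree is not installed in the current environment.
    """
    try:
        from sklearn.datasets import make_classification
        from sklearn.model_selection import train_test_split
        # Assumption: the class is exported at the top level of sktree.
        from sktree import ObliqueRandomForestClassifier
    except ImportError:
        return None

    # Small synthetic tabular problem, split into train and test sets.
    X, y = make_classification(n_samples=n_samples, n_features=10, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=seed)

    # Same fit/score workflow as any scikit-learn classifier.
    clf = ObliqueRandomForestClassifier(n_estimators=20, random_state=seed)
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)


if __name__ == "__main__":
    print(fit_oblique_forest())
```

If both packages are installed, the function returns a mean accuracy between 0 and 1. Otherwise it returns `None` rather than raising.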

Documentation

See here for the documentation for our dev version: https://docs.neurodata.io/scikit-tree/dev/index.html

Why oblique trees and why trees beyond those in scikit-learn?

In 2001, Leo Breiman proposed two types of Random Forests. One was Forest-RI, the traditional axis-aligned random forest. The other was Forest-RC, the oblique random forest, which splits on random linear combinations of features. Manifold Oblique Random Forests (MORF) [1] build upon Forest-RC by proposing additional functions to combine features. Other modern tree variants, such as Canonical Correlation Forests (CCF), Extended Isolation Forests, Quantile Forests, and unsupervised random forests, are also important for solving real-world problems with robust decision tree models.
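
To make the difference between the two split types concrete, here is a small pure-Python sketch. It is not scikit-tree's implementation. It compares the best single-feature threshold (Forest-RI style) with the best threshold on a few random linear combinations of features (Forest-RC style, with coefficients drawn uniformly from [-1, 1]) on toy data whose class boundary is the diagonal line x0 + x1 = 0. All function names and the data are hypothetical.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split_impurity(values, labels):
    """Lowest weighted Gini impurity over all thresholds on a 1-D projection."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = gini(labels)  # impurity if we do not split at all
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = min(best, score)
    return best


def axis_aligned_split(X, y):
    """Forest-RI style candidates: threshold one feature at a time."""
    n_features = len(X[0])
    return min(
        best_split_impurity([row[j] for row in X], y) for j in range(n_features)
    )


def oblique_split(X, y, n_projections, rng):
    """Forest-RC style candidates: threshold random linear combinations."""
    n_features = len(X[0])
    best = gini(y)
    for _ in range(n_projections):
        weights = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        projected = [sum(w * v for w, v in zip(weights, row)) for row in X]
        best = min(best, best_split_impurity(projected, y))
    return best


rng = random.Random(0)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(100)]
y = [1 if a + b > 0 else 0 for a, b in X]  # diagonal class boundary

axis_impurity = axis_aligned_split(X, y)
oblique_impurity = oblique_split(X, y, n_projections=30, rng=rng)
print(f"axis-aligned: {axis_impurity:.3f}, oblique: {oblique_impurity:.3f}")
```

On this kind of data, a single axis-aligned threshold leaves both children mixed, while a projection close to the direction (1, 1) separates the classes almost perfectly. Oblique trees exploit this to represent such boundaries with fewer splits.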

Installation

Our installation follows the scikit-learn installation as closely as possible, because this package contains Cython code that subclasses, or is inspired by, the scikit-learn tree submodule.

Dependencies

We minimally require:

* Python (>=3.9)
* numpy
* scipy
* scikit-learn >= 1.3
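
As a rough sketch, the requirements above can be checked from Python with the standard library alone. The helper names `parse_release` and `check_requirements` are hypothetical, and the version parsing is deliberately simple (it only reads the leading integers of each dotted component), so treat the result as a sanity check rather than a full version comparison.

```python
import sys
from importlib import metadata

# Minimums taken from the list above; numpy and scipy have no stated lower bound.
MINIMUMS = {"numpy": None, "scipy": None, "scikit-learn": (1, 3)}


def parse_release(version):
    """Turn '1.3.2' or '1.4.dev0' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def check_requirements():
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append("Python >= 3.9 is required")
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if minimum is not None and parse_release(installed) < minimum:
            problems.append(f"{name} {installed} is older than required")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```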

Installation with Pip (https://pypi-hypernode.com/project/scikit-tree/)

Installing with pip in a conda environment is the recommended route.

pip install scikit-tree

Building locally with Meson (For developers)

Make sure you have the necessary packages installed:

# install build dependencies
pip install -r build_requirements.txt

# you may need these optional dependencies to build scikit-learn locally
conda install -c conda-forge joblib threadpoolctl pytest compilers llvm-openmp

We use the spin CLI to abstract away build details:

# run the build using Meson/Ninja
./spin build

# you can run the following command to see what other options there are
./spin --help
./spin build --help

# For example, you might want to start from a clean build
./spin build --clean

# or build in parallel for faster builds
./spin build -j 2

# double-check that the build-install directory contains this path;
# it may differ from machine to machine and with your Python version
export PYTHONPATH=${PWD}/build-install/usr/lib/python3.9/site-packages

# run specific unit tests
./spin test -- sktree/tree/tests/test_tree.py

You can also do the same thing using Meson/Ninja itself. Run the following to build the local files:

# generate the Ninja build files
meson setup build --prefix=$PWD/build

# compile
ninja -C build

# install scikit-tree package
meson install -C build

export PYTHONPATH=${PWD}/build/lib/python3.9/site-packages

# to check installation, you need to be in a different directory
cd docs
python -c "from sktree import tree"
python -c "import sklearn; print(sklearn.__version__);"
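
The `PYTHONPATH` exports above hard-code `python3.9`, which will not match other interpreter versions. As a small sketch, the helper below builds the same `lib/pythonX.Y/site-packages` path for whichever interpreter runs it. The function name `site_packages_under` is hypothetical, and the layout is an assumption taken from the two export lines. It may differ on your platform (for example on Windows).

```python
import sys
from pathlib import Path


def site_packages_under(prefix, version_info=sys.version_info):
    """Return <prefix>/lib/pythonX.Y/site-packages for the given version."""
    tag = f"python{version_info[0]}.{version_info[1]}"
    return Path(prefix) / "lib" / tag / "site-packages"


if __name__ == "__main__":
    # Prefixes mirror the two export lines above; adjust if your layout differs.
    for prefix in (Path.cwd() / "build-install" / "usr", Path.cwd() / "build"):
        candidate = site_packages_under(prefix)
        status = "exists" if candidate.is_dir() else "missing"
        print(f"{candidate}  [{status}]")
```

Run it from the repository root after building. Whichever path is reported as existing is the one to put on `PYTHONPATH`.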

After building locally, you can use an editable install (warning: this only registers Python changes locally):

pip install --no-build-isolation --editable .

Or, if you have spin v0.8+ installed, you can run it directly:

spin install

Development

We welcome contributions for modern tree-based algorithms. We use Cython to achieve fast C/C++ speeds, while abiding by a scikit-learn compatible (tested) API. Moreover, our Cython internals are easily extensible because they follow the internal Cython API of scikit-learn as well.

Due to the current state of scikit-learn's internal Cython code for trees, we instead rely on a fork of scikit-learn at https://github.com/neurodata/scikit-learn when extending the decision tree model API. Specifically, we extend the Python and Cython API of scikit-learn's tree submodule in our own submodule so that we can introduce the tree models housed in this package. These models extend the functionality of decision-tree-based models in ways that are not yet possible in scikit-learn itself. As one example, we introduce an abstract API that allows users to implement their own oblique splits. Our plan is to benchmark these functionalities and introduce them upstream to scikit-learn where applicable and where the inclusion criteria are met.

References

[1]: Li, Adam, et al. "Manifold Oblique Random Forests: Towards Closing the Gap on Convolutional Deep Networks." SIAM Journal on Mathematics of Data Science, 5(1), 77-96, 2023.
