Machine Learning Toolbox

Machine Learning Toolbox 2 - MLTB2

📦 A box of machine learning tools. 📦

Components and Documentation

Documentation Page

from mltb2.somajo import SoMaJoSentenceSplitter
Split German and English texts into sentences using the SoMaJo tool.

from mltb2.somajo import JaccardSimilarity
Calculate the Jaccard similarity.
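
As an illustration of the measure itself, here is a minimal, library-independent sketch. It is not the `JaccardSimilarity` implementation: the function name `jaccard_similarity`, the lowercasing, the whitespace tokenization, and the return value of 0.0 for two empty texts are all assumptions made for this example. The class lives in the `mltb2.somajo` module, so it presumably tokenizes with SoMaJo instead.

```python
def jaccard_similarity(text1: str, text2: str) -> float:
    """Jaccard similarity of two texts: |A ∩ B| / |A ∪ B| over their token sets."""
    # Assumption: lowercase + whitespace tokenization, only for illustration.
    tokens1 = set(text1.lower().split())
    tokens2 = set(text2.lower().split())
    union = tokens1 | tokens2
    if not union:
        # Assumption: two empty texts are given a similarity of 0.0.
        return 0.0
    return len(tokens1 & tokens2) / len(union)


# Two of the four distinct tokens are shared.
print(jaccard_similarity("a b c", "b c d"))
```

The result is 1.0 for identical token sets and 0.0 for texts without any common token.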

from mltb2.transformers import TransformersTokenCounter
Count tokens made by a Transformers tokenizer.

from mltb2.somajo_transformers import TextSplitter
Split a text into sections with a specified maximum token length. Sections always consist of whole sentences, so neither words nor sentences are divided.
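
The idea of packing whole sentences into token-limited sections can be sketched as follows. This is a hypothetical, simplified illustration and not the `TextSplitter` API: the function `split_into_sections`, its parameters, the whitespace-based default token counter, and the choice to raise an error for a sentence that alone exceeds the limit are all assumptions.

```python
from typing import Callable, List


def split_into_sections(
    sentences: List[str],
    max_tokens: int,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[str]:
    """Greedily pack whole sentences into sections of at most max_tokens tokens."""
    sections: List[str] = []
    current: List[str] = []
    current_len = 0
    for sentence in sentences:
        n = count_tokens(sentence)
        if n > max_tokens:
            # Assumption: an overly long sentence is treated as an error here.
            raise ValueError(f"Sentence exceeds max_tokens: {sentence!r}")
        if current and current_len + n > max_tokens:
            # The next sentence does not fit, so close the current section.
            sections.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        sections.append(" ".join(current))
    return sections


sentences = ["One two three.", "Four five.", "Six seven eight nine."]
print(split_into_sections(sentences, max_tokens=5))
```

In practice the sentence list would come from a sentence splitter and `count_tokens` from a real tokenizer. The whitespace counter only keeps the sketch self-contained.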

from mltb2.optuna import SignificanceRepeatedTrainingPruner
An Optuna pruner that uses statistical significance (a t-test, which serves as a heuristic) to stop unpromising trials early, avoiding unnecessary repeated training during cross-validation.
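
To make the heuristic concrete, here is a rough sketch of a significance-based pruning decision. It does not use Optuna and is not the pruner's actual implementation: the functions `welch_t_statistic` and `should_prune`, the parameters `critical_value` and `min_folds`, and the assumption that higher scores are better are all hypothetical. For simplicity it compares Welch's t statistic against a fixed critical value instead of computing a p-value.

```python
import math
import statistics
from typing import Sequence


def welch_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic for the difference of the means of a and b."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    if se == 0:
        # No variance at all: any difference counts as infinitely significant.
        return math.copysign(math.inf, diff) if diff != 0 else 0.0
    return diff / se


def should_prune(
    best_scores: Sequence[float],
    current_scores: Sequence[float],
    critical_value: float = 2.0,  # hypothetical threshold, not derived from an alpha
    min_folds: int = 3,  # hypothetical warm-up: folds needed before testing
) -> bool:
    """Return True if the current trial looks significantly worse than the best one."""
    if len(current_scores) < min_folds or len(best_scores) < 2:
        return False
    # Assumption: higher scores are better, so a large positive t favors pruning.
    return welch_t_statistic(best_scores, current_scores) > critical_value


best = [0.90, 0.91, 0.89, 0.92]  # fold scores of the best trial so far
print(should_prune(best, [0.70, 0.72, 0.71]))  # clearly worse after three folds
print(should_prune(best, [0.90, 0.92, 0.89]))  # comparable to the best trial
```

A trial that is pruned this way skips its remaining cross-validation folds, which is where the training time is saved.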

Installation

MLTB2 is available on the Python Package Index (PyPI). It can be installed with pip:

pip install mltb2

Some optional dependencies might be necessary. You can install all of them with:

pip install mltb2[optional]

If you don't want to install all dependencies, see the description of the individual modules.

Licensing

Copyright (c) 2023 Philip May
Copyright (c) 2023 Philip May, Deutsche Telekom AG

Licensed under the MIT License (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License by reviewing the file LICENSE in the repository.
