Machine Learning Toolbox

Machine Learning Toolbox 2 - MLTB2

A box of machine learning tools.

The main components are:

from mltb2.somajo import SoMaJoSentenceSplitter
Splits text into sentences using the SoMaJo tool. Supports German and English.

from mltb2.transformers import TransformersTokenCounter
Counts the tokens that a Transformers tokenizer produces for a text.

from mltb2.somajo_transformers import TextSplitter
Splits text into sections that do not exceed a specified maximum token length. Splits fall only on sentence boundaries, so neither words nor sentences are divided.
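
The idea of packing whole sentences into length-limited sections can be sketched as follows. This is a minimal illustration, not the TextSplitter implementation: the names pack_sentences and count_tokens are hypothetical, the token counter is a whitespace stand-in for a real Transformers tokenizer, and keeping an oversized sentence whole is an assumption about the desired behavior.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): whitespace-separated words."""
    return len(text.split())


def pack_sentences(
    sentences: List[str],
    max_tokens: int,
    counter: Callable[[str], int] = count_tokens,
) -> List[str]:
    """Greedily join whole sentences into sections of at most max_tokens."""
    sections: List[str] = []
    current: List[str] = []
    current_len = 0
    for sentence in sentences:
        n = counter(sentence)
        # Close the current section if this sentence would exceed the limit.
        if current and current_len + n > max_tokens:
            sections.append(" ".join(current))
            current, current_len = [], 0
        # A sentence longer than max_tokens is kept whole in its own
        # section rather than being cut (assumption of this sketch).
        current.append(sentence)
        current_len += n
    if current:
        sections.append(" ".join(current))
    return sections
```

With a real tokenizer, the per-sentence counts would only approximate the token count of the joined section, so a production version would likely re-count or leave some headroom.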

from mltb2.optuna import SignificanceRepeatedTrainingPruner
An Optuna pruner that uses statistical significance (a t-test, serving as a heuristic) to stop unpromising trials early. This avoids unnecessary repeated training during cross-validation.
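
The t-test heuristic can be illustrated with a small standard-library sketch. It is not the SignificanceRepeatedTrainingPruner code: the function names are hypothetical, higher scores are assumed to be better, and a fixed critical value for Welch's t statistic (t_critical, chosen arbitrarily here) stands in for a proper p-value computation.

```python
from math import copysign, inf, sqrt
from statistics import mean, variance
from typing import Sequence


def welch_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Welch's t statistic for two samples with at least two values each."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    diff = mean(a) - mean(b)
    if se == 0.0:
        # Both samples are constant: any difference is treated as decisive.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def should_prune(
    best_scores: Sequence[float],
    trial_scores: Sequence[float],
    t_critical: float = 2.0,  # arbitrary threshold, an assumption of this sketch
) -> bool:
    """Return True if the running trial looks significantly worse than the best."""
    # A variance estimate needs at least two fold scores per sample.
    if len(trial_scores) < 2 or len(best_scores) < 2:
        return False
    # Never prune a trial that is currently at least as good as the best.
    if mean(trial_scores) >= mean(best_scores):
        return False
    return welch_t(best_scores, trial_scores) > t_critical
```

A check like this would run after each cross-validation fold, so that a clearly worse trial stops before the remaining folds are trained.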

Installation

MLTB2 is available on the Python Package Index (PyPI). It can be installed with pip:

pip install mltb2

Some components may need optional dependencies. You can install all of them with:

pip install mltb2[optional]

Licensing

Copyright (c) 2023 Philip May
Copyright (c) 2023 Philip May, Deutsche Telekom AG

Licensed under the MIT License (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License by reviewing the file LICENSE in the repository.

Download files

Source Distribution

mltb2-0.0.1.tar.gz (8.3 kB)

Built Distribution

mltb2-0.0.1-py3-none-any.whl (10.7 kB, Python 3)

File details

Details for the file mltb2-0.0.1.tar.gz.

File metadata

  • Download URL: mltb2-0.0.1.tar.gz
  • Upload date:
  • Size: 8.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.3

File hashes

Hashes for mltb2-0.0.1.tar.gz:

  • SHA256: bd1f3ca3a6161f600d6d1af07281e59d991edb5640a10af9c9781e54412d4a8e
  • MD5: 886b76033aa25253441ef55e738a9828
  • BLAKE2b-256: e7b36fdb1a5f45f13959272939220c266d40607c4c9173e8a801ef760f1aed32
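
A downloaded file can be checked against a published SHA256 digest with the Python standard library. The sketch below is one possible way to do this. The helper names sha256_of_file and matches_digest are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the source distribution was saved to the current directory, matches_digest("mltb2-0.0.1.tar.gz", "bd1f3ca3a6161f600d6d1af07281e59d991edb5640a10af9c9781e54412d4a8e") compares it with the SHA256 value listed above.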

File details

Details for the file mltb2-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: mltb2-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 10.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.3

File hashes

Hashes for mltb2-0.0.1-py3-none-any.whl:

  • SHA256: 1213f7610d20fc1a0e5f30a2c5192550a802f3473e3f5c65e5aee72f8903d714
  • MD5: a3fd7eedc1ce9d86723f3cf59ffd9e30
  • BLAKE2b-256: 697be032bac74749dd1408fe5c68b6529a3c3f905e096dc388dc8f9940b34edd
