Python wrapper of a lightning-fast Finite State machine and REgular expression manipulation library

Project description

Bling Fire Tokenizer - Open Source

Bling Fire Tokenizer is an English tokenizer designed for fast text processing in NLP. It provides lightning-fast tokenization through simple APIs built on Finite State Machines.

Getting Started

To start using Bling Fire, you can build the project on Windows/Linux with CMake. For Python users, you can install the latest release using pip:

pip install blingfire

Tokenization examples

Python

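from blingfire import text_to_words

text = 'This is the Bling-Fire tokenizer'
# text_to_words returns the tokens joined into one space-delimited string
output = text_to_words(text)
print(output)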
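
Since text_to_words hands back one string rather than a list, downstream code often wants the tokens split out. The following is a minimal sketch of that step, not part of the Bling Fire API: to_token_list and whitespace_tokenizer are hypothetical helper names, and the whitespace fallback is only a stand-in so the snippet still runs where blingfire is not installed.

```python
from typing import Callable, List


def to_token_list(text: str, tokenizer: Callable[[str], str]) -> List[str]:
    """Run a tokenizer that returns a space-delimited string and split it into a list."""
    return tokenizer(text).split()


def whitespace_tokenizer(text: str) -> str:
    """Hypothetical stand-in (NOT Bling Fire): normalizes runs of whitespace only."""
    return " ".join(text.split())


if __name__ == "__main__":
    try:
        # Real tokenizer, if the package is available on this machine.
        from blingfire import text_to_words as tokenizer
    except (ImportError, OSError):
        # Assumption: fall back to the stand-in so the example stays runnable.
        tokenizer = whitespace_tokenizer

    print(to_token_list("This is the Bling-Fire tokenizer", tokenizer))
```

Passing the tokenizer in as a callable keeps the helper independent of how Bling Fire is installed, which also makes it easy to test.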

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Working Branch

To contribute directly to the code base, create a personal fork and create feature branches there when you need them. This keeps the main repository clean and your personal workflow cruft out of sight.

Pull Request

Before we can accept a pull request from you, you'll need to sign a Contributor License Agreement (CLA). It is an automated process and you only need to do it once.

However, you don't have to do this up-front. You can simply fork, clone, and submit your pull request as usual. When your pull request is created, it is classified by a CLA bot. If the change is trivial (e.g. you just fixed a typo), the PR is labelled cla-not-required. Otherwise, it is classified as cla-required, and the system will tell you how to sign the CLA. Once you have signed a CLA, the current and all future pull requests will be labelled cla-signed.

To enable us to quickly review and accept your pull requests, always create one pull request per issue and link the issue in the pull request if possible. Never merge multiple requests into one unless they have the same root cause. Also keep code changes as small as possible, and avoid pure formatting changes to code that has not otherwise been modified.

Feedback

Reporting Security Issues

Security issues and bugs should be reported privately, via email, to the Microsoft Security Response Center (MSRC) at secure@microsoft.com. You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Further information, including the MSRC PGP key, can be found in the Security TechCenter.

License

Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

blingfire-0.0.7.tar.gz (159.2 kB)

Uploaded Source

Built Distribution

blingfire-0.0.7-py3-none-any.whl (161.7 kB)

Uploaded Python 3

File details

Details for the file blingfire-0.0.7.tar.gz.

File metadata

  • Download URL: blingfire-0.0.7.tar.gz
  • Upload date:
  • Size: 159.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.19.1 setuptools/40.2.0 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/3.7.0

File hashes

Hashes for blingfire-0.0.7.tar.gz
Algorithm Hash digest
SHA256 7294c4b5c36652488a61cac4d5796468ff89b7d35d6eb8090212006d81b71199
MD5 d798ac7220a401eacbe0de689025fac7
BLAKE2b-256 2dd23cfab957d623fcbb15a484f1d4b8c39d2df80b8e0c6220d3a58d3aad6a64
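
The SHA256 digest listed above can be compared against a downloaded copy of the file. The following is a minimal sketch using only the Python standard library; sha256_of_file and matches_expected are hypothetical helper names, and the local path in the usage note is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and padding."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, assuming the archive was saved in the current directory, matches_expected("blingfire-0.0.7.tar.gz", "7294c4b5c36652488a61cac4d5796468ff89b7d35d6eb8090212006d81b71199") should return True for an intact download.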


File details

Details for the file blingfire-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: blingfire-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 161.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.19.1 setuptools/40.2.0 requests-toolbelt/0.8.0 tqdm/4.26.0 CPython/3.7.0

File hashes

Hashes for blingfire-0.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 a27ded769bb70684ba6250cb502a1ad46ebbb4f434015a88dd0949736235ebff
MD5 79fbdab3cdc1a0def14f2491a28e681b
BLAKE2b-256 68c8ed78813910bfc1d11ce4acf89608eceaed36e2ef0ab9d0552d863e809fab

