
Generate CHANGELOG entries out of commit messages using AI/ML techniques


Glyph uses machine learning and natural language processing to understand commit messages. It uses this understanding to classify commits into categories such as bug fixes, feature additions, and improvements.

  • Used together with Kebechet, Glyph can generate smart CHANGELOG entries from commit messages.

  • Glyph can also be used as a standalone library for analyzing commits from a locally stored repository (see usage below).

Running this project from Git

git clone git@github.com:thoth-station/glyph.git  # or use https
cd glyph
pipenv install --dev
PYTHONPATH=. pipenv run ./thoth-glyph --help

Installing this project from PyPI

This project is available on PyPI. To install it:

pip install thoth-glyph

Features

  • Commit Classification: A single commit can be classified using the following command:

    thoth-glyph classify -m "COMMIT MESSAGE TO BE ANALYZED"
  • Classifying Multiple Commits: Multiple commits can be classified together using the classify-repo command. By default, this command classifies all the commits in the repository. Optionally, a date range (YYYY-MM-DD) can be provided:

    thoth-glyph classify-repo --path /path/to/git/repo --start 2020-05-01 --end 2020-05-10
  • Classifying Using Tags: Commits can also be selected using git tags. The following command picks the commits between the tags v3.7.1 and v3.7.2:

    thoth-glyph classify-repo-by-tag --path /path/to/git/repo --start_tag v3.7.1 --end_tag v3.7.2
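
When the classify-repo command is driven from a script, it can help to validate the date range before invoking the CLI. The following is a minimal sketch, not part of Glyph itself: the helper name build_classify_repo_args is hypothetical, and it uses only the --path, --start and --end flags shown above. Whether the CLI accepts just one of the two bounds is not stated here, so the sketch assumes both or neither.

```python
from datetime import datetime
from typing import List, Optional


def build_classify_repo_args(
    repo_path: str,
    start: Optional[str] = None,
    end: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `thoth-glyph classify-repo` (hypothetical helper)."""
    args = ["thoth-glyph", "classify-repo", "--path", repo_path]
    if start is None and end is None:
        # No date range: the command classifies all commits by default.
        return args
    if start is None or end is None:
        # Assumption: the sketch requires both bounds, since one-sided ranges are not documented above.
        raise ValueError("provide both start and end, or neither")
    # strptime raises ValueError if the text is not a valid year-month-day date.
    start_date = datetime.strptime(start, "%Y-%m-%d")
    end_date = datetime.strptime(end, "%Y-%m-%d")
    if start_date > end_date:
        raise ValueError("start date must not be after end date")
    return args + ["--start", start, "--end", end]
```

The resulting list can be passed to subprocess.run(args, check=True) on a machine where thoth-glyph is installed.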

Sample Usage

$ thoth-glyph classify -m "Fixed server bug that impacted performance"
2020-08-12 19:45:47,798 4594 WARNING  thoth.common:346: Logging to a Sentry instance is turned off
2020-08-12 19:45:47,799 4594 INFO     thoth.common:368: Logging to rsyslog endpoint is turned off
2020-08-12 19:45:47,799 4594 INFO     glyph:68: Version: 0.0.0
2020-08-12 19:45:47,800 4594 INFO     glyph:83: Classifying commit
2020-08-12 19:45:47,800 4594 INFO     thoth.glyph.models:33: Model Path : /home/tussharm/.local/lib/python3.6/site-packages/thoth/glyph/data/model_commits_v2_quant.bin
Label : corrective
$ thoth-glyph classify-repo --path /home/tussharm/fork/glyph/ --start 2020-08-08 --end 2020-08-12
2020-08-12 19:51:26,743 4873 WARNING  thoth.common:346: Logging to a Sentry instance is turned off
2020-08-12 19:51:26,743 4873 INFO     thoth.common:368: Logging to rsyslog endpoint is turned off
2020-08-12 19:51:26,744 4873 INFO     glyph:68: Version: 0.0.0
2020-08-12 19:51:26,744 4873 INFO     glyph:100: Classifying commits in the given date-range
2020-08-12 19:51:26,749 4873 INFO     thoth.glyph.models:44: Model Path : /home/tussharm/.local/lib/python3.6/site-packages/thoth/glyph/data/model_commits_v2_quant.bin
2020-08-12 19:51:26,768 4873 INFO     thoth.glyph.models:52: 6 commits classified
                                           message labels_predicted
0                                 readme updated #27       perfective
1  merge pull request #1 from tushar7sharma/commi...    nonfunctional
2  merge remote-tracking branch 'upstream/master'...         features
3  grouping user-defined commit phrases (#28)* co...         features
4  commits can be collected inside user-defined g...         features
5  merge remote-tracking branch 'upstream/master'...         features
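
In the single-commit output above, the prediction appears on a line of the form "Label : corrective", mixed in with log lines. A script that wraps the CLI could pick that line out as sketched below. The function name extract_label is hypothetical, and the sketch assumes the label line keeps exactly the format shown in the sample.

```python
import re
from typing import Optional

# Matches a line such as "Label : corrective" and captures the label text.
LABEL_LINE = re.compile(r"^Label\s*:\s*(\S+)\s*$", re.MULTILINE)


def extract_label(cli_output: str) -> Optional[str]:
    """Return the predicted label from `thoth-glyph classify` output, or None if absent."""
    match = LABEL_LINE.search(cli_output)
    return match.group(1) if match else None


sample = (
    "2020-08-12 19:45:47,800 4594 INFO     glyph:83: Classifying commit\n"
    "Label : corrective\n"
)
print(extract_label(sample))  # prints "corrective"
```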

Integration with Kebechet

Kebechet can use Glyph by reading the project’s configuration from the .thoth.yaml file. Glyph’s supported formatters and ML classifiers can be specified in this configuration file.

  • See sample manager configuration here

  • See sample changelog generated using Glyph here
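
To illustrate the idea behind a smart CHANGELOG, the sketch below groups classified commits by label and renders one section per label. It is an illustration only and does not reproduce Kebechet's or Glyph's actual formatters: the function render_changelog, the section titles, and the output layout are all assumptions. Only the label names (corrective, features, perfective, nonfunctional) are taken from the sample output above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical section titles; only the label keys come from the sample output.
SECTION_TITLES = {
    "corrective": "Bug Fixes",
    "features": "Features",
    "perfective": "Improvements",
    "nonfunctional": "Non-functional",
}


def render_changelog(classified: Iterable[Tuple[str, str]]) -> str:
    """Render (message, label) pairs as a changelog grouped by label."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for message, label in classified:
        groups[label].append(message)

    lines: List[str] = []
    # Emit sections in the fixed order of SECTION_TITLES, skipping empty ones.
    for label, title in SECTION_TITLES.items():
        if groups.get(label):
            lines.append(f"## {title}")
            lines.extend(f"* {message}" for message in groups[label])
            lines.append("")
    return "\n".join(lines).rstrip() + "\n"


commits = [
    ("readme updated #27", "perfective"),
    ("Fixed server bug that impacted performance", "corrective"),
]
print(render_changelog(commits))
```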

Model and Dataset

Currently, Glyph ships with a model trained using Facebook’s fastText library on a dataset of ~5000 commits collected from multiple large-scale open source projects (see the referenced publications for more details). The library can be extended to accommodate more models. Developers are welcome to contribute and improve the classification accuracy.
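
For readers who want to experiment with a fastText model directly, the sketch below shows one way to query it from Python. It is not Glyph's internal code: classify_message and load_model are hypothetical helpers, the "__label__" prefix is fastText's default and is assumed rather than confirmed for the shipped model, and any preprocessing Glyph applies before prediction is not reproduced here.

```python
from typing import Tuple


def load_model(model_path: str):
    """Load a fastText model file; requires the `fasttext` package to be installed."""
    import fasttext  # imported lazily so the rest of the sketch works without it

    return fasttext.load_model(model_path)


def classify_message(model, message: str) -> Tuple[str, float]:
    """Return (label, probability) for one commit message."""
    # fastText's predict() rejects text containing newlines, so flatten to one line.
    one_line = " ".join(message.split())
    labels, probabilities = model.predict(one_line)
    label = labels[0]
    # Assumption: labels carry fastText's default "__label__" prefix.
    if label.startswith("__label__"):
        label = label[len("__label__"):]
    return label, float(probabilities[0])
```

A call would look like classify_message(load_model("/path/to/model.bin"), "Fixed server bug that impacted performance"), where the model path is a placeholder.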

References
