A set of utilities for generating quality scores for MediaWiki revisions

Revision Scoring

A generic, machine learning-based revision scoring system designed to automatically differentiate damaging edits from productive contributions on Wikipedia.

Example

Using a scorer_model to score a revision:

>>> import mwapi
>>> from revscoring import ScorerModel
>>> from revscoring.extractors import APIExtractor
>>>
>>> with open("models/enwiki.damaging.linear_svc.model") as f:
...     scorer_model = ScorerModel.load(f)
...
>>> extractor = APIExtractor(mwapi.Session(host="https://en.wikipedia.org",
...                                        user_agent="revscoring demo"))
>>>
>>> feature_values = extractor.extract(123456789, scorer_model.features)
>>>
>>> print(scorer_model.score(feature_values))
{'prediction': True, 'probability': {False: 0.4694409344514984, True: 0.5305590655485017}}
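
The returned score is a plain dictionary, so it can be post-processed without revscoring itself. As a minimal sketch, the snippet below applies a custom threshold to the probability of the True class. The helper name flag_for_review and the threshold values are hypothetical, and reading True as "damaging" is an assumption based on the model's file name.

```python
# Score dictionary copied from the example output above.
score = {
    'prediction': True,
    'probability': {False: 0.4694409344514984, True: 0.5305590655485017},
}


def flag_for_review(score, threshold=0.5):
    """Return True when the probability of the True class reaches the threshold.

    Hypothetical helper for illustration; it is not part of revscoring.
    """
    return score['probability'][True] >= threshold


print(flag_for_review(score))        # True: 0.53 reaches the default 0.5
print(flag_for_review(score, 0.8))   # False: 0.53 is below 0.8
```

Raising the threshold trades recall for precision: fewer revisions are flagged, but those that are flagged carry a higher predicted probability.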

Installation

The easiest way to install revscoring is via the Python package installer (pip).

pip install revscoring

You may find that some of revscoring’s dependencies fail to compile (namely scipy, numpy and sklearn). In that case, you’ll need to install some build dependencies through your operating system’s package manager.

Ubuntu & Debian:

Run sudo apt-get install python3-dev g++ gfortran liblapack-dev libopenblas-dev

Windows:

‘TODO’

macOS:

‘TODO’
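
Whatever the operating system, it can help to confirm that the compiled dependencies named above are importable once pip finishes. The following is a rough sketch using only the standard library; the helper name missing_modules is hypothetical, and the module list simply mirrors the three packages mentioned in this section.

```python
import importlib.util

# Import names of the compiled dependencies mentioned above.
REQUIRED = ("numpy", "scipy", "sklearn")


def missing_modules(names=REQUIRED):
    """Return the module names that Python cannot locate (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All compiled dependencies were found.")
```

This only checks that each module can be located, not that it was built correctly, so a successful result is a necessary rather than sufficient sign of a working install.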

Finally, in order to make use of language features, you’ll need to download some NLTK data. The following command will get the necessary corpus.

python -m nltk.downloader stopwords

You’ll also need to install enchant-compatible dictionaries for the languages you’d like to use. We recommend the following packages:

  • languages.arabic: aspell-ar

  • languages.dutch: myspell-nl

  • languages.english: myspell-en-us myspell-en-gb myspell-en-au

  • languages.estonian: myspell-et

  • languages.french: myspell-fr

  • languages.german: myspell-de-at myspell-de-ch myspell-de-de

  • languages.hebrew: myspell-he

  • languages.hungarian: myspell-hu

  • languages.indonesian: aspell-id

  • languages.italian: myspell-it

  • languages.persian: myspell-fa

  • languages.polish: aspell-pl

  • languages.portuguese: myspell-pt

  • languages.spanish: myspell-es

  • languages.russian: myspell-ru

  • languages.ukrainian: myspell-uk

  • languages.vietnamese: hunspell-vi
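
To see whether these language resources are visible to Python, something like the sketch below may be useful. It assumes the nltk and pyenchant packages are the ones in use, and the dictionary tags ("en_US" and so on) are assumed locale names that may differ on your system. The helper name check_language_resources is hypothetical.

```python
def check_language_resources(enchant_tags=("en_US", "en_GB", "en_AU")):
    """Report which language resources Python can see.

    Returns a dict mapping a resource label to True or False.
    Hypothetical helper for illustration; it is not part of revscoring.
    """
    report = {}

    # NLTK stopwords corpus, installed by `python -m nltk.downloader stopwords`.
    try:
        import nltk
        nltk.data.find("corpora/stopwords")
        report["nltk:stopwords"] = True
    except (ImportError, LookupError):
        report["nltk:stopwords"] = False

    # enchant dictionaries; the tags are assumed locale names.
    try:
        import enchant
        for tag in enchant_tags:
            report["enchant:" + tag] = bool(enchant.dict_exists(tag))
    except ImportError:
        for tag in enchant_tags:
            report["enchant:" + tag] = False

    return report


for name, available in sorted(check_language_resources().items()):
    print(name, "ok" if available else "MISSING")
```

A missing entry usually means the corresponding corpus or dictionary package has not been installed yet, or that the library itself could not be imported.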

Authors

Aaron Halfaker:
  • http://halfaker.info

Helder:
  • https://github.com/he7d3r

Adam Roses Wight:
  • https://mediawiki.org/wiki/User:Adamw

Amir Sarabadani:
  • https://github.com/Ladsgroup

