
A set of utilities for generating quality scores for MediaWiki revisions


|travis|_ |codecov|_

Revision Scoring
================
A generic, machine learning-based revision scoring system designed to be used
to automatically differentiate damage from productive contributory behavior on
Wikipedia.

Example
========

Using a scorer_model to score a revision::

  >>> import mwapi
  >>> from revscoring import ScorerModel
  >>> from revscoring.extractors.api.extractor import Extractor
  >>>
  >>> with open("models/enwiki.damaging.linear_svc.model") as f:
  ...     scorer_model = ScorerModel.load(f)
  ...
  >>> extractor = Extractor(mwapi.Session(host="https://en.wikipedia.org",
  ...                                     user_agent="revscoring demo"))
  >>>
  >>> feature_values = list(extractor.extract(123456789, scorer_model.features))
  >>>
  >>> print(scorer_model.score(feature_values))
  {'prediction': True, 'probability': {False: 0.4694409344514984, True: 0.5305590655485017}}
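
The score is a plain dictionary, so it can be post-processed without any further ``revscoring`` calls. The sketch below is an illustration only and not part of the library: the ``is_damaging`` helper and the 0.8 threshold are assumptions made for this example, applied to the output printed above.

```python
# Hypothetical helper (not part of revscoring): re-interpret a score
# dictionary using a stricter threshold than the model's own prediction.
def is_damaging(score, threshold=0.8):
    """Return True when the probability of the True class meets the threshold."""
    return score["probability"][True] >= threshold


# The score dictionary printed in the example above.
score = {
    "prediction": True,
    "probability": {False: 0.4694409344514984, True: 0.5305590655485017},
}

print(is_damaging(score))                 # stricter than the model's prediction
print(is_damaging(score, threshold=0.5))  # agrees with score["prediction"]
```

A higher threshold trades recall for precision, which may suit tools that should only flag revisions the model is fairly sure about.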


Installation
============
The easiest way to install `revscoring` is via the Python package installer
(pip).

``pip install revscoring``

You may find that some of `revscoring`'s dependencies fail to compile (namely
`scipy`, `numpy` and `sklearn`). In that case, you'll need to install the
required build dependencies for your operating system first.

Ubuntu & Debian:
  Run ``sudo apt-get install python3-dev g++ gfortran liblapack-dev libopenblas-dev``
Windows:
  'TODO'
MacOS:
  Using Homebrew and pip, installing `revscoring` and `enchant` can be accomplished
  as follows::

    brew install aspell --with-all-languages
    brew install enchant
    pip install --no-binary pyenchant revscoring

  Languages can be added to `aspell`::

    cd /tmp
    wget http://ftp.gnu.org/gnu/aspell/dict/pt/aspell-pt-0.50-2.tar.bz2
    bzip2 -dc aspell-pt-0.50-2.tar.bz2 | tar xvf -
    cd aspell-pt-0.50-2
    ./configure
    make
    sudo make install

  Caveats:
    * The differences between the `aspell` and `myspell` dictionaries can cause
      some of the tests to fail.
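
Once the system packages are in place, a quick way to see whether the compiled dependencies ended up on the import path is sketched below. It uses only the standard library; the ``missing_modules`` helper is a hypothetical name introduced for this example. Note that it only checks that each module can be located, not that its compiled extensions load correctly.

```python
import importlib.util


# Hypothetical helper (not part of revscoring): report which top-level
# modules cannot be located on the current import path.
def missing_modules(names):
    """Return the module names for which no import spec can be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the dependencies mentioned above.
print(missing_modules(["scipy", "numpy", "sklearn"]))
```

An empty list suggests all three packages are installed; any name that is printed is a candidate for reinstalling after the system dependencies are added.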


Finally, to make use of language features, you'll need to download some NLTK
data. The following command fetches the required ``stopwords`` corpus.

``python -m nltk.downloader stopwords``

You'll also need to install `enchant <https://en.wikipedia.org/wiki/Enchant_(software)>`_ compatible
dictionaries of the languages you'd like to use. We recommend the following:

* ``languages.arabic``: aspell-ar
* ``languages.bengali``: aspell-bn
* ``languages.czech``: myspell-cs
* ``languages.dutch``: myspell-nl
* ``languages.english``: myspell-en-us myspell-en-gb myspell-en-au
* ``languages.estonian``: myspell-et
* ``languages.finnish``: voikko-fi
* ``languages.french``: myspell-fr
* ``languages.german``: myspell-de-at myspell-de-ch myspell-de-de
* ``languages.greek``: aspell-el
* ``languages.hebrew``: myspell-he
* ``languages.hungarian``: myspell-hu
* ``languages.indonesian``: aspell-id
* ``languages.italian``: myspell-it
* ``languages.norwegian``: myspell-nb
* ``languages.persian``: myspell-fa
* ``languages.polish``: aspell-pl
* ``languages.portuguese``: myspell-pt
* ``languages.spanish``: myspell-es
* ``languages.swedish``: aspell-sv
* ``languages.tamil``: aspell-ta
* ``languages.russian``: myspell-ru
* ``languages.ukrainian``: myspell-uk
* ``languages.vietnamese``: hunspell-vi
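
To see which dictionaries enchant can actually find after installing the packages above, something like the following sketch may help. It assumes ``pyenchant`` is installed and uses its ``enchant.dict_exists`` function; the ``missing_dictionaries`` helper and the language tags in ``wanted`` are assumptions for illustration, and the tags provided by each package may differ on your system.

```python
# Hypothetical helper (not part of revscoring): list the language tags for
# which the given checker reports no dictionary.
def missing_dictionaries(tags, dict_exists):
    """Return the tags for which dict_exists(tag) is false."""
    return [tag for tag in tags if not dict_exists(tag)]


try:
    import enchant  # provided by the pyenchant package
except ImportError:
    enchant = None

# Example tags only; adjust to the languages you installed.
wanted = ["en_US", "en_GB", "pt"]

if enchant is not None:
    print(missing_dictionaries(wanted, enchant.dict_exists))
else:
    print("pyenchant is not installed")
```

Passing the checker as an argument keeps the helper usable with any function that maps a language tag to a boolean.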

Authors
=======
Aaron Halfaker:
  * `http://halfaker.info`
Helder:
  * `https://github.com/he7d3r`
Adam Roses Wight:
  * `https://mediawiki.org/wiki/User:Adamw`
Amir Sarabadani:
  * `https://github.com/Ladsgroup`

.. |travis| image:: https://api.travis-ci.org/wiki-ai/revscoring.png
.. _travis: https://travis-ci.org/wiki-ai/revscoring
.. |codecov| image:: https://codecov.io/github/wiki-ai/revscoring/revscoring.svg
.. _codecov: https://codecov.io/github/wiki-ai/revscoring


