Django Moderator
================
**Django community trained Bayesian inference based comment moderation app.**
.. contents:: Contents
    :depth: 5
``django-moderator`` integrates Django's comments framework with SpamBayes_ to classify comments into one of four categories: *ham*, *spam*, *reported* or *unsure*. Classification is based on training by users (see Paul Graham's `A Plan for Spam <http://www.paulgraham.com/spam.html>`_ for some background).
Users classify comments as *reported* using a *report abuse* mechanism. Staff users can then classify these *reported* comments as *ham* or *spam*, thereby training the algorithm to automatically classify similarly worded comments in future. Additionally, comments that the algorithm fails to clearly classify as either *ham* or *spam* are classified as *unsure*, so that staff users can classify them manually via admin.
Comments classified as *spam* will have their ``is_removed`` field set to ``True`` and as such will no longer be visible in comment listings.
Comments *reported* by users will have their ``is_removed`` field set to ``True`` and as such will no longer be visible in comment listings.
Comments classified as *ham* or *unsure* will remain unchanged and as such will be visible in comment listings.
``django-moderator`` also implements a user-friendly admin interface for efficiently moderating comments.
Installation
------------
#. Install or add ``django-moderator`` to your Python path.
#. Add ``moderator`` to your ``INSTALLED_APPS`` setting.
#. Configure ``django-likes`` as described `here <http://pypi.python.org/pypi/django-likes>`_.
#. Add a ``MODERATOR`` setting to your project's ``settings.py`` file. This setting specifies which classifier storage backend to use (see below) and the classification thresholds::

       MODERATOR = {
           'CLASSIFIER': 'moderator.storage.DjangoClassifier',
           'HAM_CUTOFF': 0.3,
           'SPAM_CUTOFF': 0.7,
           'ABUSE_CUTOFF': 3,
       }

   With the values above, a comment scoring less than ``0.3`` during Bayesian inference is classified as *ham* (``HAM_CUTOFF``), and a comment scoring more than ``0.7`` is classified as *spam* (``SPAM_CUTOFF``). Anything from ``0.3`` to ``0.7`` is classified as *unsure* and awaits manual classification by staff users. An ``ABUSE_CUTOFF`` of ``3`` means that any comment receiving ``3`` or more abuse reports is classified as *reported*, again awaiting manual classification by staff users. ``HAM_CUTOFF``, ``SPAM_CUTOFF`` and ``ABUSE_CUTOFF`` can be omitted, in which case they default to ``0.3``, ``0.7`` and ``3`` respectively.
#. Optionally, if you want an additional **moderate** object tool on admin change views, configure ``django-apptemplates`` as described `here <http://pypi.python.org/pypi/django-apptemplates>`_, list ``moderator`` before ``django.contrib.admin`` in ``INSTALLED_APPS`` and add ``moderator.admin.AdminModeratorMixin`` as a base class to those admin classes for which you want the tool to be available.
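
The cutoff rules from the ``MODERATOR`` setting can be illustrated with a small standalone sketch. It is not the app's actual implementation: the ``classify`` helper and the ``DEFAULTS`` name are hypothetical, and the choice to let abuse reports take precedence over the Bayesian score is an assumption made only for this example.

```python
# Default cutoffs as documented for the MODERATOR setting.
DEFAULTS = {'HAM_CUTOFF': 0.3, 'SPAM_CUTOFF': 0.7, 'ABUSE_CUTOFF': 3}


def classify(score, abuse_reports=0, config=None):
    """Hypothetical helper mapping a score and report count to a category."""
    cutoffs = dict(DEFAULTS)
    cutoffs.update(config or {})  # omitted keys fall back to the defaults

    # Assumption for this sketch: user reports win over the score.
    if abuse_reports >= cutoffs['ABUSE_CUTOFF']:
        return 'reported'
    if score < cutoffs['HAM_CUTOFF']:
        return 'ham'
    if score > cutoffs['SPAM_CUTOFF']:
        return 'spam'
    # Scores from HAM_CUTOFF to SPAM_CUTOFF inclusive are left to staff.
    return 'unsure'


categories = [classify(0.1), classify(0.5), classify(0.9), classify(0.1, abuse_reports=3)]
```

Note that a score exactly equal to a cutoff falls into *unsure* here, following the "less than" and "more than" wording used for the thresholds.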
Classifier Storage Backends
---------------------------
``django-moderator`` includes two SpamBayes_ storage backends: ``moderator.storage.DjangoClassifier`` and ``moderator.storage.RedisClassifier``.

.. note::

    ``moderator.storage.RedisClassifier`` is recommended for production environments as it should be much faster than ``moderator.storage.DjangoClassifier``.

To use ``moderator.storage.RedisClassifier`` as your classifier storage backend, specify it in your ``MODERATOR`` setting, i.e.::

    MODERATOR = {
        'CLASSIFIER': 'moderator.storage.RedisClassifier',
        'CLASSIFIER_CONFIG': {
            'host': 'localhost',
            'port': 6379,
            'db': 0,
            'password': None,
        },
        'HAM_CUTOFF': 0.3,
        'SPAM_CUTOFF': 0.7,
        'ABUSE_CUTOFF': 3,
    }

You can also create your own backends, in which case note that the content of ``CLASSIFIER_CONFIG`` will be passed as keyword arguments to your backend's ``__init__`` method.
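
As a rough illustration of how ``CLASSIFIER_CONFIG`` reaches a custom backend, the sketch below unpacks the configuration into a constructor call. The ``InMemoryClassifier`` class, its ``namespace`` option and the ``myproject.backends`` path are all hypothetical. A real backend would presumably build on a SpamBayes classifier class, which is left out here so the example stays self-contained.

```python
class InMemoryClassifier(object):
    """Hypothetical backend; only the constructor contract is shown."""

    def __init__(self, namespace='moderator', **kwargs):
        # Each key of CLASSIFIER_CONFIG arrives here as a keyword argument.
        self.namespace = namespace
        self.options = kwargs
        self.word_counts = {}


MODERATOR = {
    # Hypothetical dotted path to the class above.
    'CLASSIFIER': 'myproject.backends.InMemoryClassifier',
    'CLASSIFIER_CONFIG': {
        'namespace': 'comments',
        'timeout': 5,
    },
}

# Equivalent of what happens when the backend is instantiated:
# the config dictionary is unpacked into keyword arguments.
backend = InMemoryClassifier(**MODERATOR.get('CLASSIFIER_CONFIG', {}))
```

Accepting ``**kwargs`` in ``__init__`` keeps the backend tolerant of configuration keys it does not use itself.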
Usage
-----
Once correctly configured, use the ``traincommentclassifier`` management command to train the Bayesian inference system using a sample of existing comment objects (comments with ``is_removed`` set to ``True`` are trained as *spam*, all others as *ham*), i.e.::

    $ ./manage.py traincommentclassifier

.. note::

    The ``traincommentclassifier`` command will remove/clear any existing classification data and start from scratch.

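
The labelling rule used for training (removed comments count as *spam*, everything else as *ham*) can be sketched as follows. This is only an illustration of the rule, not the command's code: the helper names are hypothetical and plain dictionaries stand in for comment objects.

```python
def training_label(is_removed):
    """Return the training category for a comment's is_removed flag."""
    return 'spam' if is_removed else 'ham'


def build_training_set(comments):
    """Pair each comment's text with the label it would be trained as."""
    return [(c['comment'], training_label(c['is_removed'])) for c in comments]


# Dictionaries standing in for comment objects in this sketch.
sample = [
    {'comment': 'Great article, thanks!', 'is_removed': False},
    {'comment': 'Cheap pills, click here', 'is_removed': True},
]
training_set = build_training_set(sample)
```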
You can then periodically run the ``classifycomments`` management command to automatically classify comments as *ham*, *spam*, *reported* or *unsure* based on user reports and previous training, i.e.::

    $ ./manage.py classifycomments

Comments can be manually classified as either *ham* or *spam* via admin list view actions.
.. _SpamBayes: http://spambayes.sourceforge.net/
Authors
=======
Praekelt Foundation
-------------------
* Shaun Sephton
Changelog
=========
0.0.7 (2013-01-28)
------------------
#. Added moderate admin change view tool.
0.0.6 (2013-01-24)
------------------
#. Added site field for canned replies and filter accordingly on comment admin views.
0.0.5 (2012-12-03)
------------------
#. Added ``traincommentclassifier`` management command.
#. Admin proxy model additions to clearly group comments.
#. Various optimizations.
0.0.4 (2012-08-29)
------------------
#. Migration to add moderator_commentreply model.
0.0.3 (2012-08-29)
------------------
#. Include templates.
0.0.2 (2012-08-29)
------------------
#. Wide range of changes allowing for reporting of abusive comments by users.
0.0.1 (2012-05-23)
------------------
#. Initial release