Django Moderator
================
**Django community trained Bayesian inference based comment moderation app.**
.. contents:: Contents
    :depth: 5
``django-moderator`` integrates Django's comments framework with SpamBayes_ to classify comments into one of four categories, *ham*, *spam*, *reported* or *unsure*, based on training by users (see Paul Graham's `A Plan for Spam <http://www.paulgraham.com/spam.html>`_ for some background).
Users classify comments as *reported* using a *report abuse* mechanic. Staff users can then classify these *reported* comments as *ham* or *spam*, thereby training the algorithm to automatically classify similarly worded comments in future. Additionally, comments that the algorithm fails to clearly classify as either *ham* or *spam* are classified as *unsure*, so that staff users can classify them manually via the admin as well.
Comments classified as *spam* will have their ``is_removed`` field set to ``True`` and as such will no longer be visible in comment listings.
Comments *reported* by users will have their ``is_removed`` field set to ``True`` and as such will no longer be visible in comment listings.
Comments classified as *ham* or *unsure* will remain unchanged and as such will be visible in comment listings.
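The visibility rules above can be summarised in a short sketch. The ``Comment`` class and ``apply_category`` function are hypothetical stand-ins for illustration and are not part of ``django-moderator``; only the ``is_removed`` field name and the four category names come from the text.

```python
from dataclasses import dataclass

# Categories whose comments are hidden from listings, per the rules above.
HIDDEN_CATEGORIES = {"spam", "reported"}
ALL_CATEGORIES = {"ham", "spam", "reported", "unsure"}


@dataclass
class Comment:
    """Hypothetical stand-in for a Django comment object."""
    text: str
    is_removed: bool = False


def apply_category(comment, category):
    """Set is_removed for spam/reported; leave ham/unsure unchanged."""
    if category not in ALL_CATEGORIES:
        raise ValueError("unknown category: %r" % category)
    if category in HIDDEN_CATEGORIES:
        comment.is_removed = True
    return comment
```

Note that *ham* and *unsure* leave the field as it was rather than resetting it to ``False``, which follows the "remain unchanged" wording.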
``django-moderator`` also implements a user-friendly admin interface for efficiently moderating comments.
Installation
------------
#. Install or add ``django-moderator`` to your Python path.
#. Add ``moderator`` to your ``INSTALLED_APPS`` setting.
#. Configure ``django-likes`` as described `here <http://pypi.python.org/pypi/django-likes>`__.
#. Add a ``MODERATOR`` setting to your project's ``settings.py`` file. This setting specifies what classifier storage backend to use (see below) and also classification thresholds::

    MODERATOR = {
        'CLASSIFIER': 'moderator.storage.DjangoClassifier',
        'HAM_CUTOFF': 0.3,
        'SPAM_CUTOFF': 0.7,
        'ABUSE_CUTOFF': 3,
    }

   In this example, a ``HAM_CUTOFF`` of ``0.3`` means that any comment scoring less than ``0.3`` during Bayesian inference is classified as *ham*, and a ``SPAM_CUTOFF`` of ``0.7`` means that any comment scoring more than ``0.7`` is classified as *spam*. Anything between ``0.3`` and ``0.7`` is classified as *unsure*, awaiting manual classification by a staff user. An ``ABUSE_CUTOFF`` of ``3`` means that any comment receiving ``3`` or more abuse reports is classified as *reported*, again awaiting manual classification by a staff user. ``HAM_CUTOFF``, ``SPAM_CUTOFF`` and ``ABUSE_CUTOFF`` can be omitted, in which case the defaults are ``0.3``, ``0.7`` and ``3`` respectively.
#. Optionally, if you want an additional **moderate** object tool on admin change views, configure ``django-apptemplates`` as described `here <http://pypi.python.org/pypi/django-apptemplates>`__, list ``moderator`` in ``INSTALLED_APPS`` before ``django.contrib.admin``, and add ``moderator.admin.AdminModeratorMixin`` as a base class to each admin class that should offer the tool.
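As an illustration of how the cutoffs in the ``MODERATOR`` setting interact, here is a minimal sketch of the decision rule. The ``classify`` function is hypothetical and not the package's actual code. It assumes that abuse reports take precedence over the Bayesian score, which the text does not state, and it reads "less than" and "more than" strictly, so a score exactly on a cutoff is treated as *unsure*.

```python
def classify(score, abuse_reports=0,
             ham_cutoff=0.3, spam_cutoff=0.7, abuse_cutoff=3):
    """Return 'ham', 'spam', 'reported' or 'unsure' for a comment.

    score: spam probability from Bayesian inference, between 0 and 1.
    abuse_reports: number of abuse reports the comment has received.
    """
    # Assumption: user reports are checked before the score.
    if abuse_reports >= abuse_cutoff:
        return "reported"
    if score < ham_cutoff:
        return "ham"
    if score > spam_cutoff:
        return "spam"
    # Everything in between, including the cutoff values themselves.
    return "unsure"
```

With the default cutoffs, ``classify(0.1)`` gives ``'ham'``, ``classify(0.9)`` gives ``'spam'`` and ``classify(0.5)`` gives ``'unsure'``.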
Additional Settings
-------------------
#. By default all comments are classified as they are created. You can disable this behaviour by setting ``REALTIME_CLASSIFICATION`` to ``False``, i.e.::

    MODERATOR = {
        # ... other settings ...
        'REALTIME_CLASSIFICATION': False,
        # ... other settings ...
    }

#. By default moderator comment replies are posted chronologically **after** the comment being replied to. If you need replies to be posted **before** the comment being replied to (for example if you display your comments in reverse chronological order), set ``REPLY_BEFORE_COMMENT`` to ``True``, i.e.::

    MODERATOR = {
        # ... other settings ...
        'REPLY_BEFORE_COMMENT': True,
        # ... other settings ...
    }

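The defaults described in this document can be collected in one place. The following is a sketch only: ``DEFAULTS`` and ``resolve_settings`` are hypothetical names, and the way ``django-moderator`` actually reads its settings may differ. The default values themselves are taken from the text above.

```python
# Defaults as described in this document.
DEFAULTS = {
    "HAM_CUTOFF": 0.3,
    "SPAM_CUTOFF": 0.7,
    "ABUSE_CUTOFF": 3,
    "REALTIME_CLASSIFICATION": True,   # classify comments as they are created
    "REPLY_BEFORE_COMMENT": False,     # replies are posted after the comment
}


def resolve_settings(moderator):
    """Return the MODERATOR dict with omitted keys filled from DEFAULTS."""
    resolved = dict(DEFAULTS)
    resolved.update(moderator)
    return resolved
```

For example, ``resolve_settings({'REALTIME_CLASSIFICATION': False})`` keeps the three default cutoffs and only switches off realtime classification.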
Classifier Storage Backends
---------------------------
``django-moderator`` includes two SpamBayes_ storage backends, ``moderator.storage.DjangoClassifier`` and ``moderator.storage.RedisClassifier``.

.. note::

    ``moderator.storage.RedisClassifier`` is recommended for production environments as it should be much faster than ``moderator.storage.DjangoClassifier``.

To use ``moderator.storage.RedisClassifier`` as your classifier storage backend specify it in your ``MODERATOR`` setting, i.e.::

    MODERATOR = {
        'CLASSIFIER': 'moderator.storage.RedisClassifier',
        'CLASSIFIER_CONFIG': {
            'host': 'localhost',
            'port': 6379,
            'db': 0,
            'password': None,
        },
        'HAM_CUTOFF': 0.3,
        'SPAM_CUTOFF': 0.7,
        'ABUSE_CUTOFF': 3,
    }

You can also create your own backends. In that case, note that the contents of ``CLASSIFIER_CONFIG`` are passed as keyword arguments to your backend's ``__init__`` method.
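To make the keyword-argument hand-off concrete, here is a minimal sketch. Both ``InMemoryClassifier`` and ``load_classifier`` are hypothetical names used for illustration; the loader shows one common way to resolve a dotted path and is an assumption about, not a copy of, what ``django-moderator`` does internally. A real backend would also need to provide the storage interface SpamBayes expects, which is omitted here.

```python
from importlib import import_module


class InMemoryClassifier:
    """Hypothetical custom backend; receives CLASSIFIER_CONFIG as kwargs."""

    def __init__(self, namespace="default", **options):
        self.namespace = namespace
        self.options = options


def load_classifier(moderator):
    """Import the class named by 'CLASSIFIER' and instantiate it."""
    # Split 'package.module.ClassName' into module path and class name.
    module_path, _, class_name = moderator["CLASSIFIER"].rpartition(".")
    backend_class = getattr(import_module(module_path), class_name)
    # Each key in CLASSIFIER_CONFIG becomes a keyword argument to __init__.
    return backend_class(**moderator.get("CLASSIFIER_CONFIG", {}))
```

With a setting of ``'CLASSIFIER_CONFIG': {'namespace': 'comments'}``, the backend's ``__init__`` would be called as ``InMemoryClassifier(namespace='comments')``.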
Usage
-----
Once configured, use the ``traincommentclassifier`` management command to train the Bayesian inference system on a sample of existing comment objects (comments with ``is_removed`` set to ``True`` are trained as *spam*, all others as *ham*), i.e.::

    $ ./manage.py traincommentclassifier

.. note::

    The ``traincommentclassifier`` command will remove/clear any existing classification data and start from scratch.

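The labelling rule used for training can be sketched as follows. The function names and the ``(text, is_removed)`` pair format are assumptions made for this example; the real command works on Django comment objects and feeds the result to SpamBayes_.

```python
def training_label(is_removed):
    """Removed comments are trained as spam, all others as ham."""
    return "spam" if is_removed else "ham"


def build_training_set(comments):
    """Turn (text, is_removed) pairs into (text, label) pairs."""
    return [(text, training_label(is_removed)) for text, is_removed in comments]
```

Because removed comments are treated as *spam*, it is worth checking that comments removed for other reasons are not in the sample before training.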
Then you can periodically use the ``classifycomments`` management command to automatically classify comments as either *ham*, *spam*, *reported* or *unsure* based on user reports and previous training, i.e.::

    $ ./manage.py classifycomments

Comments can be manually classified as either *ham* or *spam* via admin list view actions.
.. _SpamBayes: http://spambayes.sourceforge.net/
Authors
=======
Praekelt Foundation
-------------------
* Shaun Sephton
Changelog
=========
0.1.3 (2013-03-07)
------------------
#. Include fixtures.
0.1.2 (2013-03-07)
------------------
#. Include fixtures.
0.1.1 (2013-03-07)
------------------
#. Added elevated abuse reporting functionality.
0.1.0 (2013-03-07)
------------------
#. Realtime classification option.
#. Mark spam with reply action.
#. Post replies before comment option.
0.0.9 (2013-02-18)
------------------
#. Further speed optimizations.
0.0.8 (2013-02-18)
------------------
#. Admin speed optimizations.
#. Add moderator reply admin action.
0.0.7 (2013-01-28)
------------------
#. Added moderate admin change view tool.
0.0.6 (2013-01-24)
------------------
#. Added site field for canned replies and filter accordingly on comment admin views.
0.0.5 (2012-12-03)
------------------
#. Added ``traincommentclassifier`` management command.
#. Admin proxy model additions to clearly group comments.
#. Various optimizations.
0.0.4 (2012-08-29)
------------------
#. Migration to add moderator_commentreply model.
0.0.3 (2012-08-29)
------------------
#. Include templates.
0.0.2 (2012-08-29)
------------------
#. Wide range of changes allowing for reporting of abusive comments by users.
0.0.1 (2012-05-23)
------------------
#. Initial release.