Detect duplicates in the Wagtail images library.
Wagtail Images De-duplicator
wagtail-images-deduplicator is a Wagtail app that detects duplicate images in the admin. It is built with imagehash.
Requirements
Wagtail Images De-duplicator works with wagtail>=3.0.
Installation
Use pip to install this package:

    pip install wagtail-images-deduplicator
Configuration
- Add wagtail_images_deduplicator to your INSTALLED_APPS in your project's settings.
- Add the DuplicateFindingMixin to your custom image model, as in the example below:

    from django.db import models
    from wagtail.images.models import Image, AbstractImage, AbstractRendition
    from wagtail_images_deduplicator.models import DuplicateFindingMixin

    class CustomImage(DuplicateFindingMixin, AbstractImage):
        admin_form_fields = Image.admin_form_fields

    class CustomRendition(AbstractRendition):
        image = models.ForeignKey(
            CustomImage, on_delete=models.CASCADE, related_name="renditions"
        )

        class Meta:
            unique_together = (("image", "filter_spec", "focal_point_key"),)

If you add the mixin to a model that already has image data, you will need to call save() on every existing instance to fill in the new hash value:

    from wagtail.images import get_image_model

    for image in get_image_model().objects.all():
        image.save()

Settings
WAGTAILIMAGESDEDUPLICATOR_HASH_FUNC
This setting determines the hash function to use.
Hash function | Reference | Setting name |
---|---|---|
Average hashing | http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html | average_hash |
Perceptual hashing | http://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html | phash (default) |
Difference hashing | http://www.hackerfactor.com/blog/index.php?/archives/529-Kind-of-Like-That.html | dhash or dhash_vertical |
Wavelet hashing | https://fullstackml.com/2016/07/02/wavelet-image-hash-in-python/ | whash |
HSV color hashing | | colorhash |
Crop-resistant hashing | https://ieeexplore.ieee.org/document/6980335 | crop_resistant_hash |
WAGTAILIMAGESDEDUPLICATOR_MAX_DISTANCE_THRESOLD
This setting determines the maximum hash distance at which two images are still considered duplicates.
The default value is 5.
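As a rough illustration, both settings could be set in your project's settings module as shown below. This is a sketch that assumes the hash function is given as a string matching the "Setting name" column above and that the threshold is a plain integer; the values chosen here are arbitrary examples, not recommendations.

```python
# settings.py (illustrative values only)

# Assumption: the value is the setting name from the table, passed as a string.
WAGTAILIMAGESDEDUPLICATOR_HASH_FUNC = "dhash"

# Assumption: an integer distance; a lower value makes matching stricter.
# The setting name is spelled exactly as documented above.
WAGTAILIMAGESDEDUPLICATOR_MAX_DISTANCE_THRESOLD = 3
```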
To help you assess how these different algorithms behave and to learn more about hash distances, check out the examples section of the imagehash library's README.
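To give a feel for what a hash distance means, here is a minimal standard-library sketch. For the bit-based hashes in imagehash, the difference between two hashes is the number of bits in which they differ (the Hamming distance). The helper names hamming_distance and is_duplicate are hypothetical, and treating a distance less than or equal to the threshold as a duplicate is an assumption about how the comparison works, not something confirmed by this package. colorhash and crop_resistant_hash may measure distance differently.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hex hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


MAX_DISTANCE = 5  # mirrors the documented default threshold


def is_duplicate(hash_a: str, hash_b: str, max_distance: int = MAX_DISTANCE) -> bool:
    """Hypothetical check: assumes 'distance <= threshold' means duplicate."""
    return hamming_distance(hash_a, hash_b) <= max_distance


# Made-up 16-bit hashes for illustration; real image hashes are usually longer.
near = hamming_distance("ff00", "ff0f")  # 4 bits differ
far = hamming_distance("0000", "00ff")   # 8 bits differ
```

With the default threshold of 5, the first pair would count as duplicates under this sketch and the second pair would not.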