
The rules for duplicate-url-discarder.


This package contains rules for https://github.com/zytedata/duplicate-url-discarder.

Quick Start

Installation

pip install duplicate-url-discarder-rules

Usage

The rules can be imported via:

from duplicate_url_discarder_rules import RULE_PATHS

RULE_PATHS can then be used to configure the DUD_LOAD_RULE_PATHS setting of duplicate-url-discarder.
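As a minimal sketch, the setting could be assigned as shown here. It assumes the project keeps its settings in a Scrapy-style settings.py module and that duplicate-url-discarder itself is already installed and enabled; only the RULE_PATHS name and the DUD_LOAD_RULE_PATHS setting come from this README.

```python
# settings.py (sketch; assumes a Scrapy-style settings module)
from duplicate_url_discarder_rules import RULE_PATHS

# Load every rule file shipped with duplicate-url-discarder-rules.
DUD_LOAD_RULE_PATHS = RULE_PATHS
```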

RULE_PATHS contains all rule files shipped in this package. To load fewer rules and improve performance, use one or more of the following variables instead:

  • RULE_PATHS_COMMON: rules not specific to any data type.

  • RULE_PATHS_ARTICLE: rules for article websites.

  • RULE_PATHS_PRODUCT: rules for e-commerce websites.

As all of them are lists, you can combine them (e.g. RULE_PATHS_COMMON + RULE_PATHS_PRODUCT).
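Plain list concatenation, as in the example above, is usually all that is needed. If you want to guard against the same path appearing in more than one list, a small helper along these lines could be used. Note that combine_rule_paths is a hypothetical name, not part of this package, and the file names in the demonstration are made-up placeholders.

```python
from typing import Iterable, List


def combine_rule_paths(*groups: Iterable) -> List:
    """Concatenate rule-path lists, dropping repeats and keeping first-seen order."""
    seen = set()
    combined = []
    for group in groups:
        for path in group:
            key = str(path)  # compare by string form so str and Path entries can mix
            if key not in seen:
                seen.add(key)
                combined.append(path)
    return combined


# In a real project the arguments would come from the package, for example:
#   from duplicate_url_discarder_rules import RULE_PATHS_COMMON, RULE_PATHS_PRODUCT
#   DUD_LOAD_RULE_PATHS = combine_rule_paths(RULE_PATHS_COMMON, RULE_PATHS_PRODUCT)
# The lists below are placeholders used only to show the behaviour.
common = ["rules/common.json"]
product = ["rules/common.json", "rules/product.json"]
print(combine_rule_paths(common, product))
```

The helper keeps the order in which paths are first seen, so the result matches what RULE_PATHS_COMMON + RULE_PATHS_PRODUCT would give when the lists do not overlap.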

