A helper library full of URL-related heuristics.
Project description
Ural
A helper library full of URL-related heuristics.
Installation
You can install ural
with pip with the following command:
pip install ural
Usage
normalize_url
Function normalizing the given url by stripping it of usually non-discriminant parts such as irrelevant query items or sub-domains etc.
This is a very useful utility when attempting to match similar urls written slightly differently when shared on social media etc.
from ural import normalize_url
normalize_url('https://www2.lemonde.fr/index.php#anchor')
>>> 'lemonde.fr'
Arguments
- url string: URL to normalize.
- strip_trailing_slash boolean [
False
]: whether to strip trailing slash.
ensure_protocol
A function checking if the url has a protocol, and adding the given one if there is none.
from ural import ensure_protocol
ensure_protocol('www2.lemonde.fr', protocol='https')
>>> 'https://www2.lemonde.fr'
Arguments
- url string: URL to format.
- protocol string: protocol to use if there is none in url. Is 'http' by default.
force_protocol
A function force-replacing the protocol of the given url.
from ural import force_protocol
force_protocol('https://www2.lemonde.fr', protocol='ftp')
>>> 'ftp://www2.lemonde.fr'
Arguments
- url string: URL to format.
- protocol string: protocol wanted in the output url. Is 'http' by default.
strip_protocol
Function removing the protocol from the url.
from ural import strip_protocol
strip_protocol('https://www2.lemonde.fr/index.php')
>>> 'www2.lemonde.fr/index.php'
Arguments
- url string: URL to format.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.