Parse strings and extract normalized temporal data.
datection
Multilingual library for normalisation and rendering of temporal expressions.
How to use it?
Normalisation
The normalisation step extracts temporal expressions from a text, using a language-specific grammar, and exports them into a short, storable format.
Example
>>> import datection
>>> from datetime import datetime
# simple datetime
>>> datection.export(u"Le 4 mars 2015 à 18h30", "fr")
[{'duration': 0,
'rrule': 'DTSTART:20150304\nRRULE:FREQ=DAILY;COUNT=1;BYMINUTE=30;BYHOUR=18',
'span': (0, 23)}]
# date interval with a recurrent exclusion
>>> datection.export(u"Du 5 au 29 mars 2015, sauf le lundi", "fr")
[{'duration': 1439,
'excluded': ['DTSTART:20150305\nRRULE:FREQ=DAILY;BYDAY=MO;BYHOUR=0;BYMINUTE=0;UNTIL=20150329T000000'],
'rrule': 'DTSTART:20150305\nRRULE:FREQ=DAILY;BYHOUR=0;BYMINUTE=0;INTERVAL=1;UNTIL=20150329',
'span': (0, 36)}]
# yearless date, with argument date reference
>>> datection.export(u"Le 4 mars à 18h30", "fr", reference=datetime(2015, 1, 1))
[{'duration': 0,
'rrule': 'DTSTART:20150304\nRRULE:FREQ=DAILY;COUNT=1;BYMINUTE=30;BYHOUR=18',
'span': (0, 18)}]
# past datetime
>>> datection.export(u"Le 4 mars 1990 à 18h30", "fr")
[]
# past datetime and authorized past exports
>>> datection.export(u"Le 4 mars 1990 à 18h30", "fr", only_future=False)
[{'duration': 0,
'rrule': 'DTSTART:19900304\nRRULE:FREQ=DAILY;COUNT=1;BYMINUTE=30;BYHOUR=18',
'span': (0, 18)}]
# continuous datetime interval
>>> datection.export(u"Du 5 avril à 22h au 6 avril 2015 à 8h", "fr")
[{'continuous': True,
'duration': 600,
'rrule': 'DTSTART:20150405\nRRULE:FREQ=DAILY;BYHOUR=22;BYMINUTE=0;INTERVAL=1;UNTIL=20150406T235959',
'span': (0, 38)}]
Export format
The export format contains 6 different items:
- rrule: a parseable expression, generating all the datetimes described by the expression. See the python-dateutil documentation and RFC 2445 for more details.
- duration: the duration (in minutes) between each start datetime generated by the rrule and its end counterpart:
  - 8h → 9h: duration = 60
  - at 8pm: duration = 0
  - all day: duration = 1439
- span: the character interval defining where the temporal expression was found in the text.
- continuous: boolean flag indicating whether the time interval is continuous.
- excluded: a list of exclusion rrules.
- unlimited: if True, the rrules are considered infinite.
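To make the relation between rrule and duration concrete, here is a minimal sketch that computes the first start/end pair of an export item by hand. The helper name `first_interval` is hypothetical and not part of datection. It assumes the rrule string has exactly the two-line `DTSTART` / `RRULE` shape shown in the examples above, with single-valued `BYHOUR` and `BYMINUTE`, and it ignores recurrence and exclusions entirely. For real use, a full RFC parser such as the one in python-dateutil is the better choice.

```python
from datetime import datetime, timedelta


def first_interval(item):
    """Return (start, end) for the first occurrence of an export item.

    Hypothetical helper, simplified on purpose: only DTSTART, BYHOUR and
    BYMINUTE are read; FREQ, UNTIL, BYDAY and 'excluded' are ignored.
    """
    dtstart_line, rrule_line = item["rrule"].split("\n")
    # "DTSTART:20150405" -> midnight of that day
    day = datetime.strptime(dtstart_line.split(":", 1)[1][:8], "%Y%m%d")
    # "RRULE:FREQ=DAILY;BYHOUR=22;..." -> {"FREQ": "DAILY", "BYHOUR": "22", ...}
    params = dict(p.split("=", 1) for p in rrule_line.split(":", 1)[1].split(";"))
    start = day.replace(hour=int(params.get("BYHOUR", 0)),
                        minute=int(params.get("BYMINUTE", 0)))
    # 'duration' is expressed in minutes
    end = start + timedelta(minutes=item["duration"])
    return start, end


# Export item copied from the "continuous datetime interval" example
item = {
    "continuous": True,
    "duration": 600,
    "rrule": "DTSTART:20150405\nRRULE:FREQ=DAILY;BYHOUR=22;BYMINUTE=0;"
             "INTERVAL=1;UNTIL=20150406T235959",
    "span": (0, 38),
}
start, end = first_interval(item)
print(start)  # 2015-04-05 22:00:00
print(end)    # 2015-04-06 08:00:00
```

The 600-minute duration carries the interval from 22h on 5 April to 8h on 6 April, matching the input expression. An all-day item (duration 1439, starting at 0h) would end at 23:59 the same day.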
Rendering
The rendering step converts the export format into human-readable text in a given language.
Several formats can be chosen from:
- default
- short: shorter than the default output; omits some information when possible (the year, for example) and contextualizes the result relative to a reference date
- place: displays the export as opening hours
- SEO: synthetic information, only displaying the month and the year. Used for SEO purposes.
>>> import datection
>>> from datetime import date
>>> schedule = datection.export(u"Le 5 mars 2015, 15h30 - 16h", "fr")
# default
>>> datection.display(schedule, 'fr')
u'Le 5 mars 2015 de 15 h 30 à 16 h'
# short
>>> datection.display(schedule, 'fr', short=True)
u'Le 5 mars de 15 h 30 à 16 h'
>>> datection.display(schedule, 'fr', short=True, reference=date(2015, 3, 3))
u'Ce jeudi de 15 h 30 à 16 h'
>>> datection.display(schedule, 'fr', short=True, reference=date(2015, 3, 4))
u'Demain de 15 h 30 à 16 h'
>>> datection.display(schedule, 'fr', short=True, reference=date(2015, 3, 5))
u"Aujourd'hui de 15 h 30 à 16 h"
# SEO
>>> datection.display(schedule, 'fr', seo=True)
u'mars 2015'
# opening hours / place
>>> schedule = datection.export(u"Du lundi au vendredi de 8h à 12h30 et de 14h à 19h30", "fr")
>>> datection.display(schedule, 'fr', place=True)
u"""Lundi de 8 h à 12 h 30 et de 14 h à 19 h 30
Mardi de 8 h à 12 h 30 et de 14 h à 19 h 30
Mercredi de 8 h à 12 h 30 et de 14 h à 19 h 30
Jeudi de 8 h à 12 h 30 et de 14 h à 19 h 30
Vendredi de 8 h à 12 h 30 et de 14 h à 19 h 30
"""
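The four formats map onto keyword arguments of `datection.display`, as seen in the examples above. The sketch below collects that mapping in one place. `FORMAT_KWARGS` and `display_kwargs` are hypothetical names, not part of datection, and the mapping is inferred only from the calls shown in this document.

```python
# Hypothetical mapping, inferred from the examples in this document.
FORMAT_KWARGS = {
    "default": {},
    "short": {"short": True},
    "seo": {"seo": True},
    "place": {"place": True},
}


def display_kwargs(fmt, reference=None):
    """Build keyword arguments for datection.display() from a format name."""
    if fmt not in FORMAT_KWARGS:
        raise ValueError("unknown format: %r" % (fmt,))
    kwargs = dict(FORMAT_KWARGS[fmt])  # copy, so the table is never mutated
    if reference is not None:
        # The examples only pass `reference` together with short=True;
        # its effect on the other formats is not shown in this document.
        kwargs["reference"] = reference
    return kwargs


print(display_kwargs("short"))  # {'short': True}

# Intended usage, assuming datection is installed:
# datection.display(schedule, 'fr', **display_kwargs("place"))
```

Keeping the format names in a single table makes it easy to validate a user-supplied format before calling the library.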