Parse strings and extract normalized temporal data.

datection

Multilingual library for normalisation and rendering of temporal expressions.

How to use it?

Normalisation

The normalisation step extracts temporal expressions from a text, using a language-specific grammar, and exports them in a short, storable format.

Example

>>> import datection

>>> from datetime import datetime

# simple datetime
>>> datection.export(u"Le 4 mars 2015 à 18h30", "fr")
[{'duration': 0,
  'rrule': 'DTSTART:20150304\nRRULE:FREQ=DAILY;COUNT=1;BYMINUTE=30;BYHOUR=18',
  'span': (0, 23)}]

# date interval with a recurrent exclusion
>>> datection.export(u"Du 5 au 29 mars 2015, sauf le lundi", "fr")
[{'duration': 1439,
  'excluded': ['DTSTART:20150305\nRRULE:FREQ=DAILY;BYDAY=MO;BYHOUR=0;BYMINUTE=0;UNTIL=20150329T000000'],
  'rrule': 'DTSTART:20150305\nRRULE:FREQ=DAILY;BYHOUR=0;BYMINUTE=0;INTERVAL=1;UNTIL=20150329',
  'span': (0, 36)}]

# yearless date, resolved against the `reference` argument
>>> datection.export(u"Le 4 mars à 18h30", "fr", reference=datetime(2015, 1, 1))
[{'duration': 0,
  'rrule': 'DTSTART:20150304\nRRULE:FREQ=DAILY;COUNT=1;BYMINUTE=30;BYHOUR=18',
  'span': (0, 18)}]

# past datetime
>>> datection.export(u"Le 4 mars 1990 à 18h30", "fr")
[]

# past datetime, with past exports allowed (only_future=False)
>>> datection.export(u"Le 4 mars 1990 à 18h30", "fr", only_future=False)
[{'duration': 0,
  'rrule': 'DTSTART:19900304\nRRULE:FREQ=DAILY;COUNT=1;BYMINUTE=30;BYHOUR=18',
  'span': (0, 18)}]

# continuous datetime interval
>>> datection.export(u"Du 5 avril à 22h au 6 avril 2015 à 8h", "fr")
[{'continuous': True,
  'duration': 600,
  'rrule': 'DTSTART:20150405\nRRULE:FREQ=DAILY;BYHOUR=22;BYMINUTE=0;INTERVAL=1;UNTIL=20150406T235959',
  'span': (0, 38)}]
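
To make the interval-with-exclusion example above more concrete, here is a minimal standard-library sketch of the dates that "Du 5 au 29 mars 2015, sauf le lundi" denotes. It does not use datection or its rrule output; `expand_interval` is a hypothetical helper written only for this illustration, and it assumes the interval end is inclusive.

```python
from datetime import date, timedelta


def expand_interval(start, end, excluded_weekdays=()):
    """Yield each day from start to end inclusive, skipping excluded weekdays.

    Weekdays use date.weekday() numbering: Monday is 0, Sunday is 6.
    """
    day = start
    while day <= end:
        if day.weekday() not in excluded_weekdays:
            yield day
        day += timedelta(days=1)


# "Du 5 au 29 mars 2015, sauf le lundi": 5 to 29 March 2015, except Mondays.
days = list(expand_interval(date(2015, 3, 5), date(2015, 3, 29),
                            excluded_weekdays={0}))
print(len(days))  # 25 days in the interval minus 3 Mondays
```

In the exported structure, the `rrule` entry plays the role of the full interval and each entry of `excluded` plays the role of the weekday filter.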

Export format

The export format contains 6 different items:

  • rrule: a parseable expression, generating all the datetimes described by the expression. See the python-dateutil documentation and RFC 2445 (since superseded by RFC 5545) for more details.

  • duration: the duration (in minutes) between each start datetime generated by the rrule and its end counterpart:

    • 8h → 9h: duration = 60
    • at 8pm: duration = 0
    • all day: duration = 1439
  • span: the character interval defining where the temporal expression was found in the text

  • continuous: boolean flag, indicating if the time interval is continuous or not.

  • excluded: a list of exclusion rrules.

  • unlimited: if True, the rrules are treated as unbounded (infinite).
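
As a rough illustration of how these fields fit together, the sketch below reads one export item (copied from the first normalisation example) using only the standard library. The helpers `parse_rrule_fields` and `first_occurrence` are hypothetical and not part of datection; they only handle simple rules where the DTSTART day is itself an occurrence, and they assume `span` can be used directly as a slice. For full rrule expansion, the python-dateutil parser mentioned above is the appropriate tool.

```python
from datetime import datetime, timedelta

# Export item copied from the first normalisation example.
text = u"Le 4 mars 2015 à 18h30"
item = {
    'duration': 0,
    'rrule': 'DTSTART:20150304\nRRULE:FREQ=DAILY;COUNT=1;BYMINUTE=30;BYHOUR=18',
    'span': (0, 23),
}


def parse_rrule_fields(rrule):
    """Split a two-line DTSTART/RRULE string into (dtstart, rule fields)."""
    dtstart_line, rule_line = rrule.split('\n')
    dtstart = dtstart_line.split(':', 1)[1]
    rule_body = rule_line.split(':', 1)[1]
    fields = dict(part.split('=', 1) for part in rule_body.split(';'))
    return dtstart, fields


def first_occurrence(item):
    """Return (start, end) of the first occurrence of a simple export item."""
    dtstart, fields = parse_rrule_fields(item['rrule'])
    start = datetime.strptime(dtstart[:8], '%Y%m%d').replace(
        hour=int(fields.get('BYHOUR', 0)),
        minute=int(fields.get('BYMINUTE', 0)),
    )
    # duration is expressed in minutes between the start and its end.
    return start, start + timedelta(minutes=item['duration'])


begin, end = item['span']
matched = text[begin:end]          # the text that produced this item
start, end_dt = first_occurrence(item)
```

With a duration of 0, the start and end datetimes coincide, which matches the "at 8pm: duration = 0" case listed above.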

Rendering

The rendering step renders the export format in human-readable formats, in a specific language.

Several formats can be chosen from:

  • default
  • short: shorter than the default output, omits some information when possible (the year, for example), and contextualizes the result
  • place: display the export as opening hours
  • SEO: synthetic information, only displaying the month and the year. Used for SEO purposes.

>>> import datection
>>> from datetime import date
>>> schedule = datection.export(u"Le 5 mars 2015, 15h30 - 16h", "fr")

# default
>>> datection.display(schedule, 'fr')
u'Le 5 mars 2015 de 15 h 30 à 16 h'

# short
>>> datection.display(schedule, 'fr', short=True)
u'Le 5 mars de 15 h 30 à 16 h'
>>> datection.display(schedule, 'fr', short=True, reference=date(2015, 3, 3))
u'Ce jeudi de 15 h 30 à 16 h'
>>> datection.display(schedule, 'fr', short=True, reference=date(2015, 3, 4))
u'Demain de 15 h 30 à 16 h'
>>> datection.display(schedule, 'fr', short=True, reference=date(2015, 3, 5))
u"Aujourd'hui de 15 h 30 à 16 h"

# SEO
>>> datection.display(schedule, 'fr', seo=True)
u'mars 2015'

# opening hours / place
>>> schedule = datection.export(u"Du lundi au vendredi de 8h à 12h30 et de 14h à 19h30", "fr")
>>> datection.display(schedule, 'fr', place=True)
u"""Lundi de 8 h à 12 h 30 et de 14 h à 19 h 30
Mardi de 8 h à 12 h 30 et de 14 h à 19 h 30
Mercredi de 8 h à 12 h 30 et de 14 h à 19 h 30
Jeudi de 8 h à 12 h 30 et de 14 h à 19 h 30
Vendredi de 8 h à 12 h 30 et de 14 h à 19 h 30
"""
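
The examples above select a format through a boolean keyword (`short=True`, `seo=True`, `place=True`). If a format name is held in a variable, a small dispatch helper can translate it into the matching keyword. This is only a sketch: `FORMAT_KWARGS` and `render` are hypothetical names, not part of datection, and the display function is passed in as an argument so the helper can be exercised without the library installed.

```python
# Keyword arguments for each format, as used in the display() calls above.
FORMAT_KWARGS = {
    'default': {},
    'short': {'short': True},
    'seo': {'seo': True},
    'place': {'place': True},
}


def render(display, schedule, lang, fmt='default', **extra):
    """Call display(schedule, lang, ...) with the keyword for the named format.

    `display` is supplied by the caller (for example datection.display).
    Extra keyword arguments such as `reference` are forwarded unchanged.
    """
    try:
        kwargs = dict(FORMAT_KWARGS[fmt.lower()])
    except KeyError:
        raise ValueError('unknown format: %r' % fmt)
    kwargs.update(extra)
    return display(schedule, lang, **kwargs)
```

For instance, `render(datection.display, schedule, 'fr', 'short', reference=date(2015, 3, 4))` would issue the same call as the corresponding short-format example above.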

Download files

Source Distribution

datection-4.0.7.tar.gz (114.1 kB)

Uploaded Source

Built Distribution

datection-4.0.7-py3-none-any.whl (145.4 kB)

Uploaded Python 3
