A thin compatibility layer to use Javascript regular expressions in Python

js-regex

Did you know that regular expressions may vary between programming languages? For example, let's consider the pattern "^abc$", which matches the string "abc". But what about the string "abc\n"? It's also matched in Python, but not in Javascript!

This and other slight differences can be really important for cross-language standards like jsonschema, and that's why js-regex exists.

How it works

import re
import js_regex

re.compile("^abc$").match("abc\n")  # matches, unlike JS
js_regex.compile("^abc$").match("abc\n")  # does not match

Internally, js_regex.compile() rewrites any JS regex syntax that means something different in Python into the Python regex syntax that has the intended JS meaning.
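To make the idea of "Python syntax with the intended meaning" concrete, here is a small sketch using only the standard library. It illustrates two equivalences by hand and is not a description of how js-regex is actually implemented; the hand-written patterns and variable names are assumptions chosen for the example.

```python
import re

# JS `$` (without the /m flag) matches only at the very end of the input.
# Python's `$` also matches just before a trailing newline; `\Z` does not.
python_dollar = re.compile(r"^abc$")
js_like_dollar = re.compile(r"^abc\Z")

assert python_dollar.match("abc\n") is not None
assert js_like_dollar.match("abc\n") is None
assert js_like_dollar.match("abc") is not None

# JS `\d` only matches 0-9. Python's `\d` on str patterns also matches
# other Unicode decimal digits unless the re.ASCII flag is set.
arabic_indic_three = "\u0663"
assert re.match(r"\d", arabic_indic_three) is not None
assert re.match(r"\d", arabic_indic_three, flags=re.ASCII) is None
```

Note that re.ASCII changes more than `\d` (it also affects `\w`, `\s`, `\b` and case-insensitive matching), so a real translation layer has to decide whether a flag or a per-token rewrite is the better fit.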

We also check for constructs which are valid in Python but not JS - such as named capture groups - and raise an explicit error. Constructs which are valid in JS but not Python are also an error, because we're still using Python's re.compile() function under the hood!
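As a rough illustration of this kind of check, the sketch below scans a pattern for a few Python-only tokens and raises an error. The function name check_python_only, the PYTHON_ONLY table, and the use of ValueError are all hypothetical choices for this example, not js-regex's API. It is also deliberately naive: it skips escaped characters but does not understand character classes, so a pattern like "[(?#]" would be flagged incorrectly.

```python
# Hypothetical table of Python-only constructs and the advice to give.
PYTHON_ONLY = {
    r"\A": "use ^ instead",
    r"\Z": "use $ instead",
    "(?#": "comments have no JS equivalent",
    "(?(": "conditionals have no JS equivalent",
}


def check_python_only(pattern: str) -> None:
    """Raise ValueError if the pattern uses a construct from PYTHON_ONLY."""
    i = 0
    while i < len(pattern):
        for token, advice in PYTHON_ONLY.items():
            if pattern.startswith(token, i):
                raise ValueError(f"{token!r} is not valid in a JS regex: {advice}")
        # Skip an escaped character as a pair, so that r"\\A" (a literal
        # backslash followed by "A") is not mistaken for r"\A".
        i += 2 if pattern[i] == "\\" else 1
```

With this sketch, check_python_only(r"^abc$") returns silently, while check_python_only(r"\Aabc") raises a ValueError that suggests using ^ instead.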

The following table is adapted from a larger version, omitting other languages and any rows where JS and Python have the same behaviour.

Feature | Javascript | Python | Handling
\a (bell) | no | yes | Converted to JS behaviour
\ca-\cz and \cA-\cZ (control characters) | yes | no | Converted to JS behaviour
\d for digits, \w for word chars, \s for whitespace | ascii | unicode | Converted to JS behaviour (including \D, \W, \S for negated classes)
$ (end of line/string) | at end | allows trailing \n | Converted to JS behaviour
\A (start of string) | no | yes | Explicit error, use ^ instead
\Z (end of string) | no | yes | Explicit error, use $ instead
(?<=text) (positive lookbehind) | new in ES2018 | yes | Allowed
(?<!text) (negative lookbehind) | new in ES2018 | yes | Allowed
(?(1)then|else) | no | yes | Explicit error
(?(group)then|else) | no | yes | Explicit error
(?#comment) | no | yes | Explicit error
(?P<name>regex) (Python named capture group) | no | yes | Not detected (yet)
(?P=name) (Python named backreference) | no | yes | Not detected (yet)
(?<name>regex) (JS named capture group) | new in ES2018 | no | TODO: translate to Python equivalent
$<name> (JS named backreference) | new in ES2018 | no | TODO: translate to Python equivalent
(?i) (case insensitive) | /i only | yes | Explicit error, compile with flags=re.IGNORECASE instead
(?m) (^ and $ match at line breaks) | /m only | yes | Explicit error, compile with flags=re.MULTILINE instead
(?s) (dot matches newlines) | no | yes | Explicit error, compile with flags=re.DOTALL instead
(?x) (free-spacing mode) | no | yes | Explicit error, there is no corresponding mode in Javascript
Backreferences to non-existent groups are an error | no | yes | Follows Python behaviour
Backreferences to failed groups also fail | no | yes | Follows Python behaviour
Nested references \1 through \9 | yes | no | Follows Python behaviour
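To show what "Converted to JS behaviour" could mean for one row, here is a sketch of translating JS control-character escapes (\cA through \cZ, either case) into hex escapes that Python's re module understands. The function translate_control_escapes is a hypothetical helper written for this example and is not taken from js-regex; it relies on the fact that \cX denotes the character whose code is the letter's code modulo 32.

```python
import re


def translate_control_escapes(pattern: str) -> str:
    r"""Rewrite JS \cX escapes as \xHH escapes; leave everything else alone."""

    def replace(match):
        letter = match.group(1)
        if letter is None:
            # Some other escaped character (including an escaped
            # backslash): keep it exactly as written.
            return match.group(0)
        # \cA and \ca are both U+0001, \cJ is U+000A (newline), and so on.
        return "\\x{:02x}".format(ord(letter) % 32)

    # Consume escapes pair by pair so that r"\\cJ" (a literal backslash
    # followed by "cJ") is not treated as a control escape.
    return re.sub(r"\\(?:c([A-Za-z])|.)", replace, pattern, flags=re.DOTALL)
```

For example, translate_control_escapes(r"a\cJb") gives a pattern that matches the three-character string "a\nb" when passed to re.fullmatch.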

Note that in many cases Python-only regex features would be treated as part of an ordinary pattern by JS regex engines. Currently we raise an explicit error on such inputs, but may translate them to have the JS behaviour in a future version.

Changelog

0.3.0 - 2019-09-30

  • Fixed handling of non-trailing $, e.g. in "^abc$|^def$" both are converted
  • Added explicit errors for re.LOCALE and re.VERBOSE flags, which have no JS equivalent
  • Added explicit checks and errors for use of Python-only regex features

0.2.0 - 2019-09-28

Convert JS-only syntax to Python equivalent wherever possible.

0.1.0 - 2019-09-28

Initial release, with project setup and a very basic implementation.
