regex-rust (regexrs)

Leverages the Rust regex crate with PyO3 to create an interface similar to the Python standard library re module.

pip install regex-rust

>>> import regexrs as re
>>> pattern = re.compile(r'(\w+) (\w+)')
>>> m = pattern.match('hello rust')
>>> m.groups()
('hello', 'rust')
>>> m.pos
0
>>> m.endpos
10
>>> re.findall(r'\w+', 'hello rust')
['hello', 'rust']
>>> re.fullmatch(r'\w+', 'foo')
<regexrs.Match object; span=(0, 3), match="foo">

Benchmarks

benchmark.py is largely borrowed from the regex-benchmark project. It expects the path to the input-text.txt file as its first argument.

This simple benchmark suggests that regexrs may be significantly faster than the re module from the standard library, and even the regex library, in at least some use cases. Keep in mind that it tests just three simple use cases on a single large text input, so the conclusions that can be drawn from it are quite limited.
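
A timing harness in the spirit of the benchmark described above might look like the following. This is a minimal sketch, not the actual benchmark.py: the measure helper, the sample text, and the pattern are illustrative assumptions, and the stdlib re module stands in for whichever module is under test.

```python
import re
import time


def measure(regex_module, pattern, text):
    """Time one module-level findall call; return (elapsed_ms, match_count).

    `regex_module` is any module exposing `findall(pattern, text)`, such as
    the stdlib `re` module. (Hypothetical helper, not taken from benchmark.py.)
    """
    start = time.perf_counter()
    matches = regex_module.findall(pattern, text)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, len(matches)


# Illustrative input; a real run would read a large text file instead.
sample_text = "10.0.0.1 and 192.168.1.254 are addresses; 999 is not. " * 1000
ip_like = r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}"

elapsed_ms, count = measure(re, ip_like, sample_text)
print(f"re: {count} matches in {elapsed_ms:.2f} ms")
```

Passing regexrs (imported as in the example near the top) instead of re should work the same way, assuming its module-level findall accepts the same two arguments. A single pass like this is noisy, so repeated runs would be needed before reading much into the numbers.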

Results as tested on Windows AMD64 with Python 3.12.2, using a PGO-optimized build. Times are in ms (lower is better):

Test     regexrs    re (stdlib)    regex     Compared to re
Email    12.51      354.53         690.15    28.34x faster
URI      4.82       282.69         430.26    58.65x faster
IP       4.71       321.37         25.43     68.23x faster
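
The "Compared to re" column is the stdlib re time divided by the regexrs time. The short sketch below reproduces it from the figures in the table; the dictionary layout and variable names are only illustrative.

```python
# Times in ms, copied from the results table above.
results_ms = {
    "Email": {"regexrs": 12.51, "re": 354.53, "regex": 690.15},
    "URI": {"regexrs": 4.82, "re": 282.69, "regex": 430.26},
    "IP": {"regexrs": 4.71, "re": 321.37, "regex": 25.43},
}

# Speedup factor = baseline time / regexrs time, rounded to two decimals.
speedup_vs_re = {
    name: round(times["re"] / times["regexrs"], 2)
    for name, times in results_ms.items()
}
print(speedup_vs_re)  # {'Email': 28.34, 'URI': 58.65, 'IP': 68.23}
```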

To run the benchmarks yourself:

# be sure to have run `pip install regex-rust` first
# to test regexrs:
python benchmark.py /path/to/input-text.txt

# to test stdlib re:
python benchmark.py /path/to/input-text.txt re

# be sure to have run `pip install regex` first
# to test regex library:
python benchmark.py /path/to/input-text.txt regex

How to install from source

You can use pip to build and install.

pip install .

If you want to build manually:

pip install maturin
maturin build --release

Status

Mostly incomplete and likely very buggy. I am using this mostly as an exercise in creating and distributing Python extensions using Rust and PyO3. It's unclear if this will ever be a particularly useful project or not. If you're looking for a complete and performant regex library for Python today, see the regex project on PyPI.

Differences compared to standard lib:

  • The endpos argument normally found in the re module is not supported in regexrs for the match/search/findall/finditer methods.
  • Some regex features are not supported (because they are not supported by the regex crate), such as lookarounds and backreferences.
  • Not all flags are supported. As of the current release, you may use the flags IGNORECASE, MULTILINE, DOTALL and VERBOSE (or their shorthand equivalents). These are translated to inline flags and prepended to your given pattern.
  • Until a future release, there is no cache to avoid re-compiling the same pattern multiple times.
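
Some of the differences above can be worked around in user code. The following is a minimal sketch under stated assumptions, not how regexrs is implemented: with_inline_flags, cached_compile, and search_with_endpos are hypothetical helper names, and the stdlib re module is used as a stand-in so the snippet runs on its own. It shows (1) what prepending inline flags to a pattern looks like, (2) a user-side compile cache built on functools.lru_cache, and (3) emulating endpos by truncating the input, which the stdlib documents as equivalent to passing endpos.

```python
import functools
import re

# Letters for the four flags listed above. These are the usual inline-flag
# spellings, accepted by both Python's re and Rust's regex crate.
_INLINE_LETTERS = (
    (re.IGNORECASE, "i"),
    (re.MULTILINE, "m"),
    (re.DOTALL, "s"),
    (re.VERBOSE, "x"),
)


def with_inline_flags(pattern, flags=0):
    """Prepend an inline flag group such as (?im) to `pattern`."""
    letters = "".join(letter for flag, letter in _INLINE_LETTERS if flags & flag)
    return f"(?{letters}){pattern}" if letters else pattern


@functools.lru_cache(maxsize=256)
def cached_compile(pattern, flags=0):
    """Compile once per distinct (pattern, flags) pair and reuse the result."""
    return re.compile(with_inline_flags(pattern, flags))


def search_with_endpos(compiled, text, endpos):
    """Emulate the missing endpos argument by truncating the input first."""
    return compiled.search(text[:endpos])


pat = cached_compile(r"hello\s+(\w+)", re.IGNORECASE)
assert cached_compile(r"hello\s+(\w+)", re.IGNORECASE) is pat  # cache hit

m = search_with_endpos(pat, "HELLO rust and more", 10)
print(with_inline_flags(r"a.b", re.IGNORECASE | re.DOTALL))  # (?is)a.b
print(m.group(1))  # rust
```

To try this with regexrs, the re.compile call inside cached_compile would be swapped for the regexrs equivalent. Note that slicing copies the string, so on very large inputs it has a cost that a native endpos argument would not.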
