Strips outputs from Jupyter notebooks

nbstripout-fast

A much faster version of nbstripout, written in Rust (of course). It strips output and metadata from Jupyter notebooks, is very useful as a git filter, and is highly configurable.

Installation

pip install nbstripout-fast

Then use nbstripout-fast anywhere you currently use nbstripout.
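
As an illustration of that swap, the sketch below registers nbstripout-fast as a git clean filter in a throwaway repository. Two things here are assumptions not confirmed by this README: the filter name `nbstripout`, and that nbstripout-fast reads a notebook on stdin and writes the stripped result to stdout when called without file arguments, as nbstripout does. Check the tool's help output before relying on it.

```shell
# Sketch only: work in a scratch repository so no real project is touched.
repo="$(mktemp -d)"
git init -q "$repo"
cd "$repo"

# Assumption: with no file arguments, nbstripout-fast filters stdin to stdout,
# mirroring how nbstripout behaves when used as a git filter.
git config filter.nbstripout.clean "nbstripout-fast"

# Route every notebook through the filter registered above.
printf '*.ipynb filter=nbstripout\n' >> .gitattributes
```

In a real project you would run the last two commands from the repository root. Using `git config --global` instead would register the filter at the user level, leaving only the .gitattributes line to add per project.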

Key differences

  1. While we mirrored most of nbstripout's API, we do not support every nbstripout option.
  2. There is no CLI option to install this in git for you.
  3. We support repository-level settings in a .git-nbconfig.yaml file. Check out our examples. At a high level, you can add a git filter at the site-wide or user level and then let each project enforce its own consistent settings.

Why Rust?

nbstripout is an excellent project, but the Python startup and import time makes it a bit painful to use at scale. Rewriting it in Rust means giving up on using nbconvert under the hood and on validating that the notebook is in the correct format, but it does make things up to 200x faster. This matters when you have a large number of files and the git filter is sometimes called more than once per file. Let's look at the data:

Cells    nbstripout    nbstripout_fast
1        0m0.266s      0m0.003s
10       0m0.258s      0m0.003s
100      0m0.280s      0m0.004s
1000     0m0.372s      0m0.013s
10000    0m1.649s      0m0.133s

The table above shows a large fixed overhead per notebook (mostly Python startup time). When you have 100 or more notebooks, nbstripout takes more than 40s while nbstripout-fast takes only 1s!
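
To make the per-invocation overhead concrete, the ratios in the table above can be recomputed directly. This is a small sketch that uses only the timings listed there; the variable name `speedups` is illustrative.

```shell
# Rows copied from the table: cells, nbstripout seconds, nbstripout-fast seconds.
# awk divides the two timings to get the speedup for each row.
speedups="$(printf '%s\n' \
  '1 0.266 0.003' \
  '10 0.258 0.003' \
  '100 0.280 0.004' \
  '1000 0.372 0.013' \
  '10000 1.649 0.133' |
  awk '{ printf "%s cells: %.1fx\n", $1, $2 / $3 }')"
echo "$speedups"
```

For these rows the ratio falls from roughly 89x for a single cell to roughly 12x for 10,000 cells, which is what you would expect when a fixed startup cost dominates small notebooks.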

Developing

You can use cargo, which builds and runs the CLI in one step:

cargo run -- -t examples/example.ipynb

You can also build with cargo and run the script with the full path:

cargo build # dev build - ./target/debug/nbstripout-fast
cargo build --release # release build - ./target/release/nbstripout-fast

Running unit tests: by default, maturin builds this repo with pyo3 bindings included. This allows us to offer a Python extension mode as well. As of today, we can't have both a binary and an extension, so we use the extension only for testing (issue).

pip install -e .
maturin develop
# -rP shows captured output for passing tests, so RUST_LOG=debug logs are visible
in-venv pytest -rP

Debugging

Use RUST_LOG=debug to enable debug logging, for example:

RUST_LOG=debug cargo run -- --extra-keys "metadata.bar cell.baz" -t foo.ipynb

Releasing

manylinux, macOS, and Windows wheels and the sdist are built by GitHub workflows. Builds are triggered when a pull request is created, when a new release is created, or by a manual workflow dispatch. The wheels and sdist are only uploaded to PyPI when a new release is published. To create a new release:

  1. Create a commit updating the version in Cargo.toml and CHANGELOG.md, then create a git tag and push it:
git tag vX.Y.Z
git push --tags
  2. Draft a new release on GitHub; select the tag that you just created.
  3. Once the new release is created, the wheels and sdist will be built by a GitHub workflow and then uploaded to PyPI automatically using the PYPI_API_TOKEN in the GitHub secrets for the repository.

History

This plugin was contributed back to the community by the D. E. Shaw group.

License

This project is released under a BSD-3-Clause license.

We love contributions! Before you can contribute, please sign and submit this Contributor License Agreement (CLA). This CLA is in place to protect all users of this project.
