
true-north

Beautiful and pythonic benchmarks engine for Python code.

Features:

  • Follows best practices of benchmarking to produce the most reliable results.
  • Detects caching and side-effects.
  • Opcode tracing for reproducible benchmarks.
  • 100% type safe.
  • Zero dependencies.
  • Highly configurable.
  • Nice and colorful output.
  • Ships with CLI to discover and run all benchmarks.

output example

Installation

python3 -m pip install true-north

Usage

import true_north

group = true_north.Group()

@group.add
def math_sorted(r):
    val = [1, 2, 3] * 300
    # the timer starts before entering the loop
    # and stops when leaving it
    for _ in r:
        sorted(val)

# run and print all benchmarks in the group
if __name__ == '__main__':
    group.print()

See the examples directory for more.

Tracing opcodes

If you run CLI with --opcodes or call Group.print with opcodes=True, the output will also include the number of opcodes executed by the benchmark function. The idea is similar to how benchee counts reductions (function calls) for Erlang code. The difference between measuring execution time and executed opcodes is that the latter is reproducible. There are a few catches, though:

  1. Different versions of Python produce different numbers of opcodes. Always run benchmarks on the same Python interpreter.
  2. Tracing opcodes requires true-north to register multiple tracing hooks, which slows down the code execution. It won't affect the timing benchmarks, but it will take more time to run the suite.
  3. More opcodes don't necessarily mean slower code. Different opcodes take different amounts of time to run. In particular, calling a C function (like sorted) is just one opcode. However, if you compare two pure Python functions that don't call anything heavy, opcode counts will roughly correlate with execution time.
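To make the idea concrete, here is a minimal stdlib sketch of opcode counting with sys.settrace and the f_trace_opcodes frame attribute (Python 3.7+). This only illustrates the general technique; it is not true-north's actual implementation:

```python
import sys

def count_opcodes(func, *args):
    """Count opcodes executed by func (illustration only)."""
    counts = {'opcodes': 0}

    def tracer(frame, event, arg):
        # ask the interpreter to emit a trace event per opcode
        frame.f_trace_opcodes = True
        if event == 'opcode':
            counts['opcodes'] += 1
        return tracer

    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return counts['opcodes']

def py_sum(values):
    # a pure-Python function, so every step is a traced opcode
    total = 0
    for v in values:
        total += v
    return total

# the count is deterministic: tracing the same call twice on the
# same interpreter yields the same number
n1 = count_opcodes(py_sum, [1, 2, 3])
n2 = count_opcodes(py_sum, [1, 2, 3])
```

This also shows why tracing slows the suite down: the interpreter calls back into Python for every single opcode executed.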

output example with opcodes

Reading output

Let's take the following benchmark output as an example:

sorting algorithms
  <...omitted some results...>
  heap_sort
    1k   loops, best of 5: 240.487 us ±   4.723 us    x5.68 slower █████
          10_022 ops,   23 ns/op
  • sorting algorithms: the group name.
  • heap_sort: the benchmark name. If not specified explicitly, the benchmark function name is used.
  • 1k loops: each time the benchmarking function was called, the loop in it was executed 1000 times. So, if you defined the function as def heap_sort(r), the r value is an iterable object with 1000 items. By default, this value is adjusted to finish benchmarking in a reasonable time, but you can specify it explicitly using the loops argument.
  • best of 5: the benchmarking function was called 5 times, and the resulting execution time shown on the right is the best result out of these 5 calls. We do that to minimize how CPU usage by other programs on your machine affects the result. It's 5 by default, but you can change it with the repeats argument.
  • 240.487 us: the average execution time of a single loop iteration is about 240 microseconds (us is 1e−6 of a second).
  • ± 4.723 us: the standard deviation across loop iterations is 4.723 microseconds. This is a good value. If it gets close to the average execution time, though, the results aren't reliable. If there was only one loop, the standard deviation is calculated across repeats instead.
  • x5.68 slower: the average execution time is 5.7 times slower than that of the base benchmark. The base benchmark is the first one in the group. It's always a good idea to have a base benchmark to compare other results against. For example, if you compare your library against other libraries, put the benchmark for your library first to see how you're doing compared to the others.
  • █████: a histogram where each block represents one repeat (one call of the benchmarking function). The minimum value is 0 and the maximum value is the slowest repeat. If all blocks are the same size, the results are good. If you see fluctuation in their size, the results aren't so reliable, and something is affecting the benchmarks too much. To fix it, you can try explicitly setting a higher value for the loops argument.
  • 10_022 ops: a single loop executed 10022 opcodes. Read the section above to learn more about opcodes.
  • 23 ns/op: execution of each opcode took on average 23 nanoseconds.
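The loops/repeats scheme described above is the same best-of-N approach used by the stdlib timeit module: run the statement many times per call, repeat the call several times, and report the fastest repeat. A rough stdlib sketch of the idea (not true-north's code):

```python
import timeit

def best_of(stmt, loops=1000, repeats=5):
    """Return the per-iteration time of the fastest repeat.

    Each repeat runs the statement `loops` times. Taking the best
    repeat minimizes interference from other processes, which can
    only slow a run down, never speed it up.
    """
    # timeit.repeat returns `repeats` totals, each over `loops` runs
    totals = timeit.repeat(stmt, number=loops, repeat=repeats)
    return min(totals) / loops

# per-loop time for sorting a 900-element list, analogous to the
# "1k loops, best of 5" figures in the output above
per_loop = best_of(lambda: sorted([1, 2, 3] * 300), loops=100, repeats=5)
```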
