Reduce multiple TensorBoard runs to new event (or CSV) files

TensorBoard Reducer

This project supports Python 3.8+.

This project was inspired by tensorboard-aggregator (a similar project built for TensorFlow rather than PyTorch) and this Stack Overflow answer.

Compute reduced statistics (mean, std, min, max, median or any other numpy operation) of multiple TensorBoard runs matching a directory glob pattern. This can be used e.g. when training multiple identical models (such as deep ensembles) to reduce the noise in their loss/accuracy/error curves and establish statistical significance in performance improvements. The aggregation results can be saved to disk either as new TensorBoard event files or as CSV.
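As a rough illustration of what such a reduction computes (a sketch on made-up numbers, not the package's internal code), the curves of several runs can be stacked into one array and a numpy op applied along the run axis:

```python
import numpy as np

# Hypothetical loss curves of 3 identical models, 4 steps each (made-up values).
runs = np.array(
    [
        [1.0, 0.8, 0.6, 0.5],
        [1.2, 0.9, 0.7, 0.4],
        [1.1, 0.7, 0.5, 0.6],
    ]
)

# Reduce over the run axis (axis=0) to get one value per step for each op.
reduced = {op: getattr(np, op)(runs, axis=0) for op in ("mean", "std", "min", "max")}

for op, curve in reduced.items():
    print(op, np.round(curve, 3))
```

Each reduced curve has one entry per step, so it can be plotted in place of the individual noisy runs.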

Requires PyTorch and TensorBoard. No TensorFlow installation required.

Installation

pip install tensorboard-reducer

Usage

CLI

tb-reducer -i 'glob-pattern/of-dirs-to-reduce*' -o output-dir -r mean,std,min,max

Note: By default, TensorBoard Reducer expects event files containing identical tags and an equal number of steps for all scalars. If, for example, you trained one model for 300 epochs and another for 400 and/or logged different sets of tags, see the flags --lax-steps and --lax-tags to lift this restriction.
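The difference between the strict default and relaxed step handling (--lax-steps) can be pictured with plain Python lists. This is only a sketch of the described behavior on made-up data; `reduce_mean` is a hypothetical helper and not part of tensorboard-reducer:

```python
from statistics import mean

# Made-up accuracy curves: one run logged 3 steps, the other 5.
short_run = [0.50, 0.60, 0.70]
long_run = [0.40, 0.60, 0.80, 0.85, 0.90]


def reduce_mean(runs, lax_steps=False):
    """Hypothetical helper: step-wise mean across runs."""
    if not lax_steps and len({len(run) for run in runs}) > 1:
        # Strict mode: refuse runs with unequal numbers of steps.
        raise ValueError("runs have unequal numbers of steps")
    # zip stops at the shortest run, so extra steps of longer runs are dropped.
    return [mean(values) for values in zip(*runs)]


print(reduce_mean([short_run, long_run], lax_steps=True))
```

With `lax_steps=True` the result has only as many steps as the shortest run; the last two steps of the longer run do not contribute.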

Mean of 3 TensorBoard logs

tb-reducer has the following flags:

  • -i/--indirs-glob (required): Glob pattern of the run directories to reduce. Remember to protect wildcards * with quotes to avoid shell expansion.
  • -o/--outpath (required): File or directory where to save output on disk. Will save as a CSV file if path ends in '.csv' extension or else as TensorBoard run directories, one for each reduce op suffixed by the op's name, e.g. 'outpath-mean', 'outpath-max', etc. If output format is CSV, a single file will be created with two-level header containing one column for each combination of tag and reduce operation. Tag names will be in top-level header, reduce op in second level.
  • -r/--reduce-ops (optional, default: mean): Comma-separated names of numpy reduction ops (mean, std, min, max, ...). Each reduction is written to a separate outpath suffixed by its op name. E.g. if outpath='reduced-run', the mean reduction will be written to 'reduced-run-mean'.
  • -f/--overwrite (optional, default: False): Whether to overwrite existing output directories/CSV files.
  • --lax-tags (optional, default: False): Allow different runs to have different sets of tags. In this mode, each tag reduction will run over as many runs as are available for a given tag, even if that's just one. Proceed with caution, as not all tags will have the same statistics in downstream analysis.
  • --lax-steps (optional, default: False): Allow tags across different runs to have unequal numbers of steps. In this mode, each reduction will only use as many steps as are available in the shortest run (same behavior as zip(short_list, long_list)).
  • --handle-dup-steps (optional, default: None): How to handle duplicate values recorded for the same tag and step in a single run. One of 'keep-first', 'keep-last', 'mean'. 'keep-first'/'keep-last' keeps the first/last occurrence of a duplicate step, while 'mean' computes the mean of the duplicates. The default behavior is to raise an error on duplicate steps.
  • --min-runs-per-step (optional, default: None): Minimum number of runs across which a given step must be recorded to be kept. Steps present in fewer runs are dropped. Only plays a role with --lax-steps (strict_steps=False in the Python API). Warning: With this setting, each step is reduced over a variable number of runs: however many recorded a value for that step, as long as there are at least --min-runs-per-step. In other words, the statistics of a reduction can change mid-run. Say you're plotting the mean of an error curve: the sample size of that mean will drop from, say, 10 down to 4 mid-plot if 4 of your models trained for longer than the rest. Keep this in mind when interpreting the results.
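The duplicate-step and minimum-runs options can be mimicked in pandas. The snippet below is a sketch of the semantics described in the list above on made-up data; all variable names are illustrative and it is not the package's implementation:

```python
import pandas as pd

# Made-up scalar log of a single run in which step 1 was recorded twice.
run = pd.Series([0.9, 0.7, 0.6, 0.5], index=[0, 1, 1, 2])

keep_first = run[~run.index.duplicated(keep="first")]  # like 'keep-first'
keep_last = run[~run.index.duplicated(keep="last")]  # like 'keep-last'
dup_mean = run.groupby(level=0).mean()  # like 'mean'

# Made-up values of one tag across 3 runs; NaN marks steps a run never reached.
nan = float("nan")
runs = pd.DataFrame(
    {
        "run_1": [1.0, 0.8, 0.6, 0.5],
        "run_2": [1.2, 0.9, 0.7, nan],
        "run_3": [1.1, 0.7, nan, nan],
    }
)

min_runs_per_step = 2
# Keep only the steps that at least min_runs_per_step runs recorded.
kept = runs[runs.count(axis=1) >= min_runs_per_step]

# NaNs are skipped, so the sample size behind the mean shrinks from 3 to 2.
print(kept.mean(axis=1))
print(kept.count(axis=1))
```

The second printed series makes the caveat from the warning visible: the mean at the last kept step is based on fewer runs than the mean at the first.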

Note: Use pandas.read_csv("path/to/file.csv", header=[0, 1], index_col=0) to read CSV data back into a multi-index dataframe.
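To show what the resulting multi-index dataframe looks like, the sketch below builds a small stand-in table with a two-level header (tag on top, reduce op below), writes it to an in-memory buffer and reads it back with the call above. The tag names and values are made up, and the buffer merely stands in for a real CSV file:

```python
import io

import pandas as pd

# Stand-in for a reduced CSV: tags in the top header level, reduce ops below.
columns = pd.MultiIndex.from_product([["loss", "accuracy"], ["mean", "std"]])
original = pd.DataFrame(
    [[1.10, 0.08, 0.50, 0.02], [0.80, 0.08, 0.62, 0.03], [0.60, 0.08, 0.71, 0.01]],
    columns=columns,
)

# io.StringIO stands in for "path/to/file.csv".
buffer = io.StringIO()
original.to_csv(buffer)
buffer.seek(0)

df = pd.read_csv(buffer, header=[0, 1], index_col=0)

print(df["loss"])  # all reductions of one tag
print(df[("accuracy", "mean")])  # a single tag/op combination
print(df.xs("mean", axis=1, level=1))  # the mean of every tag
```

Selecting on the second header level with `xs` is handy when only one reduction, such as the mean, is needed for all tags.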

Python API

You can also import tensorboard_reducer into a Python script for more complex operations. A simple example that makes use of the full Python API (load_tb_events, reduce_events, write_csv, write_tb_events) to get you started:

from tensorboard_reducer import load_tb_events, reduce_events, write_csv, write_tb_events

in_dirs_glob = "glob_pattern/of_directories_to_reduce*"
out_dir = "path/to/output_dir"
out_csv = "path/to/out.csv"
overwrite = False
reduce_ops = ["mean", "min", "max"]

# Load all runs matching the glob pattern into a dict keyed by tag name
events_dict = load_tb_events(in_dirs_glob)

# Each value has shape (n_steps, n_runs)
n_steps, n_events = list(events_dict.values())[0].shape
n_scalars = len(events_dict)

print(
    f"Loaded {n_events} TensorBoard runs with {n_scalars} scalars and {n_steps} steps each"
)
for tag in events_dict:
    print(f" - {tag}")

# Apply every reduce op across runs
reduced_events = reduce_events(events_dict, reduce_ops)

for op in reduce_ops:
    print(f"Writing '{op}' reduction to '{out_dir}-{op}'")

# Write the reductions as new TensorBoard event files
write_tb_events(reduced_events, out_dir, overwrite)

# Write the reductions to a CSV file
write_csv(reduced_events, out_csv, overwrite)
