Parse S3 logs to more easily calculate usage metrics per asset.

Project description

DANDI S3 Log Parser


Simple reductions of consolidated S3 logs (the consolidation step is not included in this repository) into the minimal information needed for public sharing and plotting.

Developed for the DANDI Archive.

Usage

To parse all historical logs in a single run (parallelization with roughly 10-15 GB of total RAM is recommended):

parse_all_dandi_raw_s3_logs \
  --base_raw_s3_log_folder_path < base log folder > \
  --parsed_s3_log_folder_path < output folder > \
  --excluded_ips < comma-separated list of known IPs to exclude > \
  --maximum_number_of_workers < number of CPUs to use > \
  --maximum_buffer_size_in_bytes < approximate amount of RAM to use >

For example, on Drogon:

parse_all_dandi_raw_s3_logs \
  --base_raw_s3_log_folder_path /mnt/backup/dandi/dandiarchive-logs \
  --parsed_s3_log_folder_path /mnt/backup/dandi/dandiarchive-logs-cody/parsed_7_13_2024/GET_per_asset_id \
  --excluded_ips < Drogon's IP > \
  --maximum_number_of_workers 3 \
  --maximum_buffer_size_in_bytes 15000000000

To parse only a single log file at a time, such as in a cron job (see the example crontab entry below):

parse_dandi_raw_s3_log \
  --raw_s3_log_file_path < s3 log file path > \
  --parsed_s3_log_folder_path < output folder > \
  --excluded_ips < comma-separated list of known IPs to exclude >
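
For reference, a crontab entry for this command might look like the following; the schedule, paths, and IP are hypothetical placeholders, and a crontab entry must be written on a single line.

# Hypothetical crontab entry: parse the latest log every night at 02:00
# (you may need the full path to the executable, since cron uses a minimal PATH)
0 2 * * * parse_dandi_raw_s3_log --raw_s3_log_file_path /mnt/logs/latest.log --parsed_s3_log_folder_path /mnt/parsed --excluded_ips 192.0.2.1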

Submit line decoding errors

Please email line decoding errors collected from your local configuration file to the core maintainer before raising issues or submitting PRs that contribute them as examples; this makes it easier to correct any details that might require anonymization.

Developer notes

Files with the .log suffix are typically ignored by Git, so when committing changes to the example log collection you will have to forcibly include each example file with

git add -f <example file name>.log
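
For example (the file path below is hypothetical), you can confirm that the ignore rule applies and then override it:

git check-ignore examples/example_file.log   # prints the path if the .log ignore rule matches
git add -f examples/example_file.log         # -f overrides the ignore rule
git commit -m "Add example log file"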


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dandi_s3_log_parser-0.0.1.tar.gz (23.7 kB)

Uploaded Source

Built Distribution

dandi_s3_log_parser-0.0.1-py3-none-any.whl (23.0 kB)

Uploaded Python 3
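
Once downloaded, the built distribution can be installed directly with pip, for example:

pip install dandi_s3_log_parser-0.0.1-py3-none-any.whl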

File details

Details for the file dandi_s3_log_parser-0.0.1.tar.gz.

File metadata

  • Download URL: dandi_s3_log_parser-0.0.1.tar.gz
  • Upload date:
  • Size: 23.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.19

File hashes

Hashes for dandi_s3_log_parser-0.0.1.tar.gz

  • SHA256: ec4d2c0419904bc06d673144a45c95c77c5085ab59740991859858c4a0eccd25
  • MD5: 8a1b2718c60ce6c8e28447d3a35cb5eb
  • BLAKE2b-256: 8c1c50028b8bde1305d2196afa8265092c497b3de269caf666d524948bfb0215

See more details on using hashes here.
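
To verify a downloaded archive against the SHA256 digest above on a system with GNU coreutils, something along these lines should work (assuming the archive is in the current directory):

echo "ec4d2c0419904bc06d673144a45c95c77c5085ab59740991859858c4a0eccd25  dandi_s3_log_parser-0.0.1.tar.gz" | sha256sum --check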

File details

Details for the file dandi_s3_log_parser-0.0.1-py3-none-any.whl.

File metadata

File hashes

Hashes for dandi_s3_log_parser-0.0.1-py3-none-any.whl

  • SHA256: f2989c912c7a24bd3c0a7c899a0e685999b2e8ad225fcd9323012cdac0f8b99e
  • MD5: 7796403a098804f2ce927b7fb315c97d
  • BLAKE2b-256: 033f1681dc8c58d1e9a9292447c92afa19061fb9c073c3d0cb21d502b1a40175

See more details on using hashes here.
