Parse S3 logs to more easily calculate usage metrics per asset.
DANDI S3 Log Parser
Reduces consolidated S3 logs (the consolidation step is not included in this repository) to the minimal information needed for public sharing and plotting.
Developed for the DANDI Archive.
Usage
To parse all historical logs in a single run (parallelization with 10-15 GB of total RAM is recommended):
parse_all_dandi_raw_s3_logs \
--base_raw_s3_log_folder_path < base log folder > \
--parsed_s3_log_folder_path < output folder > \
--excluded_ips < comma-separated list of known IPs to exclude > \
--maximum_number_of_workers < number of CPUs to use > \
--maximum_buffer_size_in_bytes < approximate amount of RAM to use >
For example, on Drogon:
parse_all_dandi_raw_s3_logs \
--base_raw_s3_log_folder_path /mnt/backup/dandi/dandiarchive-logs \
--parsed_s3_log_folder_path /mnt/backup/dandi/dandiarchive-logs-cody/parsed_7_13_2024/GET_per_asset_id \
--excluded_ips < Drogon's IP > \
--maximum_number_of_workers 3 \
--maximum_buffer_size_in_bytes 15000000000
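The buffer size in the example above is 15 GB written out in bytes, using decimal gigabytes (10^9 bytes each). A minimal sketch of that conversion follows; the helper name `gigabytes_to_bytes` is hypothetical and not part of this package.

```python
def gigabytes_to_bytes(gigabytes: float) -> int:
    """Convert decimal gigabytes (10**9 bytes each) to a whole number of bytes."""
    return round(gigabytes * 10**9)


# Hypothetical helper: format the flag for a 15 GB buffer, as in the Drogon example.
buffer_size_in_bytes = gigabytes_to_bytes(15)
print(f"--maximum_buffer_size_in_bytes {buffer_size_in_bytes}")
```

The flag is described as an approximate amount of RAM, so a rounded value like this is likely sufficient.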
To parse a single log file at a time, such as in a cron job:
parse_dandi_raw_s3_log \
--raw_s3_log_file_path < s3 log file path > \
--parsed_s3_log_folder_path < output folder > \
--excluded_ips < comma-separated list of known IPs to exclude >
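When the single-file command is driven from a script, the comma-separated value for --excluded_ips can be assembled from a list. The sketch below only builds the argument list from the flags shown above; the helper name `build_parse_command`, the paths, and the IP addresses (taken from the reserved documentation ranges) are hypothetical.

```python
import shlex


def build_parse_command(raw_log_path, output_folder, excluded_ips):
    """Assemble the argument list for one parse_dandi_raw_s3_log call."""
    return [
        "parse_dandi_raw_s3_log",
        "--raw_s3_log_file_path", str(raw_log_path),
        "--parsed_s3_log_folder_path", str(output_folder),
        # The CLI expects a single comma-separated string of IPs.
        "--excluded_ips", ",".join(excluded_ips),
    ]


# Hypothetical inputs, for illustration only.
command = build_parse_command(
    raw_log_path="/path/to/raw.log",
    output_folder="/path/to/parsed",
    excluded_ips=["192.0.2.1", "198.51.100.7"],
)
print(shlex.join(command))
```

The resulting list could be handed to `subprocess.run(command, check=True)` from a scheduled script, assuming the package is installed in that environment.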
Submit line decoding errors
Please email any line decoding errors collected from your local config file to the core maintainer before raising issues or submitting PRs that contribute them as examples, so that anything requiring anonymization can be corrected first.
Developer notes
.log file suffixes should typically be ignored when working with Git, so when committing changes to the example log collection, you will have to forcibly include each file with
git add -f <example file name>.log
File details
Details for the file dandi_s3_log_parser-0.0.1.tar.gz.
File metadata
- Download URL: dandi_s3_log_parser-0.0.1.tar.gz
- Upload date:
- Size: 23.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | ec4d2c0419904bc06d673144a45c95c77c5085ab59740991859858c4a0eccd25
MD5 | 8a1b2718c60ce6c8e28447d3a35cb5eb
BLAKE2b-256 | 8c1c50028b8bde1305d2196afa8265092c497b3de269caf666d524948bfb0215
File details
Details for the file dandi_s3_log_parser-0.0.1-py3-none-any.whl.
File metadata
- Download URL: dandi_s3_log_parser-0.0.1-py3-none-any.whl
- Upload date:
- Size: 23.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.19
File hashes
Algorithm | Hash digest
---|---
SHA256 | f2989c912c7a24bd3c0a7c899a0e685999b2e8ad225fcd9323012cdac0f8b99e
MD5 | 7796403a098804f2ce927b7fb315c97d
BLAKE2b-256 | 033f1681dc8c58d1e9a9292447c92afa19061fb9c073c3d0cb21d502b1a40175
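A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `matches_expected` are hypothetical, and the local file path is whatever location you downloaded the file to.

```python
import hashlib

# SHA256 digest published above for dandi_s3_log_parser-0.0.1-py3-none-any.whl.
EXPECTED_WHEEL_SHA256 = "f2989c912c7a24bd3c0a7c899a0e685999b2e8ad225fcd9323012cdac0f8b99e"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_expected("dandi_s3_log_parser-0.0.1-py3-none-any.whl", EXPECTED_WHEEL_SHA256)` should return True for an intact download in the current directory.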