A set of tools for estimating LHCb PID efficiencies
PIDCalib2
A set of software tools for estimating LHCb PID efficiencies.
The package includes several user-callable modules:
- make_eff_hists: creates histograms that can be used to estimate the PID efficiency of a user's sample
- ref_calib: calculates the LHCb PID efficiency of a user reference sample
- merge_trees: merges two ROOT files with compatible TTrees
- pklhisto2root: converts pickled boost-histograms to ROOT histograms
Setup
When working on a computer where the LHCb software stack is available (LXPLUS, a university cluster, etc.), one can set up PIDCalib2 by running
lb-conda pidcalib bash
After this, the following commands will be available
pidcalib2.make_eff_hists
pidcalib2.ref_calib
pidcalib2.merge_trees
pidcalib2.pklhisto2root
To run make_eff_hists, you will need access to CERN EOS. You don't need to do anything special on LXPLUS. On other machines, you will usually need to obtain a Kerberos ticket by running
kinit [username]@CERN.CH
Installing from PyPI
The PIDCalib2 package is available on PyPI. It can be installed on any computer via pip by running (preferably in a virtual environment; see venv)
pip install pidcalib2
Note that this will install the xrootd Python bindings. One also has to install XRootD itself for the bindings to work. See this page for XRootD releases and instructions.
make_eff_hists
This module creates histograms that can be used to estimate the PID efficiency of a user's sample.
Reading all the relevant calibration files can take a long time. When running a configuration for the first time, we recommend using the --max-files 1 option. This will limit PIDCalib2 to reading just a single calibration file. Such a test will quickly reveal problems such as missing variables. Keep in mind that you might get a warning about empty bins in the total histogram, since you are reading only a small subset of the calibration data. For the purposes of a quick test, this warning can be safely ignored.
Options
To get a usage message listing all the options, their descriptions, and default values, type
pidcalib2.make_eff_hists --help
The calibration files to be processed are determined by the sample, magnet, and particle options. All the valid combinations can be listed by running
pidcalib2.make_eff_hists --list configs
Aliases for standard variables are defined to simplify the commands. We recommend using only the aliases when specifying variables. A name that isn't an alias is treated as a raw variable name (use with caution), and a warning message like the following will show up in the log.
'probe_PIDK' is not a known PID variable alias, using raw variable
All aliases can be listed by running
pidcalib2.make_eff_hists --list aliases
A file with alternative binnings can be specified using --binning-file. The file must contain valid JSON specifying bin edges. For example, a two-bin binning for particle Pi, variable P can be defined as
{"Pi": {"P": [10000, 15000, 30000]}}
An arbitrary number of binnings can be defined in a single file.
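Such a file can also be generated from Python instead of being written by hand. The following is a minimal sketch using only the standard library. The helper names make_binning_json and write_binning_file, the ETA edges, and the check that edges are strictly increasing are assumptions made for illustration; they are not part of PIDCalib2.

```python
import json


def make_binning_json(binnings):
    """Serialize {particle: {variable: [edges]}} to a JSON string.

    Hypothetical helper: the sanity checks below are assumptions,
    not requirements documented by PIDCalib2.
    """
    for particle, variables in binnings.items():
        for variable, edges in variables.items():
            if len(edges) < 2:
                raise ValueError(f"{particle}/{variable}: need at least two bin edges")
            if any(low >= high for low, high in zip(edges, edges[1:])):
                raise ValueError(f"{particle}/{variable}: edges must be strictly increasing")
    return json.dumps(binnings, indent=2)


def write_binning_file(path, binnings):
    """Write the binnings to a file that can be passed to --binning-file."""
    with open(path, "w") as handle:
        handle.write(make_binning_json(binnings))


# Two binnings in one file: the two-bin P binning from the text,
# plus an ETA binning whose edges are made up for this example.
example = {"Pi": {"P": [10000, 15000, 30000], "ETA": [1.5, 3.0, 5.0]}}
text = make_binning_json(example)
```

Calling write_binning_file("binning.json", example) would produce a file that could then be supplied as --binning-file binning.json.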
Complex cut expressions can be created by chaining simpler expressions using &. One can also use standard mathematical symbols, like *, /, +, -, (, ). Whitespace does not matter.
Examples
- Create a single efficiency histogram for a single PID cut
pidcalib2.make_eff_hists --sample Turbo18 --magnet up --particle Pi --pid-cut "DLLK > 4" --bin-var P --bin-var ETA --bin-var nSPDhits --output-dir pidcalib_output
- Create multiple histograms in one run (most of the time is spent reading in data, so specifying multiple cuts is much faster than running make_eff_hists sequentially)
pidcalib2.make_eff_hists --sample Turbo16 --magnet up --particle Pi --pid-cut "DLLK > 0" --pid-cut "DLLK > 4" --pid-cut "DLLK > 6" --bin-var P --bin-var ETA --bin-var nSPDhits --output-dir pidcalib_output
- Create a single efficiency histogram for a complex PID cut
pidcalib2.make_eff_hists --sample Turbo18 --magnet up --particle Pi --pid-cut "MC15TuneV1_ProbNNp*(1-MC15TuneV1_ProbNNpi)*(1-MC15TuneV1_ProbNNk) < 0.5 & DLLK < 3" --cut "isMuon==0" --bin-var P --bin-var ETA --bin-var nSPDhits --output-dir pidcalib_output
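When many configurations need to be processed, it can be convenient to assemble these command lines in a script. The sketch below builds the argument list for one run with several PID cuts, using only the options that appear in the examples above. The function name build_make_eff_hists_cmd is hypothetical, and the sketch only constructs and prints the command; it does not run it.

```python
import shlex


def build_make_eff_hists_cmd(sample, magnet, particle, pid_cuts, bin_vars,
                             output_dir, cuts=(), max_files=None):
    """Return the argument list for one pidcalib2.make_eff_hists run.

    Hypothetical helper; it only uses options shown in this README.
    """
    cmd = ["pidcalib2.make_eff_hists",
           "--sample", sample, "--magnet", magnet, "--particle", particle]
    for pid_cut in pid_cuts:
        cmd += ["--pid-cut", pid_cut]  # one histogram per PID cut
    for cut in cuts:
        cmd += ["--cut", cut]
    for bin_var in bin_vars:
        cmd += ["--bin-var", bin_var]
    cmd += ["--output-dir", output_dir]
    if max_files is not None:
        cmd += ["--max-files", str(max_files)]  # e.g. 1 for a quick first test
    return cmd


cmd = build_make_eff_hists_cmd(
    "Turbo18", "up", "Pi",
    pid_cuts=["DLLK > 0", "DLLK > 4", "DLLK > 6"],
    bin_vars=["P", "ETA", "nSPDhits"],
    output_dir="pidcalib_output",
    max_files=1,
)
# shlex.join (Python 3.8+) quotes each argument for copy-pasting into a shell.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would execute it without any shell quoting concerns, assuming the pidcalib2 commands are available in the environment.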
ref_calib
This module uses the histograms created by make_eff_hists to assign efficiency to events in a reference sample supplied by the user. Adding the efficiency to the user-supplied file requires PyROOT and is optional.
The module works in two steps:
- Calculate the efficiency and save it as a TTree in a separate file.
- Optionally copy the efficiency TTree to the reference file and make it a friend of the user's TTree. The user must request this step by specifying --merge on the command line.
Be aware that --merge will modify your file. Use with caution.
Options
The sample and magnet options are used solely to select the correct PID efficiency histograms. They should therefore mirror the options used when running make_eff_hists.
bin-vars must be a dictionary that relates the binning variables (or aliases) used to make the efficiency histograms with the variables in the reference sample. We assume that the reference sample branch names have the format [ParticleName]_[VariableName]. E.g., D0_K_calcETA corresponds to a particle named D0_K and variable calcETA. If the user wants to estimate PID efficiency of their sample using 1D binning, where calcETA corresponds to the ETA binning variable alias of the calibration sample, they should specify --bin-vars '{"ETA": "calcETA"}'.
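To make the naming convention concrete, the following sketch derives the reference-sample branch names that a given bin-vars dictionary implies for each particle prefix. The function reference_branches is hypothetical and only illustrates the [ParticleName]_[VariableName] format described above; it is not how PIDCalib2 resolves branches internally.

```python
def reference_branches(particle_prefixes, bin_vars):
    """Map each particle prefix and binning alias to a reference branch name.

    Hypothetical helper illustrating the [ParticleName]_[VariableName] format.
    """
    return {
        prefix: {alias: f"{prefix}_{variable}" for alias, variable in bin_vars.items()}
        for prefix in particle_prefixes
    }


branches = reference_branches(["D0_K"], {"ETA": "calcETA"})
print(branches)  # {'D0_K': {'ETA': 'D0_K_calcETA'}}
```

Comparing the generated names with the branches actually present in the reference file is a quick way to spot a mismatched variable name before starting a run.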
ref-pars must be a dictionary of particles from the reference sample to apply cuts to. The keys represent the particle branch name prefix (D0_K in the previous example), and the values passed are a list containing particle type and PID cut, e.g. '{"D0_K" : ["K", "DLLK > 4"], "D0_Pi" : ["Pi", "DLLK < 4"]}'.
The --merge option will copy the PID efficiency tree to your input file and make the PID efficiency tree a "Friend" of your input tree. Then you can treat your input tree as if it had the PID efficiency branches itself. E.g., input_tree->Draw("PIDCalibEff") should work. ROOT's "Friend" mechanism is an efficient way to add branches from one tree to another. Take a look here if you would like to know more.
Examples
- Evaluate efficiency of a single PID cut and save it to user_ntuple_PID_eff.root without adding it to user_ntuple.root
python -m pidcalib2.ref_calib --sample Turbo18 --magnet up --ref-file data/user_ntuple.root --output-dir pidcalib_output --bin-vars '{"P": "mom", "ETA": "Eta", "nSPDHits": "nSPDhits"}' --ref-pars '{"Bach": ["K", "DLLK > 4"]}'
- Evaluate efficiency of a single PID cut and add it to the reference file user_ntuple.root
python -m pidcalib2.ref_calib --sample Turbo18 --magnet up --ref-file data/user_ntuple.root --output-dir pidcalib_output --bin-vars '{"P": "mom", "ETA": "Eta", "nSPDHits": "nSPDhits"}' --ref-pars '{"Bach": ["K", "DLLK > 4"]}' --merge
- Evaluate efficiency of multiple PID cuts and add them to the reference file
python -m pidcalib2.ref_calib --sample Turbo18 --magnet up --ref-file data/user_ntuple.root --output-dir pidcalib_output --bin-vars '{"P": "P", "ETA": "ETA", "nSPDHits": "nSPDHits"}' --ref-pars '{"Bach": ["K", "DLLK > 4"], "SPi": ["Pi", "DLLK < 0"]}' --merge
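Writing the JSON dictionaries for --bin-vars and --ref-pars by hand inside shell quotes is easy to get wrong. As a sketch, the dictionaries can be kept as Python objects and serialized with json.dumps when the command is assembled. The function name build_ref_calib_cmd is hypothetical; the options are the ones used in the examples above.

```python
import json
import shlex


def build_ref_calib_cmd(sample, magnet, ref_file, output_dir,
                        bin_vars, ref_pars, merge=False):
    """Return the argument list for one pidcalib2.ref_calib run.

    Hypothetical helper; bin_vars and ref_pars are plain dicts that are
    serialized to the JSON strings expected on the command line.
    """
    cmd = ["python", "-m", "pidcalib2.ref_calib",
           "--sample", sample, "--magnet", magnet,
           "--ref-file", ref_file, "--output-dir", output_dir,
           "--bin-vars", json.dumps(bin_vars),
           "--ref-pars", json.dumps(ref_pars)]
    if merge:
        cmd.append("--merge")  # this modifies the reference file
    return cmd


ref_cmd = build_ref_calib_cmd(
    "Turbo18", "up", "data/user_ntuple.root", "pidcalib_output",
    bin_vars={"P": "mom", "ETA": "Eta"},
    ref_pars={"Bach": ["K", "DLLK > 4"], "SPi": ["Pi", "DLLK < 0"]},
    merge=True,
)
# shlex.join (Python 3.8+) adds the shell quoting around the JSON strings.
print(shlex.join(ref_cmd))
```

Leaving merge at its default of False keeps the reference file untouched, which matches the first example above.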
Caveats
You might notice that some of the events in your reference sample are assigned a value of -999 for PIDCalibEff, PIDCalibErr, or both.
PIDCalibEff is -999 when, for at least one particle:
- The event is out of range
- The relevant bin in the efficiency histogram has no events whatsoever
PIDCalibErr is -999 when, for at least one particle:
- The event is out of range
- The relevant bin in the efficiency histogram has no events whatsoever
- The relevant bin in the efficiency histogram has no events passing PID cuts
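The rules above can be illustrated with a simplified, one-dimensional lookup for a single particle. This is a sketch and not PIDCalib2's actual implementation: the function lookup_efficiency, the per-bin count lists, the half-open bin convention, and the simple binomial error formula are all assumptions chosen to show when the -999 sentinel would appear.

```python
from bisect import bisect_right
from math import sqrt

SENTINEL = -999.0


def lookup_efficiency(value, edges, n_total, n_passed):
    """Return (efficiency, error) for one particle in a 1D binning.

    Assumed inputs: `edges` are increasing bin edges, `n_total` and
    `n_passed` hold per-bin calibration counts before and after the PID cut.
    """
    # Out of range (bins assumed half-open: low <= value < high).
    if value < edges[0] or value >= edges[-1]:
        return SENTINEL, SENTINEL
    index = bisect_right(edges, value) - 1
    total, passed = n_total[index], n_passed[index]
    # Bin with no calibration events at all: nothing can be estimated.
    if total == 0:
        return SENTINEL, SENTINEL
    efficiency = passed / total
    # No events passing the PID cut: efficiency is defined, error is flagged.
    if passed == 0:
        return efficiency, SENTINEL
    # Simple binomial error, used here only as an illustrative assumption.
    error = sqrt(efficiency * (1.0 - efficiency) / total)
    return efficiency, error


edges = [10000, 15000, 30000]
in_range = lookup_efficiency(12000, edges, [100, 50], [80, 0])
out_of_range = lookup_efficiency(5000, edges, [100, 50], [80, 0])
none_passing = lookup_efficiency(20000, edges, [100, 50], [80, 0])
empty_bin = lookup_efficiency(20000, edges, [100, 0], [80, 0])
```

In an analysis, events carrying the sentinel in either branch would typically need to be excluded or handled separately before averaging efficiencies, since -999 is a flag rather than a physical value.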
Development
- Clone the repository from GitLab
- (Optional) Set up a virtual environment
python3 -m venv .venv
source .venv/bin/activate
- Install pinned dependencies
pip install -r requirements-dev.txt
- Install xrootd (possibly manually; see this issue)
- Run the tests
pytest
- Run the modules
python3 -m src.pidcalib2.make_eff_hists -h
Tips
Certain tests can be excluded like this
pytest -m "not xrootd"
See available tags in the src/pidcalib2/tests/test_*.py files.