Submit Functional Queries to a ServiceX endpoint.

func_adl_servicex

Send func_adl expressions to a ServiceX endpoint

Introduction

This package contains the objects ServiceXSourceXAOD and ServiceXSourceUpROOT, which can be used as the root of a func_adl expression to query large LHC datasets from an active ServiceX instance located on the net.

See below for simple examples.

Further Information

  • servicex documentation
  • func_adl documentation

Usage

To use func_adl on servicex, this is the only func_adl package you need to install. All other required packages will be pulled in as dependencies of this one.

Using the xAOD backend

See the documentation linked under Further Information above to understand how this works. Here is a quick sample that runs against an ATLAS xAOD backend in servicex and returns the pT of every jet with pT > 30 GeV.

from func_adl_servicex import ServiceXSourceXAOD

dataset_xaod = "mc15_13TeV:mc15_13TeV.361106.PowhegPythia8EvtGen_AZNLOCTEQ6L1_Zee.merge.DAOD_STDM3.e3601_s2576_s2132_r6630_r6264_p2363_tid05630052_00"
ds = ServiceXSourceXAOD(dataset_xaod)
data = (
    ds
    .SelectMany('lambda e: (e.Jets("AntiKt4EMTopoJets"))')
    .Where('lambda j: (j.pt()/1000)>30')
    .Select('lambda j: j.pt()')
    .AsAwkwardArray(["JetPt"])
    .value()
)

print(data['JetPt'])
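
To make the three query steps concrete, here is a minimal pure-Python sketch of what SelectMany, Where, and Select do conceptually. It runs on made-up in-memory values rather than a ServiceX dataset, and it is not how the backend is implemented. The `events` list and its numbers are assumptions for illustration; the division by 1000 mirrors the query above, which suggests pT is stored in MeV.

```python
# Toy in-memory events; the jet pt values (in MeV) are made up for illustration.
events = [
    {"jets": [45_000.0, 28_000.0]},
    {"jets": []},
    {"jets": [31_500.0, 120_000.0, 12_000.0]},
]

# SelectMany: flatten the per-event jet lists into one sequence of jets.
all_jets = [pt for event in events for pt in event["jets"]]

# Where: keep jets whose pt, converted from MeV to GeV, is above 30.
selected = [pt for pt in all_jets if pt / 1000 > 30]

# Select: project out the quantity to return (the pt itself, still in MeV).
jet_pt = [pt for pt in selected]

print(jet_pt)  # [45000.0, 31500.0, 120000.0]
```

Note that the cut is applied in GeV but the returned values are not divided by 1000, just as in the query above.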

Using the CMS Run 1 AOD backend

See the documentation linked under Further Information above to understand how this works. Here is a quick sample that runs against a CMS Run 1 AOD backend in servicex. It runs against a 6 TB CMS Open Data dataset, selecting global muons with a pT greater than 30 GeV.

from func_adl_servicex import ServiceXSourceCMSRun1AOD

dataset_xaod = "cernopendata://16"
ds = ServiceXSourceCMSRun1AOD(dataset_xaod)
data = (
    ds
    .SelectMany(lambda e: e.TrackMuons("globalMuons"))
    .Where(lambda m: m.pt() > 30)
    .Select(lambda m: m.pt())
    .AsAwkwardArray(['mu_pt'])
    .value()
)

print(data['mu_pt'])
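
The xAOD sample passes its lambdas as strings, while this sample passes Python lambdas. A string lambda is ordinary Python source, which the standard-library `ast` module can show. This is only a sketch of the idea that both forms describe the same expression; how func_adl processes them internally is not shown here.

```python
import ast

# The muon selection from the sample above, written as a string lambda.
source = "lambda m: m.pt() > 30"

# Parse the string as a single Python expression.
tree = ast.parse(source, mode="eval")
lambda_node = tree.body

print(type(lambda_node).__name__)               # Lambda
print([a.arg for a in lambda_node.args.args])   # ['m']
print(type(lambda_node.body).__name__)          # Compare
```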

Using the uproot backend

See the documentation linked under Further Information above to understand how this works. Here is a quick sample that runs against a ROOT file (TTree) in the uproot backend in servicex and returns the pT of the two leptons. Note that the image name tag is likely wrong here. See XXX to get the current one.

from servicex import ServiceXDataset
from func_adl_servicex import ServiceXSourceUpROOT


dataset_uproot = "user.kchoi:user.kchoi.ttHML_80fb_ttbar"
uproot_transformer_image = "sslhep/servicex_func_adl_uproot_transformer:issue6"

sx_dataset = ServiceXDataset(dataset_uproot, image=uproot_transformer_image)
ds = ServiceXSourceUpROOT(sx_dataset, "nominal")
data = (
    ds
    # Return two named columns per event from the "nominal" tree
    .Select(lambda e: {
        'lep_pt_1': e.lep_Pt_1,
        'lep_pt_2': e.lep_Pt_2,
    })
    .value()
)

print(data)
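
The Select above maps each event to a dictionary, so each key becomes a named column. The following pure-Python sketch illustrates that mapping on made-up rows. The `events` values are assumptions, and the columnar layout of the result is also an assumption for illustration, not a statement of what `.value()` returns.

```python
# Toy events standing in for rows of the "nominal" tree; values are made up.
events = [
    {"lep_Pt_1": 52.1, "lep_Pt_2": 33.4},
    {"lep_Pt_1": 47.9, "lep_Pt_2": 21.0},
]


def select(e):
    """Mimic the query's lambda: rename two branches into output columns."""
    return {"lep_pt_1": e["lep_Pt_1"], "lep_pt_2": e["lep_Pt_2"]}


# Apply the selection per event, then gather the values column by column.
rows = [select(e) for e in events]
columns = {name: [row[name] for row in rows] for name in rows[0]}

print(columns)  # {'lep_pt_1': [52.1, 47.9], 'lep_pt_2': [33.4, 21.0]}
```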

Running on Local Datasets

It is possible to run on local files. This works well when testing or building out your code, but it does not scale to a large number of files, so it is recommended to use this only with a single file. It is, for the most part, a drop-in replacement for the ServiceX backend version.

First, you must install the local variant of func_adl_servicex. If you are using pip, you can do the following:

pip install func_adl_servicex[local]

With that installed, the following will work:

from func_adl_servicex import SXLocalxAOD

dataset_xaod = "my_local_xaod.root"
ds = SXLocalxAOD(dataset_xaod)
data = (ds
    .SelectMany('lambda e: (e.Jets("AntiKt4EMTopoJets"))')
    .Where('lambda j: (j.pt()/1000)>30')
    .Select('lambda j: j.pt()')
    .AsAwkwardArray(["JetPt"])
    .value()
)

print(data['JetPt'])

To use the CMS backend, replace SXLocalxAOD with SXLocalCMSRun1AOD (and, of course, update the query).
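
Because the local source is described as a drop-in replacement, one way to organize code is to build the query in a function that accepts any source object. The sketch below does that for the jet query used above. `jet_pt_query` and `RecordingSource` are hypothetical names introduced here; `RecordingSource` is a stand-in that only records the chained method names so the sketch runs without ServiceX or a local file.

```python
def jet_pt_query(ds):
    """Build the jet-pt query on any func_adl-style source object."""
    return (
        ds
        .SelectMany('lambda e: (e.Jets("AntiKt4EMTopoJets"))')
        .Where('lambda j: (j.pt()/1000)>30')
        .Select('lambda j: j.pt()')
        .AsAwkwardArray(["JetPt"])
    )


class RecordingSource:
    """Hypothetical stand-in that only records which methods were chained."""

    def __init__(self, calls=()):
        self.calls = tuple(calls)

    def __getattr__(self, name):
        # Called only for attributes that do not exist, i.e. the query methods.
        def method(*args):
            return RecordingSource(self.calls + (name,))
        return method


query = jet_pt_query(RecordingSource())
print(query.calls)  # ('SelectMany', 'Where', 'Select', 'AsAwkwardArray')
```

With a real source, the same function would be called as `jet_pt_query(SXLocalxAOD(dataset_xaod)).value()` or `jet_pt_query(ServiceXSourceXAOD(dataset_xaod)).value()`, following the samples above.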

Development

PRs are welcome! Feel free to open an issue for new features or questions.

The master branch contains the most recent commits that both pass all tests and are slated for the next release. Releases are tagged. Modifications to any released version are made off its tag.

Qastle

This is for people working with the back-ends that run in servicex.

This is the qastle produced for an xAOD dataset:

(call EventDataset 'ServiceXDatasetSource')

(The actual dataset name is passed in the servicex web API call.)

This is the qastle produced for a ROOT flat file:

(call EventDataset 'ServiceXDatasetSource' 'tree_name')
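
For back-end authors who want to compare against these two strings in tests, here is a small sketch that reproduces them with plain string formatting. `event_dataset_qastle` is a hypothetical helper written for this example; it is not the code this package uses to generate qastle.

```python
def event_dataset_qastle(tree_name=None):
    """Return the EventDataset qastle text shown above.

    With no tree name this is the xAOD form; with a tree name it is the
    ROOT flat-file form.
    """
    parts = ["call", "EventDataset", "'ServiceXDatasetSource'"]
    if tree_name is not None:
        parts.append(f"'{tree_name}'")
    return "(" + " ".join(parts) + ")"


print(event_dataset_qastle())             # (call EventDataset 'ServiceXDatasetSource')
print(event_dataset_qastle("tree_name"))  # (call EventDataset 'ServiceXDatasetSource' 'tree_name')
```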
