Functional Analysis Description Language - Backend AST Manipulation Packages

func_adl

Construct hierarchical data queries using SQL-like concepts in python.

func_adl uses an SQL-like language to extract data and computed values from a ROOT file or an ATLAS xAOD file, and returns them in a columnar format. It is currently a central part of two of the ServiceX transformers.

This is the base package that has the backend-agnostic code to query hierarchical data. In all likelihood you will want to install one of the following packages:

  • func_adl_xAOD: for running on ATLAS and CMS experiment xAOD files hosted in ServiceX
  • func_adl_uproot: for running on flat ROOT files
  • func_adl.xAOD.backend: for running on a local file using docker

See the documentation for more information on what expressions and capabilities are possible in each of these backends.

Extensibility

There are several extensibility points:

  • EventDataset should be sub-classed to provide an executor.
  • EventDataset can use Python's type system to allow for editors and other intelligent typing systems to type check expressions. The more type data present, the more the system can help.
  • It is possible to insert a call back at a function or method call site that will allow for modification of the ObjectStream or the call site's ast.

EventDataset

An example EventDataset:

import ast
import asyncio
from typing import Optional
from func_adl import EventDataset

class events(EventDataset):
    async def execute_result_async(self, a: ast.AST, title: Optional[str] = None):
        await asyncio.sleep(0.01)
        return a

and some func_adl code that uses it:

    r = (events()
         .SelectMany(lambda e: e.Jets('jets'))
         .Select(lambda j: j.eta())
         .value())

  • When the .value() method is invoked, execute_result_async is called with a complete ast representing the query. This is the point at which one would send the query to the backend for processing.
  • Normally, the constructor of events would take in the name of the dataset to be processed, which could then be used in execute_result_async.
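
The constructor pattern described in the last bullet can be sketched as follows. This is a minimal illustration that uses only the standard library: events_sketch is a hypothetical stand-in that does not derive from EventDataset, the dataset name "my_dataset" is made up, and the returned dictionary is an assumed request format rather than anything func_adl defines. A real subclass would forward the request to its backend instead of returning it.

```python
import ast
import asyncio
from typing import Any, Dict, Optional


class events_sketch:
    """Hypothetical stand-in; a real class would derive from EventDataset."""

    def __init__(self, dataset_name: str):
        # Remember which dataset the query should run against.
        self._dataset_name = dataset_name

    async def execute_result_async(
        self, a: ast.AST, title: Optional[str] = None
    ) -> Dict[str, Any]:
        # A real backend would submit this request; here it is simply returned.
        return {
            "dataset": self._dataset_name,
            "title": title,
            "query": ast.dump(a),
        }


# Build a small ast by hand to stand in for the query that .value() would pass.
query = ast.parse("e.Jets('jets')", mode="eval").body
request = asyncio.run(events_sketch("my_dataset").execute_result_async(query))
print(request["dataset"])
```

Storing the name in the constructor keeps the query chain itself free of backend details: the same Select and SelectMany calls can be reused against a different dataset by constructing a different object.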

Typing EventDataset

With a minor change to the declaration above, and no change to the query, the dataset can carry type information:

class dd_jet:
    def pt(self) -> float:
        ...

    def eta(self) -> float:
        ...

class dd_event:
    def Jets(self, bank: str) -> Iterable[dd_jet]:
        ...
    
    def EventNumber(self, bank='default') -> int:
        ...

class events(EventDataset[dd_event]):
    async def execute_result_async(self, a: ast.AST, title: Optional[str] = None):
        await asyncio.sleep(0.01)
        return a

This is not required, but when this is done:

  • Editors with reasonable type-checking built in can offer completions and suggestions for the query expressions.
  • If a required argument is omitted, an error is generated.
  • If an argument with a default value is omitted, the default is filled in automatically.

It should be noted that the type and expression follower is not very sophisticated! While it can follow method calls, it won't follow much else!
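
To make the argument handling described above concrete, here is a minimal sketch of how defaults could be filled in at a call site using only the standard library (Python 3.9 or later, for ast.unparse). The helper fill_defaults is hypothetical and is not func_adl's implementation; it assumes the call is a simple method call with literal arguments.

```python
import ast
import inspect
from typing import Iterable


class dd_jet:
    def pt(self) -> float: ...
    def eta(self) -> float: ...


class dd_event:
    def Jets(self, bank: str) -> Iterable[dd_jet]: ...
    def EventNumber(self, bank: str = 'default') -> int: ...


def fill_defaults(call: ast.Call, cls: type) -> ast.Call:
    """Hypothetical helper: make defaulted arguments explicit in `call`."""
    # Assumes the call has the form `obj.method(...)`.
    method = getattr(cls, call.func.attr)
    sig = inspect.signature(method)
    # Only literal arguments are supported in this sketch.
    args = [ast.literal_eval(arg) for arg in call.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    # None stands in for `self`; bind raises TypeError if a required
    # argument is missing.
    bound = sig.bind(None, *args, **kwargs)
    bound.apply_defaults()
    values = list(bound.arguments.values())[1:]  # drop `self`
    return ast.Call(
        func=call.func,
        args=[ast.Constant(value=v) for v in values],
        keywords=[],
    )


call = ast.parse("e.EventNumber()", mode="eval").body
print(ast.unparse(fill_defaults(call, dd_event)))  # e.EventNumber('default')
```

Calling fill_defaults on a parsed `e.Jets()` raises TypeError, because bank has no default in the declaration.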

Type-based callbacks

By adding a function and a reference to it in the type system, arbitrary code can be executed while the func_adl query is traversed. Keeping the query and the events definition the same, we can add the info directly to the Python type declarations:

import ast
from typing import Tuple, TypeVar
from func_adl import ObjectStream

T = TypeVar('T')

def add_md_for_type(s: ObjectStream[T], a: ast.Call) -> Tuple[ObjectStream[T], ast.AST]:
    return s.MetaData({'hi': 'there'}), a


class dd_event:
    _func_adl_type_info = add_md_for_type

    def Jets(self, bank: str) -> Iterable[dd_jet]:
        ...

  • When the .Jets() method is processed, add_md_for_type is called with the current object stream and the call site's ast.
  • add_md_for_type here adds metadata and returns the updated stream and ast.
  • Nothing prevents the function from parsing the AST, removing or adding arguments, adding more complex metadata, or doing any of this depending on the arguments in the call site.
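
As an illustration of the last point, a callback can inspect the call site's arguments before deciding what metadata to add. The sketch below runs with the standard library only: StandInStream is a hypothetical stand-in for ObjectStream that merely records metadata, and add_md_for_bank and the "bank" metadata key are invented for this example.

```python
import ast
from typing import Any, Dict, List, Optional, Tuple


class StandInStream:
    """Hypothetical stand-in for ObjectStream: it only records metadata."""

    def __init__(self, metadata: Optional[List[Dict[str, Any]]] = None):
        self.metadata = list(metadata or [])

    def MetaData(self, md: Dict[str, Any]) -> "StandInStream":
        # Return a new stream instead of mutating the current one.
        return StandInStream(self.metadata + [md])


def add_md_for_bank(
    s: StandInStream, a: ast.Call
) -> Tuple[StandInStream, ast.AST]:
    """Callback sketch: record which bank the call site asked for."""
    if a.args and isinstance(a.args[0], ast.Constant):
        s = s.MetaData({"bank": a.args[0].value})
    # The ast is returned unchanged here; a callback could also rewrite it.
    return s, a


call = ast.parse("e.Jets('jets')", mode="eval").body
stream, new_call = add_md_for_bank(StandInStream(), call)
print(stream.metadata)  # [{'bank': 'jets'}]
```

With the real library, the same function body would take and return an ObjectStream, as in add_md_for_type above.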

Development

After a new release has been built and passes the tests, you can release it by creating a new release on GitHub. An action that runs when a release is "created" will send it to PyPI.
