Skip to main content

Adjustment Identification Distance: A ๐š๐šŠ๐š๐š“๐š’๐š for Causal Structure Learning

Project description

Adjustment Identification Distance: A ๐š๐šŠ๐š๐š“๐š’๐š for Causal Structure Learning

This is an early release of ๐š๐šŠ๐š๐š“๐š’๐š ๐Ÿฅ and feedback is very welcome! Just open an issue on github.

If you publish research using ๐š๐šŠ๐š๐š“๐š’๐š, please cite our article

@article{henckel2024adjustment,
    title = {{Adjustment Identification Distance: A gadjid for Causal Structure Learning}},
    author = {Leonard Henckel and Theo Wรผrtzen and Sebastian Weichwald},
    journal = {{arXiv preprint arXiv:2402.08616}},
    year = {2024},
    doi = {10.48550/arXiv.2402.08616},
}

Get Started Real Quick ๐Ÿš€ โ€“ Introductory Example

Just pip install gadjid to install the latest release of ๐š๐šŠ๐š๐š“๐š’๐š
and run python -c "import gadjid; help(gadjid)" to get started (or see install alternatives).

import gadjid
from gadjid import example, ancestor_aid, oset_aid, parent_aid, shd
import numpy as np

help(gadjid)

example.run_parent_aid()

Gtrue = np.array([
    [0, 1, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0]
], dtype=np.int8)
Gguess = np.array([
    [0, 0, 1, 1, 1],
    [1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0]
], dtype=np.int8)

print(ancestor_aid(Gtrue, Gguess, edge_direction="from row to column"))
print(shd(Gtrue, Gguess))

๐š๐šŠ๐š๐š“๐š’๐š is implemented in Rust and can conveniently be called from Python via our Python wrapper (implemented using maturin and PyO3).

Evaluating graphs learned by causal discovery algorithms is difficult: The number of edges that differ between two graphs does not reflect how the graphs differ with respect to the identifying formulas they suggest for causal effects. We introduce a framework for developing causal distances between graphs which includes the structural intervention distance for directed acyclic graphs as a special case. We use this framework to develop improved adjustment-based distances as well as extensions to completed partially directed acyclic graphs and causal orders. We develop new reachability algorithms to compute the distances efficiently and to prove their low polynomial time complexity. In our package gadjid, we provide implementations of our distances; they are orders of magnitude faster with proven lower time complexity than the structural intervention distance and thereby provide a success metric for causal discovery that scales to graph sizes that were previously prohibitive.

Parallelism โ€“ setting the number of threads

๐š๐šŠ๐š๐š“๐š’๐š uses rayon for parallelism using, per default, as many threads as there are physical CPU cores. The number of threads to use can be set via the environment variable RAYON_NUM_THREADS. We recommend to do so and to set the number of threads manually, not least to be explicit and to avoid the small runtime overhead for determining the number of physical CPU cores.

Implemented Distances

  • ancestor_aid(Gtrue, Gguess, edge_direction)
  • oset_aid(Gtrue, Gguess, edge_direction)
  • parent_aid(Gtrue, Gguess, edge_direction)
  • for convenience, the following distances are implemented, too
    • shd(Gtrue, Gguess)
    • sid(Gtrue, Gguess, edge_direction) โ€“ only for DAGs!

where Gtrue and Gguess are adjacency matrices of a DAG or CPDAG and edge_direction determines whether a 1 at r-th row and c-th column of an adjacency matrix codes the edge r โ†’ c (edge_direction="from row to column") or c โ†’ r (edge_direction="from column to row"). The functions are not symmetric in their inputs: To calculate a distance, identifying formulas for causal effects are inferred in the graph Gguess and verified against the graph Gtrue. Distances return a tuple (normalised_distance, mistake_count) of the fraction of causal effects inferred in Gguess that are wrong relative to Gtrue, normalised_distance, and the number of wrongly inferred causal effects, mistake_count. There are $p(p-1)$ pairwise causal effects to infer in graphs with $p$ nodes and we define normalisation as normalised_distance = mistake_count / p(p-1).

You may also calculate the SID between DAGs via parent_aid(DAGtrue, DAGguess, edge_direction), but we recommend ancestor_aid and oset_aid and for CPDAG inputs the parent_aid does not coincide with the SID (see also our accompanying article).

If edge_direction="from row to column", then a 1 in row r and column c codes a directed edge r โ†’ c; if edge_direction="from column to row", then a 1 in row r and column c codes a directed edge c โ†’ r; for either setting of edge_direction, a 2 in row r and column c codes an undirected edge r โ€“ c (an additional 2 in row c and column r is ignored; one of the two entries is sufficient to code an undirected edge).

An adjacency matrix for a DAG may only contain 0s and 1s. An adjacency matrix for a CPDAG may only contain 0s, 1s and 2s. DAG and CPDAG inputs are validated for acyclicity. However, for CPDAG inputs, the user needs to ensure the adjacency matrix indeed codes a valid CPDAG (instead of just a PDAG).

Empirical Runtime Analysis

Experiments run on a laptop with 8 GB RAM and 4-core i5-8365U processor. Here, for a graph with $p$ nodes, sparse graphs have $10p$ edges in expectation, dense graphs have $0.3p(p-1)/2$ edges in expectation, and x-sparse graphs have $0.75p$ edges in expectation.

Maximum graph size feasible within 1 minute

Method sparse dense
Parent-AID 13601 962
Ancestor-AID 8211 932
Oset-AID 1105 508
SID in R 256 239

Results obtained with ๐š๐šŠ๐š๐š“๐š’๐š v0.1.0 using the Python interface and the SID R package v1.1 from CRAN.

Average runtime

Method x-sparse ($p=1000$) sparse ($p=256$) dense ($p=239$)
Parent-AID 7.3 ms 30.5 ms 173 ms
Ancestor-AID 3.4 ms 40.9 ms 207 ms
Oset-AID 5.0 ms 567 ms 1.68 s
SID in R ~1โ€“2 h ~60 s ~60 s

Results obtained with ๐š๐šŠ๐š๐š“๐š’๐š v0.1.0 using the Python interface and the SID R package v1.1 from CRAN.

LICENSE

๐š๐šŠ๐š๐š“๐š’๐š is available in source code form at https://github.com/CausalDisco/gadjid.

This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at https://mozilla.org/MPL/2.0/.

See also the MPL-2.0 FAQ.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gadjid-0.1.0.tar.gz (45.5 kB view details)

Uploaded Source

Built Distributions

gadjid-0.1.0-pp310-pypy310_pp73-win_amd64.whl (230.8 kB view details)

Uploaded PyPy Windows x86-64

gadjid-0.1.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (368.9 kB view details)

Uploaded PyPy manylinux: glibc 2.17+ x86-64

gadjid-0.1.0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (370.8 kB view details)

Uploaded PyPy manylinux: glibc 2.17+ ARM64

gadjid-0.1.0-pp310-pypy310_pp73-macosx_11_0_arm64.whl (310.2 kB view details)

Uploaded PyPy macOS 11.0+ ARM64

gadjid-0.1.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl (324.4 kB view details)

Uploaded PyPy macOS 10.12+ x86-64

gadjid-0.1.0-pp39-pypy39_pp73-win_amd64.whl (230.8 kB view details)

Uploaded PyPy Windows x86-64

gadjid-0.1.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (368.9 kB view details)

Uploaded PyPy manylinux: glibc 2.17+ x86-64

gadjid-0.1.0-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (370.8 kB view details)

Uploaded PyPy manylinux: glibc 2.17+ ARM64

gadjid-0.1.0-pp39-pypy39_pp73-macosx_11_0_arm64.whl (310.2 kB view details)

Uploaded PyPy macOS 11.0+ ARM64

gadjid-0.1.0-pp39-pypy39_pp73-macosx_10_12_x86_64.whl (324.4 kB view details)

Uploaded PyPy macOS 10.12+ x86-64

gadjid-0.1.0-cp38-abi3-win_arm64.whl (220.1 kB view details)

Uploaded CPython 3.8+ Windows ARM64

gadjid-0.1.0-cp38-abi3-win_amd64.whl (232.2 kB view details)

Uploaded CPython 3.8+ Windows x86-64

gadjid-0.1.0-cp38-abi3-musllinux_1_1_x86_64.whl (539.3 kB view details)

Uploaded CPython 3.8+ musllinux: musl 1.1+ x86-64

gadjid-0.1.0-cp38-abi3-musllinux_1_1_aarch64.whl (549.3 kB view details)

Uploaded CPython 3.8+ musllinux: musl 1.1+ ARM64

gadjid-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (369.4 kB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.17+ x86-64

gadjid-0.1.0-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (371.4 kB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.17+ ARM64

gadjid-0.1.0-cp38-abi3-macosx_11_0_arm64.whl (311.1 kB view details)

Uploaded CPython 3.8+ macOS 11.0+ ARM64

gadjid-0.1.0-cp38-abi3-macosx_10_12_x86_64.whl (324.7 kB view details)

Uploaded CPython 3.8+ macOS 10.12+ x86-64

File details

Details for the file gadjid-0.1.0.tar.gz.

File metadata

  • Download URL: gadjid-0.1.0.tar.gz
  • Upload date:
  • Size: 45.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.7.0

File hashes

Hashes for gadjid-0.1.0.tar.gz
Algorithm Hash digest
SHA256 3a74c07960ee92b42ffc342dfd25a3c48cbbc2a375881401f83b223a8af93959
MD5 1f0b11794087c811f561c627bc095f7b
BLAKE2b-256 fedbf85da1d0aeaf34d3c5ff00485c35f359cc96c4ca9263e42b1a4fc10a55f9

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp310-pypy310_pp73-win_amd64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp310-pypy310_pp73-win_amd64.whl
Algorithm Hash digest
SHA256 98e9491e54238ed0199b012223cbbbf7ec9d052ab92be1f4b718c6047b4c6639
MD5 77daaeec5c4154e82c76fc9d24dc2c3f
BLAKE2b-256 d34c10529ef353f4e10201fe275603f65c91e90dba1bdf691637ac4f79ceffc5

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 0957e4f563ad915ec92b86b0ef7a3e9a4729275900de0e2d49a1a5106cc86553
MD5 86d84e6d729a3abdfbb4aa96b79724af
BLAKE2b-256 a6b703786cbf7b835a30dc1e1d84735969f517a3d47efe7ad3e5f8957581e66a

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 233f0938afeb3221a304984e8d42724f98bd8663ef149d9b6b2c88ac62e077b5
MD5 53a81e8475ddb03c33cfebba3dd48d21
BLAKE2b-256 24ebacf7474953b28ca5e4bb9b145daee2385914100d98c8c67b28c683bf9a2f

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp310-pypy310_pp73-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp310-pypy310_pp73-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 b26dee51744893bc9b62549f40fbdf62d7458652d8eb3c030ffc5dd74929a0b6
MD5 2c41f0563778fdf84e6346fc92d80457
BLAKE2b-256 856c460fd1ebe9996d56d00781c774cbf8cd84a8bc8a0e7c8ccf63edb6044337

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp310-pypy310_pp73-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 49ec4c724b05d8ad1eb55d405734a87e025147289f2d92499de0312a0a5831a3
MD5 87f0cda09a7d244d1349502c41125aea
BLAKE2b-256 0d644871e4215c8f3a1656c5a88b295a59330589b12d672577c4c6cf15251d6e

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp39-pypy39_pp73-win_amd64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp39-pypy39_pp73-win_amd64.whl
Algorithm Hash digest
SHA256 f70c8d9f372f779d89c1677ebc32ff09cfde71451a0fb448b319b2beaef61df3
MD5 4042afb238cff743995af8175be8725e
BLAKE2b-256 c1ddced718adf6558d5230c3eb10287d1c30d022b8465303948e3b5f9b290492

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 aebb0c97e2062619b7967a54913f75d44facf678de2e5b69ce5803198932f0eb
MD5 15e20336c116342827fa796a1e0eb2e5
BLAKE2b-256 bb6ae40aa34237ee47ca114aaed3ec1b9eb8dcf5038273a0d549f0d24dcde57e

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 ba289c6db73e0347978609475dbf6d655afb1153793a69e745e55f2f142402ae
MD5 51cb2a0460c7420d4dfd961ea0a06a65
BLAKE2b-256 9b6c7e23a5bf5851966aad69a8bd69d05bc2d9f656011627eb67957fc1ca2390

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp39-pypy39_pp73-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp39-pypy39_pp73-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 3bce721d9de32fec20821e3d3cd5d342324ba72e395caf3dcb6eab0b6780458c
MD5 3926c3a7307f4826f2bd28726d19b47a
BLAKE2b-256 d896bd24333ac74e013048c1b0dd07599b3cee16c8e03e9c3b578a4d220b40f6

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-pp39-pypy39_pp73-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-pp39-pypy39_pp73-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 10ca98e04f129c6410eb7620edf7cbdd46637e53844084bf5e428284ae854416
MD5 cc1961b2fbd7b03d10b1dc2b28244ee0
BLAKE2b-256 fbd136cc0822e63e75a2050e0d220124212d02493348ce3d19f24f1bc9e0ffbd

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-cp38-abi3-win_arm64.whl.

File metadata

  • Download URL: gadjid-0.1.0-cp38-abi3-win_arm64.whl
  • Upload date:
  • Size: 220.1 kB
  • Tags: CPython 3.8+, Windows ARM64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.7.0

File hashes

Hashes for gadjid-0.1.0-cp38-abi3-win_arm64.whl
Algorithm Hash digest
SHA256 1f45ff3032f7e80b04cd83bace38c4dad935ae7ae0dc52271a4e91d1b397a162
MD5 7d8c1581b18d2f30c05f52eb8c3b8942
BLAKE2b-256 a500864fef364c43cfa2690f7eb093a254bb17ceb3eb213a8642b8404e0e4dea

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-cp38-abi3-win_amd64.whl.

File metadata

  • Download URL: gadjid-0.1.0-cp38-abi3-win_amd64.whl
  • Upload date:
  • Size: 232.2 kB
  • Tags: CPython 3.8+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.7.0

File hashes

Hashes for gadjid-0.1.0-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 551e132268ffac3215c920032112ace7ed39436f369e170e43fa126d694486c8
MD5 7bc8116a41ee33dbf2f89f9c07b262c6
BLAKE2b-256 07e1f27eb262ebc05837202d40d6dc93853649d03ffcc57272d64207c7d93608

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-cp38-abi3-musllinux_1_1_x86_64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-cp38-abi3-musllinux_1_1_x86_64.whl
Algorithm Hash digest
SHA256 d9310080a90fc4aaded93d081023b5bb9080590c0df24c66439cda178d003288
MD5 639b5beb26ec71d04ee4c4caf0b101ca
BLAKE2b-256 aed2a851c5f685d6f61a445a44f5019076a3f77c0070dbd3a2c9b76e6eba2e15

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-cp38-abi3-musllinux_1_1_aarch64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-cp38-abi3-musllinux_1_1_aarch64.whl
Algorithm Hash digest
SHA256 936513302decb33831d9403d7d21f0dc1f995d739ddc1e1984960c35e1737d2e
MD5 52f27c085b395ac2dd2fb42ce2e67b61
BLAKE2b-256 a389bf47e9ac904b96fd9a2920084928de77fc134ee5fbcada281b6a6cf573e4

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 3d734b47d544a466b16f0a64370d28df771e8d86e9ab074f3aae40fb19a13689
MD5 82b86d2c93ce59f299a6d36d30f247ec
BLAKE2b-256 ccea0bfd2c3211fad72934dc70e6d899f9aa74d728d19d3f1eb2d7256e9ec74b

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-cp38-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 ffa49fe6fd891f1200d4d07ed6a6475ba485c1f28a3e78a8e59201687471dc75
MD5 204f711df2d84d3797ed4f1f0852e66c
BLAKE2b-256 81dc87938a5abe33c557d57cbd5804b35c76e42ebefcaaf3f77dee09da3fa4f7

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-cp38-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 18c70e77af471c7d885c9e721b1755065733c82602fc897fcdb981e0e92f0752
MD5 ebfa6ac0a5bcbc651349c985fb65db27
BLAKE2b-256 32daf863cefd025484f242b292d9d70d85bbdd3af027ad3681aa50c80009597b

See more details on using hashes here.

Provenance

File details

Details for the file gadjid-0.1.0-cp38-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for gadjid-0.1.0-cp38-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 fc237ad67c4bef18752700b861c3e47346b64579f132b33e4ea9504d60378ca7
MD5 1382d2d22d10529677d4b33523daeaaf
BLAKE2b-256 1618e0f14798a1fb70b1c001f7c76e38a529af2c75d5ee693a05596d281848a8

See more details on using hashes here.

Provenance

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page