Skip to main content

A library of high-performant graph algorithms.

Project description

graph_mate

Python binding for the set of graph crates.

graph_mate is a Python API that provides a collection of high-performant graph algorithms. It provides implementations for directed and undirected graphs.

Graphs can be created programatically or read from the Graph500 input format.

The library is implemented in Rust and uses rayon for running graph creation and algorithm execution in parallel without holding on to the Global Interpreter Lock or using multiprocessing.

The graph itself is implemented as a Compressed-Sparse-Row (CSR) data structure which is tailored for fast and concurrent access to the graph topology.

graph_mate provides a few graph algorithms. The algorithm implementations are designed to run efficiently on large-scale graphs with billions of nodes and edges.

Note: The development is mainly driven by Neo4j developers. However, the library is not an official product of Neo4j.

Usage

Installation

graph_mate is available on PyPI:

pip install graph-mate

It is currently build for x86_64 on Windows/Mac/Linux and as universal for Apple Silicon Macs.

If you need to use graph_mate for a different architecture or system, please refer to the manual installation.

Usage

What is a graph?

A graph consists of nodes and edges where edges connect exactly two nodes. A graph can be either directed, i.e., an edge has a source and a target node or undirected where there is no such distinction.

In a directed graph, each node u has outgoing and incoming neighbors. An outgoing neighbor of node u is any node v for which an edge (u, v) exists. An incoming neighbor of node u is any node v for which an edge (v, u) exists.

In an undirected graph there is no distinction between source and target node. A neighbor of node u is any node v for which either an edge (u,v) or (v, u) exists.

How to create graphs

Currently there are two ways to build a graph. You can either load graphs in the Graph500 format, for example by downloading them from the LDBC Graphalytics site. Alternatively, you can provide a numpy edge list using from_numpy.

import graph_mate as gm
import numpy as np

# Let's load a small graph:
#    (a)-->(b)-->(c)-->(d), (a)-->(c), (b)-->(d)
# To load from an edge list, we need to create a 2d numpy array of `uint32`s.
edge_list = np.array([
    # (a)-->(b)
    [0, 1],
    # (a)-->(c)
    [0, 2],
    # (b)-->(c)
    [1, 2],
    # (b)-->(d)
    [1, 3],
    # (c)-->(d)
    [2, 3]
], dtype=np.uint32)

To build a directed graph, you would create a graph_mate.DiGraph and an undirected on with graph_mate.Graph.

# We can load a directed graph from the edge list
directed = gm.DiGraph.from_numpy(edge_list)

# Or we can load an undirected graph
undirected = gm.Graph.from_numpy(edge_list)

To make assertions easier, we can create graphs with a sorted adjacency list by providing an optional second argument of type graph_mate.Layout.

directed = gm.DiGraph.from_numpy(edge_list, gm.Layout.Sorted)

undirected = gm.Graph.from_numpy(edge_list, gm.Layout.Sorted)

When loading from a numpy edge list, the data is not shared but copied into the graph. The numpy arrays can be deleted afterwards.

We can inspect the graph with a few methods.

assert directed.node_count() == 4
assert directed.edge_count() == 5

assert directed.out_degree(1) == 2
assert directed.in_degree(1) == 1

assert np.array_equal(directed.out_neighbors(1), [2, 3])
assert np.array_equal(directed.in_neighbors(1), [0])

Neighbors are returned as a numpy array view into the graph, which means we cannot modify the array.

neighbors = directed.out_neighbors(1)
try:
    neighbors[0] = 42
except ValueError as e:
    assert str(e) == "assignment destination is read-only"

In order to use the neighbors as a Python list and not a numpy array, we can use copy_ methods.

neighbors = directed.copy_out_neighbors(1)

assert neighbors == [2, 3]
assert type(neighbors) == list

For undirected graphs, we don't have directional methods for the degree or the neighbors.

assert undirected.node_count() == 4
assert undirected.edge_count() == 5

assert undirected.degree(1) == 3

assert np.array_equal(undirected.neighbors(1), [0, 2, 3])

How to run algorithms

In the following we will demonstrate running Page Rank, a graph algorithm to determine the importance of nodes in a graph based on the number and quality of their incoming edges. Page Rank requires a directed graph and returns the rank value for each node.

# https://en.wikipedia.org/wiki/PageRank#/media/File:PageRanks-Example.svg

graph = gm.DiGraph.from_numpy(np.array([
    (1,2), # B->C
    (2,1), # C->B
    (4,0), # D->A
    (4,1), # D->B
    (5,4), # E->D
    (5,1), # E->B
    (5,6), # E->F
    (6,1), # F->B
    (6,5), # F->E
    (7,1), # G->B
    (7,5), # F->E
    (8,1), # G->B
    (8,5), # G->E
    (9,1), # H->B
    (9,5), # H->E
    (10,1), # I->B
    (10,5), # I->E
    (11,5), # J->B
    (12,5), # K->B
], dtype=np.uint32))

pr_result = graph.page_rank(max_iterations=10, tolerance=1e-4, damping_factor=0.85)

assert pr_result.ran_iterations == 10

expected = np.array([
    0.024064068,
    0.3145448,
    0.27890152,
    0.01153846,
    0.029471997,
    0.06329483,
    0.029471997,
    0.01153846,
    0.01153846,
    0.01153846,
    0.01153846,
    0.01153846,
    0.01153846,
], dtype=np.float32)

assert np.array_equal(pr_result.scores(), expected)

Example Notebooks

For more examples and demos, please refer to the notebooks in the notebooks directory.

Developing

The Python extension is written using PyO3 together with maturin.

One-time setup

# Run the command from the extension directory, not the git root
# cd crates/mate

python -m venv .env
source .env/bin/activate
pip install -r requirements/dev.txt

Once-per-new-terminal setup

Make sure that you're activating the venv in every new terminal where you want to develop.

source .env/bin/activate

Building the extension

Build in debug mode.

maturin develop

Build in release mode.

maturin develop --release

Rebuild the extension in release mode 2 seconds after the last file change. This is an optional step.

cargo watch --shell 'maturin develop --release' --delay 2

Testing

Running the tests

pytest tests

Formatting and linting

# Runs code formatter https://pypi-hypernode.com/project/black/
black tests

# Sort imports using https://pypi-hypernode.com/project/isort/
isort tests

# Verify with https://pypi-hypernode.com/project/flake8/
flake8 tests

# Very types using http://mypy-lang.org
mypy .

License: MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

graph_mate-0.1.0.tar.gz (82.7 kB view details)

Uploaded Source

Built Distributions

graph_mate-0.1.0-cp38-abi3-win_amd64.whl (329.1 kB view details)

Uploaded CPython 3.8+ Windows x86-64

graph_mate-0.1.0-cp38-abi3-win32.whl (308.1 kB view details)

Uploaded CPython 3.8+ Windows x86

graph_mate-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.17+ x86-64

graph_mate-0.1.0-cp38-abi3-manylinux_2_12_i686.manylinux2010_i686.whl (1.4 MB view details)

Uploaded CPython 3.8+ manylinux: glibc 2.12+ i686

graph_mate-0.1.0-cp38-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl (941.8 kB view details)

Uploaded CPython 3.8+ macOS 10.9+ universal2 (ARM64, x86-64) macOS 10.9+ x86-64 macOS 11.0+ ARM64

graph_mate-0.1.0-cp38-abi3-macosx_10_7_x86_64.whl (489.4 kB view details)

Uploaded CPython 3.8+ macOS 10.7+ x86-64

File details

Details for the file graph_mate-0.1.0.tar.gz.

File metadata

  • Download URL: graph_mate-0.1.0.tar.gz
  • Upload date:
  • Size: 82.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.10

File hashes

Hashes for graph_mate-0.1.0.tar.gz
Algorithm Hash digest
SHA256 6fdb907fef3d33568fd90790e140cb401a94f65ee72d60e45d29f1f9e96b39ec
MD5 a1be06112666b66b58ae454c46543f5b
BLAKE2b-256 aaf3cc5fe98ed322cc41151869af6ba11057080f9c623be36dfedd833e297332

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.0-cp38-abi3-win_amd64.whl.

File metadata

  • Download URL: graph_mate-0.1.0-cp38-abi3-win_amd64.whl
  • Upload date:
  • Size: 329.1 kB
  • Tags: CPython 3.8+, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.10

File hashes

Hashes for graph_mate-0.1.0-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 9796d011de8dac474be38f54e29d51ebd95d04fb8e645f6539e08fa08895a9a6
MD5 04221e183047297fbe6f51d9bd714a24
BLAKE2b-256 219bd52057c2191a79438061cf0e7454919e28b910db3183b69d618ece451b0d

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.0-cp38-abi3-win32.whl.

File metadata

  • Download URL: graph_mate-0.1.0-cp38-abi3-win32.whl
  • Upload date:
  • Size: 308.1 kB
  • Tags: CPython 3.8+, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.10

File hashes

Hashes for graph_mate-0.1.0-cp38-abi3-win32.whl
Algorithm Hash digest
SHA256 bfc3ebfdea2cd014540b27d119244add4cd6764a9697248100afed73720f271a
MD5 7c1f28544a8091422053100a9c99d2a9
BLAKE2b-256 4cdfe559c4bf24ddcdb0d1148747ff510e04b5c08bdff3fdaf3b72b277b45cea

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for graph_mate-0.1.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9f3a197ce13715d4229041b9c61ba31e8a1009690927771d1b3910d02f179f84
MD5 c00c047c7de32d187c5960a126cdf682
BLAKE2b-256 1924e3373f4317822b5cc086ea25b37e9748b29cec55e65335d694334f5dedd0

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.0-cp38-abi3-manylinux_2_12_i686.manylinux2010_i686.whl.

File metadata

File hashes

Hashes for graph_mate-0.1.0-cp38-abi3-manylinux_2_12_i686.manylinux2010_i686.whl
Algorithm Hash digest
SHA256 b8b2993d3001d7d9468d2c631527599c3860b8591b45b8fd41d834bcdf8ab3a6
MD5 777cd0c6d76c4a84daf8841dc2e68a7c
BLAKE2b-256 c04ae90948b37f6c35a4b14c01850a013a3cfe0a9619d3d9fbeeb2713839357e

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.0-cp38-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl.

File metadata

File hashes

Hashes for graph_mate-0.1.0-cp38-abi3-macosx_10_9_x86_64.macosx_11_0_arm64.macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 2cb1f401d61543815f91cb6bdb8a3375e9c57bd3bf70fda7901098bf12331bb7
MD5 6ec796fa1cb6a160dc14d3d4c57d2377
BLAKE2b-256 7acf1a8d88bfbe77e37ad4d546a86dcf2126cccffd565cf6215f07fda21333ce

See more details on using hashes here.

File details

Details for the file graph_mate-0.1.0-cp38-abi3-macosx_10_7_x86_64.whl.

File metadata

File hashes

Hashes for graph_mate-0.1.0-cp38-abi3-macosx_10_7_x86_64.whl
Algorithm Hash digest
SHA256 872fb99566af5aedc639efa234eefdd7436ae2c92285835b21bbfcee9d81107e
MD5 fd7fd8a5e816cd6537bd298b135d7fab
BLAKE2b-256 58b1f543cc8faa4ca8a62d91133ad2cf2cdd8329dac3a796f9b5673581613e62

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page