A package for training and evaluating multimodal knowledge graph embeddings

PyKEEN

PyKEEN (Python KnowlEdge EmbeddiNgs) is a Python package designed to train and evaluate knowledge graph embedding models (incorporating multi-modal information).

Installation • Quickstart • Datasets • Models • Support • Citation

Installation

The latest stable version of PyKEEN can be downloaded and installed from PyPI with:

$ pip install pykeen

The latest version of PyKEEN can be installed directly from the source on GitHub with:

$ pip install git+https://github.com/pykeen/pykeen.git

More information about installation (e.g., development mode, Windows installation, Colab, Kaggle, extras) can be found in the installation documentation.

Quickstart

This example shows how to train a model on a dataset and evaluate it on that dataset's test split.

The fastest way to get up and running is to use the pipeline function. It provides a high-level entry into the extensible functionality of this package. The following example shows how to train and evaluate the TransE model on the Nations dataset. By default, the training loop uses the stochastic local closed world assumption (sLCWA) training approach and evaluates with rank-based evaluation.

from pykeen.pipeline import pipeline

result = pipeline(
    model='TransE',
    dataset='nations',
)

The results are returned in an instance of the PipelineResult dataclass that has attributes for the trained model, the training loop, the evaluation, and more. See the tutorials on understanding the evaluation and making novel link predictions.

PyKEEN is extensible such that:

Each model has the same API, so anything from pykeen.models can be dropped in
Each training loop has the same API, so pykeen.training.LCWATrainingLoop can be dropped in
Triples factories can be generated by the user with pykeen.triples.TriplesFactory

The full documentation can be found at https://pykeen.readthedocs.io.

Implementation

Below are the models, datasets, training modes, evaluators, and metrics implemented in pykeen.

Datasets (26)

The citation for each dataset corresponds to either the paper describing the dataset, the first paper published using the dataset with knowledge graph embedding models, or the URL for the dataset if neither of the first two is available.

Name	Documentation	Citation	Entities	Relations	Triples
Clinical Knowledge Graph	`pykeen.datasets.CKG`	Santos et al., 2020	7617419	11	26691525
CN3l Family	`pykeen.datasets.CN3l`	Chen et al., 2017	3206	42	21777
CoDEx (large)	`pykeen.datasets.CoDExLarge`	Safavi et al., 2020	77951	69	612437
CoDEx (medium)	`pykeen.datasets.CoDExMedium`	Safavi et al., 2020	17050	51	206205
CoDEx (small)	`pykeen.datasets.CoDExSmall`	Safavi et al., 2020	2034	42	36543
ConceptNet	`pykeen.datasets.ConceptNet`	Speer et al., 2017	28370083	50	34074917
Countries	`pykeen.datasets.Countries`	Bouchard et al., 2015	271	2	1158
Commonsense Knowledge Graph	`pykeen.datasets.CSKG`	Ilievski et al., 2020	2087833	58	4598728
DB100K	`pykeen.datasets.DB100K`	Ding et al., 2018	99604	470	697479
DBpedia50	`pykeen.datasets.DBpedia50`	Shi et al., 2017	24624	351	34421
Drug Repositioning Knowledge Graph	`pykeen.datasets.DRKG`	`gnn4dr/DRKG`	97238	107	5874257
FB15k	`pykeen.datasets.FB15k`	Bordes et al., 2013	14951	1345	592213
FB15k-237	`pykeen.datasets.FB15k237`	Toutanova et al., 2015	14505	237	310079
Hetionet	`pykeen.datasets.Hetionet`	Himmelstein et al., 2017	45158	24	2250197
Kinships	`pykeen.datasets.Kinships`	Kemp et al., 2006	104	25	10686
Nations	`pykeen.datasets.Nations`	`ZhenfengLei/KGDatasets`	14	55	1992
OGB BioKG	`pykeen.datasets.OGBBioKG`	Hu et al., 2020	45085	51	5088433
OGB WikiKG	`pykeen.datasets.OGBWikiKG`	Hu et al., 2020	2500604	535	17137181
OpenBioLink	`pykeen.datasets.OpenBioLink`	Breit et al., 2020	180992	28	4563407
OpenBioLink	`pykeen.datasets.OpenBioLinkLQ`	Breit et al., 2020	480876	32	27320889
Unified Medical Language System	`pykeen.datasets.UMLS`	`ZhenfengLei/KGDatasets`	135	46	6529
WK3l-120k Family	`pykeen.datasets.WK3l120k`	Chen et al., 2017	119748	3109	1375406
WK3l-15k Family	`pykeen.datasets.WK3l15k`	Chen et al., 2017	15126	1841	209041
WordNet-18	`pykeen.datasets.WN18`	Bordes et al., 2014	40943	18	151442
WordNet-18 (RR)	`pykeen.datasets.WN18RR`	Toutanova et al., 2015	40559	11	92583
YAGO3-10	`pykeen.datasets.YAGO310`	Mahdisoltani et al., 2015	123143	37	1089000

Models (28)

Name	Reference	Citation
CompGCN	`pykeen.models.CompGCN`	Vashishth et al., 2020
ComplEx	`pykeen.models.ComplEx`	Trouillon et al., 2016
ComplExLiteral	`pykeen.models.ComplExLiteral`	Kristiadi et al., 2018
ConvE	`pykeen.models.ConvE`	Dettmers et al., 2018
ConvKB	`pykeen.models.ConvKB`	Nguyen et al., 2018
CrossE	`pykeen.models.CrossE`	Zhang et al., 2019
DistMult	`pykeen.models.DistMult`	Yang et al., 2014
DistMultLiteral	`pykeen.models.DistMultLiteral`	Kristiadi et al., 2018
ERMLP	`pykeen.models.ERMLP`	Dong et al., 2014
ERMLPE	`pykeen.models.ERMLPE`	Sharifzadeh et al., 2019
HolE	`pykeen.models.HolE`	Nickel et al., 2016
KG2E	`pykeen.models.KG2E`	He et al., 2015
MuRE	`pykeen.models.MuRE`	Balažević et al., 2019
NTN	`pykeen.models.NTN`	Socher et al., 2013
PairRE	`pykeen.models.PairRE`	Chao et al., 2020
ProjE	`pykeen.models.ProjE`	Shi et al., 2017
QuatE	`pykeen.models.QuatE`	Zhang et al., 2019
RESCAL	`pykeen.models.RESCAL`	Nickel et al., 2011
RGCN	`pykeen.models.RGCN`	Schlichtkrull et al., 2018
RotatE	`pykeen.models.RotatE`	Sun et al., 2019
SimplE	`pykeen.models.SimplE`	Kazemi et al., 2018
StructuredEmbedding	`pykeen.models.StructuredEmbedding`	Bordes et al., 2011
TransD	`pykeen.models.TransD`	Ji et al., 2015
TransE	`pykeen.models.TransE`	Bordes et al., 2013
TransH	`pykeen.models.TransH`	Wang et al., 2014
TransR	`pykeen.models.TransR`	Lin et al., 2015
TuckER	`pykeen.models.TuckER`	Balažević et al., 2019
UnstructuredModel	`pykeen.models.UnstructuredModel`	Bordes et al., 2014
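
To make concrete what an interaction model in this table computes, the sketch below scores a triple in the style of TransE, which treats a relation as a translation from head to tail and scores a triple by the negative distance between h + r and t. This is a plain-Python illustration with made-up two-dimensional embeddings, not PyKEEN's implementation; the function name transe_score and the default L2 norm are assumptions for the example.

```python
import math


def transe_score(h, r, t, p=2):
    """Return -||h + r - t||_p; scores closer to zero mean more plausible."""
    diff = [hi + ri - ti for hi, ri, ti in zip(h, r, t)]
    return -(sum(abs(d) ** p for d in diff) ** (1.0 / p))


# Made-up embeddings for illustration.
head = [1.0, 0.0]
relation = [0.0, 1.0]
good_tail = [1.0, 1.0]  # exactly head + relation
bad_tail = [4.0, 5.0]   # far from head + relation

print(transe_score(head, relation, good_tail))
print(transe_score(head, relation, bad_tail))
```

The triple whose tail sits at head + relation receives the highest possible score, while the distant tail is penalized in proportion to its distance.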

Losses (7)

Name	Reference	Description
bceaftersigmoid	`pykeen.losses.BCEAfterSigmoidLoss`	A module for the numerically unstable version of explicit Sigmoid + BCE loss.
bcewithlogits	`pykeen.losses.BCEWithLogitsLoss`	A module for the binary cross entropy loss.
crossentropy	`pykeen.losses.CrossEntropyLoss`	A module for the cross entropy loss that evaluates the cross entropy after softmax output.
marginranking	`pykeen.losses.MarginRankingLoss`	A module for the margin ranking loss.
mse	`pykeen.losses.MSELoss`	A module for the mean square error loss.
nssa	`pykeen.losses.NSSALoss`	An implementation of the self-adversarial negative sampling loss function proposed by Sun et al., 2019.
softplus	`pykeen.losses.SoftplusLoss`	A module for the softplus loss.
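
As an illustration of how a pairwise loss from this table treats model scores, the sketch below computes a margin ranking loss over positive/negative score pairs, assuming higher scores mean more plausible triples. The function name and the default margin are hypothetical, and this is a plain-Python approximation of the idea rather than the code in pykeen.losses.MarginRankingLoss.

```python
def margin_ranking_loss(pos_scores, neg_scores, margin=1.0):
    """Mean of max(0, margin + neg - pos) over paired scores."""
    pairs = list(zip(pos_scores, neg_scores))
    return sum(max(0.0, margin + n - p) for p, n in pairs) / len(pairs)


# Made-up scores: the first positive beats its negative by more than the
# margin (no penalty), the second does not (penalty of 0.5).
loss = margin_ranking_loss([2.0, 0.5], [0.0, 0.0])
print(loss)
```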

Regularizers (5)

Name	Reference	Description
combined	`pykeen.regularizers.CombinedRegularizer`	A convex combination of regularizers.
lp	`pykeen.regularizers.LpRegularizer`	A simple L_p norm based regularizer.
no	`pykeen.regularizers.NoRegularizer`	A regularizer which does not perform any regularization.
powersum	`pykeen.regularizers.PowerSumRegularizer`	A simple x^p based regularizer.
transh	`pykeen.regularizers.TransHRegularizer`	A regularizer for the soft constraints in TransH.

Optimizers (6)

Name	Reference	Description
adadelta	`torch.optim.Adadelta`	Implements Adadelta algorithm.
adagrad	`torch.optim.Adagrad`	Implements Adagrad algorithm.
adam	`torch.optim.Adam`	Implements Adam algorithm.
adamax	`torch.optim.Adamax`	Implements Adamax algorithm (a variant of Adam based on infinity norm).
adamw	`torch.optim.AdamW`	Implements AdamW algorithm.
sgd	`torch.optim.SGD`	Implements stochastic gradient descent (optionally with momentum).

Training Loops (2)

Name	Reference	Description
lcwa	`pykeen.training.LCWATrainingLoop`	A training loop that uses the local closed world assumption training approach.
slcwa	`pykeen.training.SLCWATrainingLoop`	A training loop that uses the stochastic local closed world assumption training approach.
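
The difference between the two training approaches can be sketched without PyKEEN. Under the local closed world assumption, every tail entity that was not observed for a given (head, relation) pair is treated as a negative, so each pair is trained against a label vector over all entities; the stochastic variant instead samples a few corruptions per positive triple. The following plain-Python sketch builds such LCWA label vectors. The helper name lcwa_targets and the integer-ID triples are illustrative assumptions.

```python
from collections import defaultdict


def lcwa_targets(triples, num_entities):
    """Map each (head, relation) pair to a 0/1 label vector over all tails."""
    observed = defaultdict(set)
    for h, r, t in triples:
        observed[(h, r)].add(t)
    return {
        pair: [1.0 if e in tails else 0.0 for e in range(num_entities)]
        for pair, tails in observed.items()
    }


# Illustrative triples over 4 entities and 2 relations (integer IDs).
triples = [(0, 0, 1), (0, 0, 2), (3, 1, 0)]
targets = lcwa_targets(triples, num_entities=4)
print(targets[(0, 0)])  # tails 1 and 2 are positives, the rest negatives
```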

Negative Samplers (3)

Name	Reference	Description
basic	`pykeen.sampling.BasicNegativeSampler`	A basic negative sampler.
bernoulli	`pykeen.sampling.BernoulliNegativeSampler`	An implementation of the Bernoulli negative sampling approach proposed by Wang et al., 2014.
pseudotyped	`pykeen.sampling.PseudoTypedNegativeSampler`	A sampler that accounts for which entities co-occur with a relation.
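
A basic negative sampler of the kind listed first can be sketched as uniform corruption: for each positive triple, either the head or the tail is replaced by a randomly drawn entity. The code below is a simplified plain-Python illustration with hypothetical names. It makes no attempt to filter out corruptions that happen to be known positives, and it is not the code in pykeen.sampling.

```python
import random


def corrupt(triple, num_entities, rng):
    """Replace the head or the tail of a triple with a different random entity."""
    h, r, t = triple
    corrupt_head = rng.random() < 0.5
    original = h if corrupt_head else t
    # Draw from the other num_entities - 1 entities so the triple changes.
    candidate = rng.randrange(num_entities - 1)
    if candidate >= original:
        candidate += 1
    return (candidate, r, t) if corrupt_head else (h, r, candidate)


rng = random.Random(0)  # seeded for repeatability
positive = (2, 0, 5)
negatives = [corrupt(positive, num_entities=10, rng=rng) for _ in range(4)]
print(negatives)
```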

Stoppers (2)

Name	Reference	Description
early	`pykeen.stoppers.EarlyStopper`	A harness for early stopping.
nop	`pykeen.stoppers.NopStopper`	A stopper that does nothing.
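
Early stopping in general follows a simple pattern: evaluate periodically, remember the best validation metric, and stop once it has failed to improve for a set number of evaluations. The sketch below shows that pattern in plain Python. The class name, the patience and delta parameters, and the higher-is-better convention are assumptions for illustration and do not mirror the interface of pykeen.stoppers.EarlyStopper.

```python
class PatienceStopper:
    """Stop when a higher-is-better metric has not improved for `patience` checks."""

    def __init__(self, patience=2, delta=0.0):
        self.patience = patience
        self.delta = delta
        self.best = None
        self.remaining = patience

    def should_stop(self, metric):
        # An improvement resets the patience counter.
        if self.best is None or metric > self.best + self.delta:
            self.best = metric
            self.remaining = self.patience
            return False
        self.remaining -= 1
        return self.remaining <= 0


stopper = PatienceStopper(patience=2)
history = [0.30, 0.35, 0.34, 0.33, 0.40]  # made-up validation scores
stopped_at = None
for epoch, score in enumerate(history):
    if stopper.should_stop(score):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

With these made-up scores the loop halts after two consecutive evaluations without improvement, so the later, better score is never reached; that trade-off is what the patience setting controls.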

Evaluators (2)

Name	Reference	Description
rankbased	`pykeen.evaluation.RankBasedEvaluator`	A rank-based evaluator for KGE models.
sklearn	`pykeen.evaluation.SklearnEvaluator`	An evaluator that uses a Scikit-learn metric.

Metrics (16)

Name	Description
AUC-ROC	The area under the ROC curve, on [0, 1]. Higher is better.
Adjusted Arithmetic Mean Rank (AAMR)	The mean over all chance-adjusted ranks, on (0, 2). Lower is better.
Adjusted Arithmetic Mean Rank Index (AAMRI)	The re-indexed adjusted mean rank (AAMR), on [-1, 1]. Higher is better.
Average Precision	The area under the precision-recall curve, on [0, 1]. Higher is better.
Geometric Mean Rank (GMR)	The geometric mean over all ranks, on [1, inf). Lower is better.
Harmonic Mean Rank (HMR)	The harmonic mean over all ranks, on [1, inf). Lower is better.
Hits @ K	The relative frequency of ranks not larger than a given k, on [0, 1]. Higher is better.
Inverse Arithmetic Mean Rank (IAMR)	The inverse of the arithmetic mean over all ranks, on (0, 1]. Higher is better.
Inverse Geometric Mean Rank (IGMR)	The inverse of the geometric mean over all ranks, on (0, 1]. Higher is better.
Inverse Median Rank	The inverse of the median over all ranks, on (0, 1]. Higher is better.
Mean Rank (MR)	The arithmetic mean over all ranks, on [1, inf). Lower is better.
Mean Reciprocal Rank (MRR)	The inverse of the harmonic mean over all ranks, on (0, 1]. Higher is better.
Median Rank	The median over all ranks, on [1, inf). Lower is better.
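
Most of the rank-based metrics above are simple functions of the list of ranks assigned to the true entities. The following plain-Python sketch computes several of them for a made-up list of ranks. The function name rank_metrics and the dictionary keys are illustrative and are not necessarily the metric names PyKEEN uses.

```python
import math
from statistics import median


def rank_metrics(ranks, k=3):
    """Compute a few rank-based metrics from a list of 1-based ranks."""
    n = len(ranks)
    mrr = sum(1.0 / r for r in ranks) / n
    return {
        'mean_rank': sum(ranks) / n,
        'median_rank': median(ranks),
        'geometric_mean_rank': math.exp(sum(math.log(r) for r in ranks) / n),
        'harmonic_mean_rank': 1.0 / mrr,  # MRR is its inverse
        'mean_reciprocal_rank': mrr,
        'hits_at_k': sum(r <= k for r in ranks) / n,
    }


metrics = rank_metrics([1, 2, 4], k=3)  # made-up ranks
for name, value in metrics.items():
    print(name, round(value, 4))
```

Computing the harmonic mean rank as the inverse of the mean reciprocal rank mirrors the relationship stated in the table.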

Trackers (7)

Name	Reference	Description
console	`pykeen.trackers.ConsoleResultTracker`	A class that directly prints to console.
csv	`pykeen.trackers.CSVResultTracker`	Tracking results to a CSV file.
json	`pykeen.trackers.JSONResultTracker`	Tracking results to a JSON lines file.
mlflow	`pykeen.trackers.MLFlowResultTracker`	A tracker for MLflow.
neptune	`pykeen.trackers.NeptuneResultTracker`	A tracker for Neptune.ai.
tensorboard	`pykeen.trackers.TensorBoardResultTracker`	A tracker for TensorBoard.
wandb	`pykeen.trackers.WANDBResultTracker`	A tracker for Weights and Biases.

Hyper-parameter Optimization

Samplers (3)

Name	Reference	Description
grid	`optuna.samplers.GridSampler`	Sampler using grid search.
random	`optuna.samplers.RandomSampler`	Sampler using random sampling.
tpe	`optuna.samplers.TPESampler`	Sampler using TPE (Tree-structured Parzen Estimator) algorithm.

Any sampler class extending the optuna.samplers.BaseSampler, such as their sampler implementing the CMA-ES algorithm, can also be used.

Experimentation

Reproduction

PyKEEN includes a set of curated experimental settings for reproducing past landmark experiments. They can be accessed and run like:

$ pykeen experiments reproduce tucker balazevic2019 fb15k

The three arguments are the model name, the reference, and the dataset. The output directory can be optionally set with -d.

Ablation

PyKEEN includes the ability to specify ablation studies using the hyper-parameter optimization module. They can be run like:

$ pykeen experiments ablation ~/path/to/config.json

Large-scale Reproducibility and Benchmarking Study

We used PyKEEN to perform a large-scale reproducibility and benchmarking study, which is described in our article:

@article{ali2020benchmarking,
  title={Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework},
  author={Ali, Mehdi and Berrendorf, Max and Hoyt, Charles Tapley and Vermue, Laurent and Galkin, Mikhail and Sharifzadeh, Sahand and Fischer, Asja and Tresp, Volker and Lehmann, Jens},
  journal={arXiv preprint arXiv:2006.13365},
  year={2020}
}

We have made all code, experimental configurations, results, and analyses that led to our interpretations available at https://github.com/pykeen/benchmarking.

Contributing

Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.

Acknowledgements

Supporters

This project has been supported by several organizations (in alphabetical order):

Funding

The development of PyKEEN has been funded by the following grants:

Funding Body	Program	Grant
DARPA	Automating Scientific Knowledge Extraction (ASKE)	HR00111990009
German Federal Ministry of Education and Research (BMBF)	Maschinelles Lernen mit Wissensgraphen (MLWin)	01IS18050D
German Federal Ministry of Education and Research (BMBF)	Munich Center for Machine Learning (MCML)	01IS18036A
Innovation Fund Denmark (Innovationsfonden)	Danish Center for Big Data Analytics driven Innovation (DABAI)	Grand Solutions

Logo

The PyKEEN logo was designed by Carina Steinborn.

Citation

If you have found PyKEEN useful in your work, please consider citing our article:

@article{ali2021pykeen,
    author = {Ali, Mehdi and Berrendorf, Max and Hoyt, Charles Tapley and Vermue, Laurent and Sharifzadeh, Sahand and Tresp, Volker and Lehmann, Jens},
    journal = {Journal of Machine Learning Research},
    number = {82},
    pages = {1--6},
    title = {{PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings}},
    url = {http://jmlr.org/papers/v22/20-825.html},
    volume = {22},
    year = {2021}
}
