A tool for learning vector representations of words and entities from Wikipedia
Wikipedia2Vec
=============
[![Fury badge](https://badge.fury.io/py/wikipedia2vec.png)](http://badge.fury.io/py/wikipedia2vec)
[![CircleCI](https://circleci.com/gh/wikipedia2vec/wikipedia2vec.svg?style=svg)](https://circleci.com/gh/wikipedia2vec/wikipedia2vec)
Wikipedia2Vec is a tool for obtaining embeddings (vector representations) of words and entities from Wikipedia.
It is developed and maintained by [Studio Ousia](http://www.ousia.jp).
This tool enables you to learn embeddings that map words and entities into a unified continuous vector space.
The embeddings can be used as word embeddings, as entity embeddings, or as unified embeddings of words and entities.
They have been used in state-of-the-art models for tasks such as [entity linking](https://arxiv.org/abs/1601.01343), [named entity recognition](http://www.aclweb.org/anthology/I17-2017), [entity relatedness](https://arxiv.org/abs/1601.01343), and [question answering](https://arxiv.org/abs/1803.08652).
Documentation and pretrained embeddings are available online at [http://wikipedia2vec.github.io/](http://wikipedia2vec.github.io/).
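
To illustrate what a unified vector space makes possible, here is a minimal sketch in plain Python. It does not use the Wikipedia2Vec API; the vectors, the `ENTITY/` key prefix, and the helper names `cosine_similarity` and `most_similar` are all assumptions invented for this example. The point is only that, once words and entities share one space, the same similarity measure can rank both kinds of item against each other.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# Keys prefixed with "ENTITY/" stand for Wikipedia entities; the rest are words.
# Real embeddings are learned from Wikipedia and have far more dimensions.
TOY_EMBEDDINGS = {
    "tokyo": [0.9, 0.1, 0.0],
    "guitar": [0.0, 0.2, 0.9],
    "ENTITY/Tokyo": [0.8, 0.2, 0.1],
    "ENTITY/Jimi Hendrix": [0.1, 0.3, 0.9],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(key, embeddings, top_n=2):
    """Rank every other word or entity by cosine similarity to `key`."""
    query = embeddings[key]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != key
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # A word query can return an entity as its nearest neighbour, and vice versa.
    for name, score in most_similar("tokyo", TOY_EMBEDDINGS):
        print(f"{name}\t{score:.3f}")
```

With these toy values, the nearest neighbour of the word `tokyo` is the entity `ENTITY/Tokyo`. For loading and querying real pretrained embeddings, refer to the online documentation linked above.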
Reference
---------
If you use Wikipedia2Vec in a scientific publication, please cite the following paper:

    @InProceedings{yamada-EtAl:2016:CoNLL,
      author    = {Yamada, Ikuya and Shindo, Hiroyuki and Takeda, Hideaki and Takefuji, Yoshiyasu},
      title     = {Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation},
      booktitle = {Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning},
      month     = {August},
      year      = {2016},
      address   = {Berlin, Germany},
      pages     = {250--259},
      publisher = {Association for Computational Linguistics}
    }

License
-------
[Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0)
Source Distribution
-------------------
File: wikipedia2vec-0.2.5.tar.gz (1.1 MB)

- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.18.4 setuptools/28.8.0 requests-toolbelt/0.8.0 tqdm/4.19.8 CPython/3.6.0

File hashes:

Algorithm | Hash digest
---|---
SHA256 | e284da80089d2207fa4c8f3b589e49fb2021e6c080db808f41d50f2ba700a24e
MD5 | 22325ab6c845eca6fe76e61e3c18c6e3
BLAKE2b-256 | a84679b829e3333880e98e6d602dc7c48bda9cdef664c7a313dd0e87eea779bb
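
The SHA256 digest listed above can be used to check a downloaded copy of the source distribution. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of_file` and `matches_expected` are hypothetical, and the sketch assumes the archive has already been downloaded to a local path.

```python
import hashlib

# SHA256 digest listed above for wikipedia2vec-0.2.5.tar.gz.
EXPECTED_SHA256 = "e284da80089d2207fa4c8f3b589e49fb2021e6c080db808f41d50f2ba700a24e"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("wikipedia2vec-0.2.5.tar.gz")` would return `True` for an intact download, assuming the archive sits in the current working directory under that name.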