A tool for learning vector representations of words and entities from Wikipedia
Project description
Wikipedia2Vec is a tool used for obtaining embeddings (vector representations) of words and entities from Wikipedia. It is developed and maintained by Studio Ousia.
This tool enables you to learn embeddings that map words and entities into a unified continuous vector space. The embeddings can be used as word embeddings, as entity embeddings, or as unified embeddings of words and entities. They have been used in state-of-the-art models for various tasks such as entity linking, named entity recognition, entity relatedness, and question answering.
Documentation and pretrained embeddings are available online at http://wikipedia2vec.github.io/.
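The idea of a unified vector space can be pictured with a small, self-contained sketch. It does not use the Wikipedia2Vec API: the vectors are invented toy values, and the `ENTITY/` key prefix, `EMBEDDINGS`, `cosine`, and `most_similar` are hypothetical names chosen for illustration only. It shows why placing words and entities in one space is useful: a single similarity function can compare a word with an entity directly.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# Real embeddings are learned from a Wikipedia dump and have many more dimensions.
# The "ENTITY/" prefix is a naming convention made up for this sketch.
EMBEDDINGS = {
    "tokyo": [0.9, 0.1, 0.0],         # word
    "paris": [0.1, 0.9, 0.0],         # word
    "ENTITY/Tokyo": [0.8, 0.2, 0.1],  # entity (a Wikipedia page)
    "ENTITY/Paris": [0.2, 0.8, 0.1],  # entity (a Wikipedia page)
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(key, k=3):
    """Rank every other word or entity by similarity to `key`."""
    query = EMBEDDINGS[key]
    scored = [
        (other, cosine(query, vec))
        for other, vec in EMBEDDINGS.items()
        if other != key
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Words and entities are ranked together because they share one space.
    for name, score in most_similar("tokyo"):
        print(f"{name}\t{score:.3f}")
```

With these toy values, the nearest neighbor of the word `tokyo` is the entity `ENTITY/Tokyo`, which is the kind of word-to-entity lookup that tasks such as entity linking rely on.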
Reference
If you use Wikipedia2Vec in a scientific publication, please cite the following paper:
@InProceedings{yamada-EtAl:2016:CoNLL,
  author    = {Yamada, Ikuya and Shindo, Hiroyuki and Takeda, Hideaki and Takefuji, Yoshiyasu},
  title     = {Joint Learning of the Embedding of Words and Entities for Named Entity Disambiguation},
  booktitle = {Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning},
  month     = {August},
  year      = {2016},
  address   = {Berlin, Germany},
  pages     = {250--259},
  publisher = {Association for Computational Linguistics}
}
License
Download files
Source Distribution
File details
Details for the file wikipedia2vec-0.2.4_3.tar.gz.
File metadata
- Download URL: wikipedia2vec-0.2.4_3.tar.gz
- Upload date:
- Size: 1.1 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 07a514d7487686df27ccf718a35dd9ebd2dc214f32231ff0f0fba712ed290680 |
MD5 | 8b5d609626507d3ccd488886867c0f64 |
BLAKE2b-256 | 55b97cd9045410391f82b0e8529ff9fa3a638f1511bfe57b2443498bf148ed67 |
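The hashes above can be used to check that a downloaded archive is intact. The following is a minimal sketch using only the Python standard library; the function names `sha256_of_file` and `verify_download` are hypothetical, and the local path passed in is assumed to be wherever you saved the source distribution.

```python
import hashlib

# SHA256 digest copied from the file hashes table for wikipedia2vec-0.2.4_3.tar.gz.
EXPECTED_SHA256 = "07a514d7487686df27ccf718a35dd9ebd2dc214f32231ff0f0fba712ed290680"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` matches the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `verify_download("wikipedia2vec-0.2.4_3.tar.gz")` would return `True` only if the file in the current directory matches the published SHA256 digest.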