Fuzzy matching utilities for scholarly metadata
Project description
fcfuzzy
Fuzzy matching publications for fatcat.
Motivation
Most of the results on sites like Google Scholar group publications into clusters. Each cluster represents one publication, abstracted from its concrete representation as a link to a PDF.
We call the abstract publication a work and each concrete instance a release. The goal is to group releases under works and to implement a versions feature.
This repository contains both generic matching code and fatcat-specific code that uses the fatcat OpenAPI client.
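To make the work/release distinction concrete, here is a minimal sketch of one possible grouping strategy: releases whose titles normalize to the same key are placed under the same candidate work. This is an illustration only, not the method this package implements. The function names `title_key` and `group_releases`, the sample records, and the use of `ident` and `title` as dictionary keys are assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def title_key(title):
    """Reduce a title to a lowercase, accent-free, alphanumeric-only key."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]", "", no_marks.lower())


def group_releases(releases):
    """Group release identifiers into candidate works by normalized title."""
    clusters = defaultdict(list)
    for release in releases:
        clusters[title_key(release["title"])].append(release["ident"])
    return dict(clusters)


# Hypothetical sample records; real release metadata carries many more fields.
releases = [
    {"ident": "r1", "title": "Fuzzy Matching of Scholarly Metadata"},
    {"ident": "r2", "title": "Fuzzy matching of scholarly metadata."},
    {"ident": "r3", "title": "An Unrelated Paper"},
]
works = group_releases(releases)
```

Exact-key grouping like this only catches trivial variations (case, punctuation, accents). Truly fuzzy matching would need a similarity measure on top, and usually additional signals such as authors or year, to avoid merging distinct publications that share a title.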
Dataset
Release metadata from: https://archive.org/details/fatcat_bulk_exports_2020-08-05.
Download files
Source Distribution
Built Distribution
File details
Details for the file fuzzycat-0.1.0.tar.gz.
File metadata
- Download URL: fuzzycat-0.1.0.tar.gz
- Upload date:
- Size: 1.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 95e01c48b23825a994d4e7c5fa4037bf95976b324eb2f513817e05280030b27c
MD5 | 7270c62f5fe3a8e89dc97ce55bf4beb2
BLAKE2b-256 | 1aa4ef0fabebd1fcbb7b7e4a5cd213bce7e9976d2689c5b48db46bbd15e7a622
File details
Details for the file fuzzycat-0.1.0-py3-none-any.whl.
File metadata
- Download URL: fuzzycat-0.1.0-py3-none-any.whl
- Upload date:
- Size: 3.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/49.2.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | ad503d9d7a5eb9dffdce8848ede06549528f376bbaa54fa6c78d8b10abc61e3a
MD5 | 2d7718293622ffa4dc78fd21c4904f41
BLAKE2b-256 | 03f5d8aaa30a3625b764219e86c09a22106435190d2aa8c12c10fc255a8f0cc4
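The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of` and `matches_published` are made up for this example, and the local file path is whatever location you downloaded the file to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For instance, assuming the source distribution was saved in the current directory, `matches_published("fuzzycat-0.1.0.tar.gz", "95e01c48b23825a994d4e7c5fa4037bf95976b324eb2f513817e05280030b27c")` should return `True` for an unmodified download.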