wikibase-sync

Python library to synchronise data between RDF files and Wikibase instances.

How to install

You can install the library with pip:

pip install wbsync

Alternatively, you can install it manually from the source code:

git clone https://github.com/weso/wikibase-sync
cd wikibase-sync
python setup.py install

Python 3.6+ is recommended.

Examples

The following code synchronizes the differences between two RDF files (the original and the modified content) to a given Wikibase instance:

from wbsync.triplestore import WikibaseAdapter
from wbsync.synchronization import GraphDiffSyncAlgorithm, OntologySynchronizer

# Connection details of the target Wikibase instance (placeholders)
mediawiki_api_url = 'wikibase_api_endpoint'
sparql_endpoint_url = 'wikibase_sparql_endpoint'
username = 'wikibase_username'
password = 'wikibase_password'
adapter = WikibaseAdapter(mediawiki_api_url, sparql_endpoint_url, username, password)

algorithm = GraphDiffSyncAlgorithm()
synchronizer = OntologySynchronizer(algorithm)

source_content = "original rdf content goes here"
target_content = "final rdf content goes here"

# Compute the operations needed to go from the source to the target content
ops = synchronizer.synchronize(source_content, target_content)
for op in ops:
    # Apply each operation to the Wikibase and report any failure
    res = op.execute(adapter)
    if not res.successful:
        print(f"Error synchronizing triple: {res.message}")

Leaving source_content empty is equivalent to adding the target contents to the Wikibase, while leaving target_content empty is equivalent to removing the source contents from the Wikibase, if present. Additional examples of synchronizing RDF files with a Wikibase instance can be found in the Synchronization notebook.
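The add and remove behaviour described above follows naturally if synchronization is viewed as a difference between two sets of triples. The following is a minimal sketch in plain Python, not wbsync's actual implementation: `diff_triples` and the tuple representation of a triple are hypothetical names used only for illustration.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation of a triple: (subject, predicate, object)
Triple = Tuple[str, str, str]


def diff_triples(source: Iterable[Triple],
                 target: Iterable[Triple]) -> Tuple[List[Triple], List[Triple]]:
    """Return (to_add, to_remove) needed to turn source into target."""
    source_set: Set[Triple] = set(source)
    target_set: Set[Triple] = set(target)
    to_add = sorted(target_set - source_set)      # only in the target
    to_remove = sorted(source_set - target_set)   # only in the source
    return to_add, to_remove


old = [("ex:Alice", "ex:knows", "ex:Bob")]
new = [("ex:Alice", "ex:knows", "ex:Carol")]

# Empty source: every target triple is an addition.
print(diff_triples([], new))
# Empty target: every source triple is a removal.
print(diff_triples(old, []))
# Both present: only the differing triples produce operations.
print(diff_triples(old, new))
```

Triples present in both inputs produce no operation at all, which is why synchronizing identical contents is a no-op in this sketch.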

Executing batch operations

Operations can also be executed in batches, so that all of the statements of a given entity are executed at once. Batch synchronization performs better, at the risk that a single invalid statement cancels the entire batch operation. The following code can be used to execute batch operations:

from wbsync.synchronization.operations import optimize_ops

def execute_batch_synchronization(source_content, target_content, synchronizer, adapter):
    """Synchronize source_content to target_content using batch operations."""
    ops = synchronizer.synchronize(source_content, target_content)
    # Merge the individual operations into batch operations
    batch_ops = optimize_ops(ops)
    for op in batch_ops:
        res = op.execute(adapter)
        if not res.successful:
            print(f"Error synchronizing triple: {res.message}")

More information about these operations and the time they save can be found in the Benchmarks notebook.
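To illustrate the trade-off, the sketch below groups single-triple operations by subject so that each entity ends up with one batch. It is a plain-Python illustration under assumed names (`Op` and `group_by_subject` are hypothetical), and it does not claim to reflect how `optimize_ops` is implemented.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Op(NamedTuple):
    """Hypothetical stand-in for a single-triple operation."""
    action: str      # "add" or "remove"
    subject: str
    predicate: str
    obj: str


def group_by_subject(ops: List[Op]) -> Dict[str, List[Op]]:
    """Group operations by subject, preserving their original order."""
    batches: Dict[str, List[Op]] = defaultdict(list)
    for op in ops:
        batches[op.subject].append(op)
    return dict(batches)


ops = [
    Op("add", "ex:Alice", "ex:knows", "ex:Bob"),
    Op("add", "ex:Bob", "ex:age", "42"),
    Op("remove", "ex:Alice", "ex:age", "30"),
]

batches = group_by_subject(ops)
# Three single-triple operations become two batches, one per entity.
for subject, batch in batches.items():
    print(subject, len(batch))
```

Under this grouping, fewer executions are needed than with one operation per triple, but a single invalid statement would affect every operation that shares its batch.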
