
wikibase-sync


Python library to synchronise data between RDF files and Wikibase instances.

How to install

You can install the library with pip:

pip install wbsync

Or, alternatively, you can install it manually from the source code:

git clone https://github.com/weso/wikibase-sync
cd wikibase-sync
python setup.py install

Python 3.6+ is recommended.

Examples

The following code computes the differences between two RDF files and applies them to a given Wikibase instance:

from wbsync.triplestore import WikibaseAdapter
from wbsync.synchronization import GraphDiffSyncAlgorithm, OntologySynchronizer

# Placeholder values: replace them with the details of your Wikibase instance.
mediawiki_api_url = 'wikibase_api_endpoint'
sparql_endpoint_url = 'wikibase_sparql_endpoint'
username = 'wikibase_username'
password = 'wikibase_password'
adapter = WikibaseAdapter(mediawiki_api_url, sparql_endpoint_url, username, password)

algorithm = GraphDiffSyncAlgorithm()
synchronizer = OntologySynchronizer(algorithm)

source_content = "original rdf content goes here"
target_content = "final rdf content goes here"
ops = synchronizer.synchronize(source_content, target_content)
for op in ops:
    res = op.execute(adapter)
    if not res.successful:
        print(f"Error synchronizing triple: {res.message}")
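
The loop above prints each failure as it happens. As a minimal sketch, the same loop can be wrapped so that failure messages are collected and returned instead. The helper names read_rdf and run_operations are hypothetical and not part of wbsync; the sketch relies only on the execute, successful and message members used in the example above.

```python
from pathlib import Path
from typing import Any, Iterable, List


def read_rdf(path: str) -> str:
    """Read the whole RDF file at `path` as text (hypothetical helper)."""
    return Path(path).read_text(encoding="utf-8")


def run_operations(ops: Iterable[Any], adapter: Any) -> List[str]:
    """Execute every operation and return the messages of the failed ones.

    Hypothetical helper: it assumes each operation exposes execute(adapter)
    and that the result exposes `successful` and `message`, as in the
    example above.
    """
    failures = []
    for op in ops:
        res = op.execute(adapter)
        if not res.successful:
            failures.append(res.message)
    return failures
```

With the objects created earlier, this could be called as run_operations(synchronizer.synchronize(read_rdf("old.ttl"), read_rdf("new.ttl")), adapter), where old.ttl and new.ttl are placeholder file names.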

Leaving source_content empty is equivalent to adding the target contents to the Wikibase, while leaving target_content empty is equivalent to removing the source contents from the Wikibase, if present. Additional examples of synchronizing RDF files with a Wikibase instance are available in the Synchronization notebook.
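
The behaviour of empty source_content and target_content values follows from treating synchronization as a difference between two sets of triples. The sketch below illustrates that idea in plain Python. It is not the implementation of GraphDiffSyncAlgorithm, and Operation and diff_triples are hypothetical names used only for this illustration.

```python
from typing import List, NamedTuple, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class Operation(NamedTuple):
    kind: str  # "add" or "remove"
    triple: Triple


def diff_triples(source: Set[Triple], target: Set[Triple]) -> List[Operation]:
    """Return the operations that turn `source` into `target`."""
    # Triples present only in the source must be removed.
    removals = [Operation("remove", t) for t in sorted(source - target)]
    # Triples present only in the target must be added.
    additions = [Operation("add", t) for t in sorted(target - source)]
    return removals + additions


source = {("ex:Alice", "ex:knows", "ex:Bob"), ("ex:Alice", "ex:age", "30")}
target = {("ex:Alice", "ex:knows", "ex:Bob"), ("ex:Alice", "ex:age", "31")}

ops = diff_triples(source, target)            # one removal, one addition
only_adds = diff_triples(set(), target)       # empty source: everything is added
only_removes = diff_triples(source, set())    # empty target: everything is removed
```

The unchanged triple produces no operation, which is why only the modified statements need to be sent to the Wikibase.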

Executing batch operations

Operations can also be executed in batches, where all of the statements of a given entity are sent at once. This type of synchronization performs better, at the risk that a single invalid statement cancels the entire batch operation. The following code can be used to execute batch operations:

from wbsync.synchronization.operations import optimize_ops

def execute_batch_synchronization(source_content, target_content, synchronizer, adapter):
    ops = synchronizer.synchronize(source_content, target_content)
    batch_ops = optimize_ops(ops)
    for op in batch_ops:
        res = op.execute(adapter)
        if not res.successful:
            print(f"Error synchronizing triple: {res.message}")

More information about these operations and the time saved by using them is available in the Benchmarks notebook.
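
The trade-off of batching can be illustrated with a small self-contained sketch: statements are grouped by entity, each group is executed as a unit, and one invalid statement makes its whole group fail. This is only an illustration of the idea and not how optimize_ops is implemented. Statement, group_by_entity, execute_batches and the validity check are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, NamedTuple


class Statement(NamedTuple):
    subject: str
    predicate: str
    value: str


def group_by_entity(statements: List[Statement]) -> Dict[str, List[Statement]]:
    """Group statements by subject so each entity is handled in one batch."""
    batches: Dict[str, List[Statement]] = defaultdict(list)
    for st in statements:
        batches[st.subject].append(st)
    return dict(batches)


def execute_batches(
    batches: Dict[str, List[Statement]],
    is_valid: Callable[[Statement], bool],
) -> Dict[str, bool]:
    """Simulate all-or-nothing execution: a batch succeeds only if every statement is valid."""
    return {
        entity: all(is_valid(st) for st in batch)
        for entity, batch in batches.items()
    }


statements = [
    Statement("ex:Alice", "ex:age", "31"),
    Statement("ex:Alice", "ex:knows", ""),  # treated as invalid: empty value
    Statement("ex:Bob", "ex:age", "25"),
]

batches = group_by_entity(statements)  # 2 batches instead of 3 single operations
results = execute_batches(batches, is_valid=lambda st: st.value != "")
```

Here the valid ex:age statement for ex:Alice is lost together with the invalid one, while the batch for ex:Bob succeeds. Executing the operations one by one would have applied both valid statements, at the cost of more requests.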


