Skip to main content

Treasure Data API library for Python

Project description

Build Status Build status Coverage Status PyPI version

Treasure Data API library for Python

Requirements

td-client supports the following versions of Python.

  • Python 3.5+

  • PyPy

Install

You can install the releases from PyPI.

$ pip install td-client

It’d be better to install certifi to enable SSL certificate verification.

$ pip install certifi

Examples

Please see also the examples at Treasure Data Documentation.

If you want to find API reference, see also API document.

Listing jobs

Treasure Data API key will be read from environment variable TD_API_KEY, if none is given via apikey= argument passed to tdclient.Client.

Treasure Data API endpoint https://api.treasuredata.com is used by default. You can override this with environment variable TD_API_SERVER, which in turn can be overridden via endpoint= argument passed to tdclient.Client. List of available Treasure Data sites and corresponding API endpoints can be found here.

import tdclient

with tdclient.Client() as td:
    for job in td.jobs():
        print(job.job_id)

Running jobs

Running jobs on Treasure Data.

import tdclient

with tdclient.Client() as td:
    job = td.query("sample_datasets", "SELECT COUNT(1) FROM www_access", type="hive")
    job.wait()
    for row in job.result():
        print(repr(row))

Running jobs via DBAPI2

td-client-python implements PEP 0249 Python Database API v2.0. You can use td-client-python with external libraries which supports Database API such like pandas.

import pandas
import tdclient

def on_waiting(cursor):
    print(cursor.job_status())

with tdclient.connect(db="sample_datasets", type="presto", wait_callback=on_waiting) as td:
    data = pandas.read_sql("SELECT symbol, COUNT(1) AS c FROM nasdaq GROUP BY symbol", td)
    print(repr(data))

We offer another package for pandas named pytd with some advanced features. You may prefer it if you need to do complicated things, such like exporting result data to Treasure Data, printing job’s progress during long execution, etc.

Importing data

Importing data into Treasure Data in streaming manner, as similar as fluentd is doing.

import sys
import tdclient

with tdclient.Client() as td:
    for file_name in sys.argv[:1]:
        td.import_file("mydb", "mytbl", "csv", file_name)

Bulk import

Importing data into Treasure Data in batch manner.

import sys
import tdclient
import time
import warnings

if len(sys.argv) <= 1:
    sys.exit(0)

with tdclient.Client() as td:
    session_name = "session-%d" % (int(time.time()),)
    bulk_import = td.create_bulk_import(session_name, "mydb", "mytbl")
    try:
        for file_name in sys.argv[1:]:
            part_name = "part-%s" % (file_name,)
            bulk_import.upload_file(part_name, "json", file_name)
        bulk_import.freeze()
    except:
        bulk_import.delete()
        raise
    bulk_import.perform(wait=True)
    if 0 < bulk_import.error_records:
        warnings.warn("detected %d error records." % (bulk_import.error_records,))
    if 0 < bulk_import.valid_records:
        print("imported %d records." % (bulk_import.valid_records,))
    else:
        raise(RuntimeError("no records have been imported: %s" % (repr(bulk_import.name),)))
    bulk_import.commit(wait=True)
    bulk_import.delete()

Development

Running tests

Run tests.

$ python setup.py test

Running tests (tox)

You can run tests against all supported Python versions. I’d recommend you to install pyenv to manage Pythons.

$ pyenv shell system
$ for version in $(cat .python-version); do [ -d "$(pyenv root)/versions/${version}" ] || pyenv install "${version}"; done
$ pyenv shell --unset

Install tox.

$ pip install tox

Then, run tox.

$ tox

Release

Release to PyPI. Ensure you installed twine.

$ python setup.py bdist_wheel sdist
$ twine upload dist/*

License

Apache Software License, Version 2.0

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

td-client-1.0.1.tar.gz (55.8 kB view details)

Uploaded Source

Built Distribution

td_client-1.0.1-py3-none-any.whl (79.8 kB view details)

Uploaded Python 3

File details

Details for the file td-client-1.0.1.tar.gz.

File metadata

  • Download URL: td-client-1.0.1.tar.gz
  • Upload date:
  • Size: 55.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.6

File hashes

Hashes for td-client-1.0.1.tar.gz
Algorithm Hash digest
SHA256 cc3e037b928c44c46b17c5177edd3ba29f63a752b5114afd7662c406148b6e70
MD5 acc73895cd69d1eb0b86a8bc254d63e7
BLAKE2b-256 9da00d73badc0331fb31e37c33a76f83d14d9eed293390b9c327cf09a48a3e3c

See more details on using hashes here.

File details

Details for the file td_client-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: td_client-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 79.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/39.0.1 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.6

File hashes

Hashes for td_client-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1749382b8275fa87fefcad7d73b9c986abc8fc2fbda2e8c07abfb2836a5b42ff
MD5 0b5a48216e3faba31b12380abacf1439
BLAKE2b-256 3cabc4b6c29e9b746d1223712aaa756f28257745a3e638f2b15978951fefd282

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page