Treasure Data API library for Python
Requirements
td-client supports the following versions of Python.
Python 3.5+
PyPy
Install
You can install the releases from PyPI.
$ pip install td-client
Installing certifi as well is recommended, since it enables SSL certificate verification.
$ pip install certifi
Examples
See also the examples in the Treasure Data Documentation.
For the API reference, see the API document.
Listing jobs
The Treasure Data API key is read from the environment variable TD_API_KEY unless one is passed to tdclient.Client via the apikey= argument.
The Treasure Data API endpoint https://api.treasuredata.com is used by default. You can override it with the environment variable TD_API_SERVER, which in turn can be overridden via the endpoint= argument passed to tdclient.Client. The list of available Treasure Data sites and their corresponding API endpoints is published in the Treasure Data documentation.
import tdclient

with tdclient.Client() as td:
    for job in td.jobs():
        print(job.job_id)
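The lookup order described above for the API key and endpoint can be sketched in plain Python. This is only an illustration of the documented precedence, not td-client's actual implementation; the function name `resolve_settings` and its `environ` parameter are hypothetical.

```python
import os

DEFAULT_ENDPOINT = "https://api.treasuredata.com"


def resolve_settings(apikey=None, endpoint=None, environ=None):
    """Return (apikey, endpoint) following the documented lookup order.

    Hypothetical helper for illustration; td-client performs its own lookup.
    """
    env = os.environ if environ is None else environ
    # An explicit apikey= argument wins; otherwise fall back to TD_API_KEY.
    key = apikey if apikey is not None else env.get("TD_API_KEY")
    # endpoint= argument > TD_API_SERVER > built-in default.
    url = endpoint or env.get("TD_API_SERVER") or DEFAULT_ENDPOINT
    return key, url


if __name__ == "__main__":
    print(resolve_settings(environ={"TD_API_KEY": "dummy-key"}))
```

In real code you would not need such a helper: pass apikey= and endpoint= directly to tdclient.Client when you want to override the environment.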
Running jobs
Run a query job on Treasure Data, wait for it to finish, and print its result rows.
import tdclient

with tdclient.Client() as td:
    job = td.query("sample_datasets", "SELECT COUNT(1) FROM www_access", type="hive")
    job.wait()
    for row in job.result():
        print(repr(row))
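If you want the rows as a list rather than printing them, the wait-then-read pattern above can be wrapped in a small helper. The sketch below is an assumption-laden illustration: `collect_rows` and `FakeJob` are hypothetical names, and it assumes the job object exposes a `success()` method in addition to the `wait()`, `result()`, and `job_id` members used above. `FakeJob` exists only so the helper can be exercised without credentials.

```python
def collect_rows(job):
    """Wait for a job-like object and return its result rows as a list."""
    job.wait()  # block until the job has finished
    # Assumption: the job object provides success(); check your version.
    if not job.success():
        raise RuntimeError("job %s did not succeed" % (job.job_id,))
    return list(job.result())


class FakeJob:
    """Hypothetical offline stand-in that mimics the calls used above."""

    job_id = "0"

    def __init__(self, rows, ok=True):
        self._rows = rows
        self._ok = ok

    def wait(self):
        pass  # a real job would poll the API here

    def success(self):
        return self._ok

    def result(self):
        return iter(self._rows)


if __name__ == "__main__":
    print(collect_rows(FakeJob([[5000]])))
```

With a real client, the object returned by td.query(...) would be passed to `collect_rows` in place of `FakeJob`.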
Running jobs via DBAPI2
td-client-python implements the Python Database API v2.0 (PEP 249), so you can use it with external libraries that support the Database API, such as pandas.
import pandas
import tdclient

def on_waiting(cursor):
    print(cursor.job_status())

with tdclient.connect(db="sample_datasets", type="presto", wait_callback=on_waiting) as td:
    data = pandas.read_sql("SELECT symbol, COUNT(1) AS c FROM nasdaq GROUP BY symbol", td)
    print(repr(data))
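Because the interface is PEP 249, the same cursor-based code should work without pandas as well. The following is a minimal sketch of that pattern; it uses the standard library's sqlite3 module purely as a stand-in connection so that it runs without Treasure Data credentials. The helper name `fetch_all` and the in-memory table contents are made up for illustration.

```python
import sqlite3


def fetch_all(connection, sql):
    """Run sql on any PEP 249 connection; return (column_names, rows)."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql)
        # cursor.description has one entry per column; item 0 is the name.
        columns = [col[0] for col in cursor.description]
        return columns, cursor.fetchall()
    finally:
        cursor.close()


# sqlite3 stands in for a Treasure Data connection in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nasdaq (symbol TEXT)")
conn.executemany("INSERT INTO nasdaq VALUES (?)", [("AAPL",), ("AAPL",), ("MSFT",)])
columns, rows = fetch_all(
    conn,
    "SELECT symbol, COUNT(1) AS c FROM nasdaq GROUP BY symbol ORDER BY symbol",
)
print(columns, rows)
conn.close()
```

To run the same helper against Treasure Data, you would pass the connection returned by tdclient.connect(...) instead of the sqlite3 connection.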
We also offer pytd, a separate package for pandas with more advanced features. You may prefer it for more complex tasks, such as exporting result data to Treasure Data or printing a job's progress during a long execution.
Importing data
Import data into Treasure Data in a streaming manner, similar to what fluentd does.
import sys
import tdclient

with tdclient.Client() as td:
    # sys.argv[0] is the script name, so the data files start at index 1.
    for file_name in sys.argv[1:]:
        td.import_file("mydb", "mytbl", "csv", file_name)
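When file names come from the command line, it can help to filter them before uploading anything. The sketch below is one possible way to do that; `data_files` is a hypothetical helper, not part of td-client.

```python
import os


def data_files(argv):
    """Return the command-line arguments that name existing files."""
    # argv[0] is the script itself, so candidate files start at index 1.
    return [path for path in argv[1:] if os.path.isfile(path)]


if __name__ == "__main__":
    import sys

    for path in data_files(sys.argv):
        print("would import", path)
```

Each path returned by `data_files(sys.argv)` could then be handed to td.import_file as in the example above.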
Bulk import
Import data into Treasure Data in batches.
import sys
import tdclient
import time
import warnings

if len(sys.argv) <= 1:
    sys.exit(0)

with tdclient.Client() as td:
    session_name = "session-%d" % (int(time.time()),)
    bulk_import = td.create_bulk_import(session_name, "mydb", "mytbl")
    try:
        # Upload each file as one part, then freeze the session.
        for file_name in sys.argv[1:]:
            part_name = "part-%s" % (file_name,)
            bulk_import.upload_file(part_name, "json", file_name)
        bulk_import.freeze()
    except:
        # Clean up the session on any failure, then re-raise.
        bulk_import.delete()
        raise
    bulk_import.perform(wait=True)
    if 0 < bulk_import.error_records:
        warnings.warn("detected %d error records." % (bulk_import.error_records,))
    if 0 < bulk_import.valid_records:
        print("imported %d records." % (bulk_import.valid_records,))
    else:
        raise RuntimeError("no records have been imported: %s" % (repr(bulk_import.name),))
    bulk_import.commit(wait=True)
    bulk_import.delete()
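The record-count check between perform and commit in the example above can be pulled out into a function so it is easy to test in isolation. This is a sketch that mirrors the logic shown; `check_import_counts` is a hypothetical name and not part of td-client.

```python
import warnings


def check_import_counts(valid_records, error_records, name="(unnamed)"):
    """Warn about error records; raise if no valid record was imported.

    Mirrors the checks in the bulk import example. Returns the number of
    valid records so the caller can report it.
    """
    if 0 < error_records:
        warnings.warn("detected %d error records." % (error_records,))
    if valid_records <= 0:
        raise RuntimeError("no records have been imported: %r" % (name,))
    return valid_records


if __name__ == "__main__":
    print("imported %d records." % (check_import_counts(10, 0),))
```

In the bulk import script it would be called after perform, for example as `check_import_counts(bulk_import.valid_records, bulk_import.error_records, bulk_import.name)`, before committing.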
Development
Running tests
Run tests.
$ python setup.py test
Running tests (tox)
You can run the tests against all supported Python versions. Installing pyenv is recommended for managing multiple Python versions.
$ pyenv shell system
$ for version in $(cat .python-version); do [ -d "$(pyenv root)/versions/${version}" ] || pyenv install "${version}"; done
$ pyenv shell --unset
Install tox.
$ pip install tox
Then, run tox.
$ tox
Release
Release to PyPI.
$ python setup.py bdist_wheel --universal sdist upload
License
Apache Software License, Version 2.0