# Treasure Data API library for Python
[![Build Status](https://travis-ci.org/treasure-data/td-client-python.svg)](https://travis-ci.org/treasure-data/td-client-python)
[![Build status](https://ci.appveyor.com/api/projects/status/eol91l1ag50xee9m/branch/master?svg=true)](https://ci.appveyor.com/project/treasure-data/td-client-python/branch/master)
[![Coverage Status](https://coveralls.io/repos/treasure-data/td-client-python/badge.svg)](https://coveralls.io/r/treasure-data/td-client-python)
[![Code Health](https://landscape.io/github/treasure-data/td-client-python/master/landscape.svg?style=flat)](https://landscape.io/github/treasure-data/td-client-python/master)
[![PyPI version](https://badge.fury.io/py/td-client.svg)](http://badge.fury.io/py/td-client)
Treasure Data API library for Python
## Requirements
`td-client` supports the following versions of Python.
* Python 2.7+
* Python 3.3+
* PyPy
## Install
You can install the releases from [PyPI](https://pypi-hypernode.com/).
```sh
$ pip install td-client
```
Installing [certifi](https://pypi-hypernode.com/pypi/certifi) as well is recommended, because it enables SSL certificate verification.
```sh
$ pip install certifi
```
## Examples
See also the examples in the [Treasure Data Documentation](http://docs.treasuredata.com/articles/rest-api-python-client).
### Listing jobs
The Treasure Data API key is read from the environment variable `TD_API_KEY` if none is given via the `apikey=` argument passed to `tdclient.Client`.
The Treasure Data API endpoint `https://api.treasuredata.com` is used by default. You can override it with the environment variable `TD_API_SERVER`, which in turn can be overridden via the `endpoint=` argument passed to `tdclient.Client`. The list of available Treasure Data sites and their corresponding API endpoints can be found [here](https://support.treasuredata.com/hc/en-us/articles/360001474288-Sites-and-Endpoints).
```python
import tdclient

# The API key and endpoint are resolved as described above.
with tdclient.Client() as td:
    for job in td.jobs():
        print(job.job_id)
```
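The precedence rules described above (explicit argument first, then environment variable, then the default endpoint) can be sketched in plain Python. The helper `resolve_td_settings` is hypothetical and is not part of `td-client`; it only mirrors the documented lookup order so that the behaviour is easy to reason about.

```python
import os

# Default endpoint as documented above.
DEFAULT_ENDPOINT = "https://api.treasuredata.com"


def resolve_td_settings(apikey=None, endpoint=None, environ=None):
    """Return the (apikey, endpoint) pair implied by the documented precedence.

    Hypothetical helper for illustration only; td-client performs its own
    lookup internally.
    """
    if environ is None:
        environ = os.environ
    # An explicit apikey argument wins; otherwise fall back to TD_API_KEY.
    resolved_key = apikey if apikey is not None else environ.get("TD_API_KEY")
    # endpoint argument first, then TD_API_SERVER, then the default.
    resolved_endpoint = endpoint or environ.get("TD_API_SERVER") or DEFAULT_ENDPOINT
    return resolved_key, resolved_endpoint


if __name__ == "__main__":
    # Print only the endpoint; avoid echoing the API key.
    print(resolve_td_settings()[1])
```

In real code the same effect is obtained by passing `apikey=` and `endpoint=` straight to `tdclient.Client`, which is usually preferable to exporting secrets into the environment of a shared process.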
### Running jobs
Run a query as a job on Treasure Data, wait for it to finish, and iterate over the result rows.
```python
import tdclient

with tdclient.Client() as td:
    # Submit a Hive query against the sample_datasets database.
    job = td.query("sample_datasets", "SELECT COUNT(1) FROM www_access", type="hive")
    # Block until the job has finished.
    job.wait()
    for row in job.result():
        print(repr(row))
```
### Running jobs via DBAPI2
td-client-python implements [PEP 0249](https://www.python.org/dev/peps/pep-0249/), the Python Database API v2.0.
You can use td-client-python with external libraries that support the Database API, such as [pandas](http://pandas.pydata.org/).
```python
import pandas
import tdclient


def on_waiting(cursor):
    # Called while the job is still running; report its current status.
    print(cursor.job_status())


with tdclient.connect(db="sample_datasets", type="presto", wait_callback=on_waiting) as td:
    data = pandas.read_sql("SELECT symbol, COUNT(1) AS c FROM nasdaq GROUP BY symbol", td)
    print(repr(data))
```
We offer another package for pandas, [pandas-td](https://github.com/treasure-data/pandas-td), with some advanced features.
You may prefer it if you need to do more complicated things, such as exporting result data to Treasure Data or printing a job's progress during long executions.
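Because the connection follows PEP 249, generic Database API code that does not depend on pandas should also work with it. The sketch below shows such a helper; `fetch_rows` is a hypothetical name, and `sqlite3` from the standard library is used only as a stand-in PEP 249 driver so that the example runs without a Treasure Data account. It assumes the connection returned by `tdclient.connect` exposes the standard `cursor()`, `execute()`, `description`, and `fetchall()` members.

```python
import sqlite3


def fetch_rows(connection, sql):
    """Run sql on any PEP 249 connection and return (column_names, rows)."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql)
        # description holds one 7-item sequence per column; item 0 is the name.
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        cursor.close()
    return columns, rows


if __name__ == "__main__":
    # sqlite3 stands in for a Treasure Data connection in this sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nasdaq (symbol TEXT)")
    conn.executemany("INSERT INTO nasdaq VALUES (?)", [("AAPL",), ("AAPL",), ("MSFT",)])
    sql = "SELECT symbol, COUNT(1) AS c FROM nasdaq GROUP BY symbol ORDER BY symbol"
    print(fetch_rows(conn, sql))
    conn.close()
```

With td-client-python, the same helper would presumably be called with the object returned by `tdclient.connect(db="sample_datasets", type="presto")` in place of the `sqlite3` connection.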
### Importing data
Import data into Treasure Data in a streaming manner, similar to what [fluentd](http://www.fluentd.org/) does.
```python
import sys
import tdclient

with tdclient.Client() as td:
    # sys.argv[0] is the script name; the files to import start at index 1.
    for file_name in sys.argv[1:]:
        td.import_file("mydb", "mytbl", "csv", file_name)
```
### Bulk import
Import data into Treasure Data in batch mode.
```python
from __future__ import print_function
import sys
import tdclient
import time
import warnings

# Nothing to do if no input files were given.
if len(sys.argv) <= 1:
    sys.exit(0)

with tdclient.Client() as td:
    session_name = "session-%d" % (int(time.time()),)
    bulk_import = td.create_bulk_import(session_name, "mydb", "mytbl")
    try:
        # Upload each file as one part, then freeze the session.
        for file_name in sys.argv[1:]:
            part_name = "part-%s" % (file_name,)
            bulk_import.upload_file(part_name, "json", file_name)
        bulk_import.freeze()
    except:
        # Clean up the session on any failure, then re-raise.
        bulk_import.delete()
        raise
    bulk_import.perform(wait=True)
    if 0 < bulk_import.error_records:
        warnings.warn("detected %d error records." % (bulk_import.error_records,))
    if 0 < bulk_import.valid_records:
        print("imported %d records." % (bulk_import.valid_records,))
    else:
        raise RuntimeError("no records have been imported: %s" % (repr(bulk_import.name),))
    bulk_import.commit(wait=True)
    bulk_import.delete()
```
## Development
### Running tests
Run tests.
```sh
$ python setup.py test
```
### Running tests (tox)
You can run the tests against all supported Python versions. Installing [pyenv](https://github.com/yyuu/pyenv) is recommended for managing multiple Python versions.
```sh
$ pyenv shell system
$ for version in $(cat .python-version); do [ -d "$(pyenv root)/versions/${version}" ] || pyenv install "${version}"; done
$ pyenv shell --unset
```
Install [tox](https://pypi-hypernode.com/pypi/tox).
```sh
$ pip install tox
```
Then, run `tox`.
```sh
$ tox
```
### Release
Release to PyPI.
```sh
$ python setup.py bdist_wheel --universal sdist upload
```
## Version History
See [CHANGELOG.md](CHANGELOG.md).
## License
Apache Software License, Version 2.0