AHL Research Versioned TimeSeries and Tick store

# [![arctic](logo/arctic_50.png)](https://github.com/manahl/arctic) [Arctic TimeSeries and Tick store](https://github.com/manahl/arctic)


[![Circle CI](https://circleci.com/gh/manahl/arctic.svg?style=shield)](https://circleci.com/gh/manahl/arctic)
[![Coverage Status](https://coveralls.io/repos/github/manahl/arctic/badge.svg?branch=master)](https://coveralls.io/github/manahl/arctic?branch=master)
[![Join the chat at https://gitter.im/manahl/arctic](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/manahl/arctic?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)

Arctic is a high-performance datastore for numeric data. It supports [Pandas](http://pandas.pydata.org/),
[numpy](http://www.numpy.org/) arrays and pickled objects out-of-the-box, with pluggable support for
other data types and optional versioning.

Arctic can query millions of rows per second per client, achieves ~10x compression on network bandwidth,
~10x compression on disk, and scales to hundreds of millions of rows per second per
[MongoDB](https://www.mongodb.org/) instance.

Arctic has been under active development at [Man AHL](http://www.ahl.com/) since 2012.

## Quickstart

### Install Arctic

```
pip install git+https://github.com/manahl/arctic.git
```

### Run a MongoDB

```
mongod --dbpath <path/to/db_directory>
```

### Using VersionStore

```
from arctic import Arctic
import Quandl  # assumed: the Quandl client package, needed for Quandl.get below

# Connect to a local MongoDB instance
store = Arctic('localhost')

# Create the library - defaults to VersionStore
store.initialize_library('NASDAQ')

# Access the library
library = store['NASDAQ']

# Load some data - maybe from Quandl
aapl = Quandl.get("NASDAQ/AAPL", authtoken="your token here")

# Store the data in the library
library.write('AAPL', aapl, metadata={'source': 'Quandl'})

# Reading the data
item = library.read('AAPL')
aapl = item.data
metadata = item.metadata
```

VersionStore supports much more: [See the HowTo](howtos/how_to_use_arctic.py)!


### Adding your own storage engine

Plugging a custom class in as a library type is straightforward. [This example
shows how.](howtos/how_to_custom_arctic_library.py)



## Concepts

### Libraries

Arctic provides namespaced *libraries* of data. These libraries allow
bucketing data by *source*, *user* or some other metric (for example frequency:
End-Of-Day; Minute Bars; etc.).

Arctic supports multiple data libraries per user. A user (or namespace)
maps to a MongoDB database (the granularity of MongoDB authentication). The library
itself is composed of a number of collections within the database. Libraries look like:

* user.EOD
* user.ONEMINUTE

A library is mapped to a Python class. All library databases in MongoDB are prefixed with 'arctic_'.
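
The naming scheme above can be sketched in a few lines of plain Python. This is an illustration only, not Arctic's own code: `parse_library_name` and `DB_PREFIX` are hypothetical names, and the fallback for a library name without a namespace (a shared database named after the prefix alone) is an assumption.

```python
DB_PREFIX = "arctic"  # from the text: library databases are prefixed with 'arctic_'


def parse_library_name(library):
    """Split a name such as 'user.EOD' into (database_name, library_name).

    Assumption: a name with no namespace falls back to a shared
    database named after the prefix alone.
    """
    namespace, sep, name = library.partition(".")
    if not sep:
        # No '.' found, so there is no user/namespace part.
        return DB_PREFIX, library
    return "{}_{}".format(DB_PREFIX, namespace), name


for lib in ("user.EOD", "user.ONEMINUTE"):
    print(parse_library_name(lib))
```

Under these assumptions both example libraries land in the same database, `arctic_user`, which is why authentication is granted per user rather than per library.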

### Storage Engines

Arctic includes two storage engines:

* [VersionStore](arctic/store/version_store.py): a key-value versioned TimeSeries store. It supports:
    * Pandas data types (other Python types pickled)
    * Multiple versions of each data item. Can easily read previous versions.
    * Create point-in-time snapshots across symbols in a library
    * Soft quota support
    * Hooks for persisting other data types
    * Audited writes: API for saving metadata and data before and after a write.
    * A wide range of TimeSeries data frequencies: End-Of-Day to Minute bars
    * [See the HowTo](howtos/how_to_use_arctic.py)
* [TickStore](arctic/tickstore/tickstore.py): Column oriented tick database. Supports
  dynamic fields, chunks aren't versioned. Designed for large continuously ticking data.

Arctic storage implementations are **pluggable**. VersionStore is the default.
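
To make the versioning and snapshot ideas concrete, here is a toy in-memory sketch. It is not how VersionStore is implemented (there is no MongoDB, compression or chunking here). `ToyVersionStore` and all of its method and parameter names are hypothetical, chosen only for this illustration.

```python
import copy


class ToyVersionStore(object):
    """In-memory illustration of versions and snapshots; not Arctic's code."""

    def __init__(self):
        self._versions = {}   # symbol -> list of (data, metadata); version = index + 1
        self._snapshots = {}  # snapshot name -> {symbol: version number}

    def write(self, symbol, data, metadata=None):
        """Append a new version instead of overwriting; return its number."""
        history = self._versions.setdefault(symbol, [])
        history.append((copy.deepcopy(data), metadata))
        return len(history)

    def snapshot(self, name):
        """Pin the current head version of every symbol under one name."""
        self._snapshots[name] = {s: len(h) for s, h in self._versions.items()}

    def read(self, symbol, as_of=None):
        """Read the latest version, a version number, or a snapshot name."""
        history = self._versions[symbol]
        if as_of is None:
            version = len(history)
        elif isinstance(as_of, str):
            version = self._snapshots[as_of][symbol]
        else:
            version = as_of
        if not 1 <= version <= len(history):
            raise KeyError("no version %r for %r" % (version, symbol))
        data, metadata = history[version - 1]
        return version, copy.deepcopy(data), metadata


store = ToyVersionStore()
store.write("AAPL", [100.0, 101.5], metadata={"source": "Quandl"})
store.snapshot("eod-monday")
store.write("AAPL", [100.0, 101.5, 99.8])

latest = store.read("AAPL")                      # head version
pinned = store.read("AAPL", as_of="eod-monday")  # version pinned by the snapshot
first = store.read("AAPL", as_of=1)              # explicit version number
```

The design point the sketch tries to show is that a write never destroys earlier data, so a snapshot only needs to record which version of each symbol was current when it was taken.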


## Requirements

Arctic currently works with:

* Python 2.7
* pymongo >= 3.0
* Pandas
* MongoDB >= 2.4.x


## Acknowledgements

Arctic wouldn't be possible without the work of the AHL Data Engineering Team, including:

* [Richard Bounds](https://github.com/richardbounds)
* [James Blackburn](https://github.com/jamesblackburn)
* [Vlad Mereuta](https://github.com/vmereuta)
* [Tom Taylor](https://github.com/TomTaylorLondon)
* Tope Olukemi
* Drake Siard
* [Slavi Marinov](https://github.com/slavi)
* [Wilfred Hughes](https://github.com/wilfred)
* [Edward Easton](https://github.com/eeaston)
* ... and many others ...

Contributions welcome!

## License

Arctic is licensed under the GNU LGPL v2.1, a copy of which is included in [LICENSE](LICENSE).



## Changelog

### 1.12 (2015-11-12)

* Bugfix: correct version detection for Pandas >= 0.18.
* Bugfix: retrying connection initialisation in case of an AutoReconnect failure.

### 1.11 (2015-10-29)

* Bugfix: Improve performance of saving multi-index Pandas DataFrames
by 9x
* Bugfix: authenticate should propagate non-OperationFailure exceptions
(e.g. ConnectionFailure) as this might be indicative of socket failures
* Bugfix: return 'deleted' state in VersionStore.list_versions() so that
callers can pick up on the head version being the delete-sentinel.

### 1.10 (2015-10-28)

* Bugfix: VersionStore.read(date_range=...) could do the wrong thing with
TimeZones (which aren't yet supported for date_range slicing).

### 1.9 (2015-10-06)

* Bugfix: fix authentication race condition when sharing an Arctic
instance between multiple threads.

### 1.8 (2015-09-29)

* Bugfix: compatibility with both 3.0 and pre-3.0 MongoDB for
querying current authentications

### 1.7 (2015-09-18)

* Feature: Add support for reading a subset of a pandas DataFrame
in VersionStore.read by passing in an arctic.date.DateRange
* Bugfix: Reauth against admin if not auth'd against a specific
library's DB. Sometimes we appear to miss admin DB auths.
This is a workaround until we work out what the issue is.

### 1.6 (2015-09-16)

* Feature: Add support for multi-index Bitemporal DataFrame storage.
This allows persisting data and changes within the DataFrame making it
easier to see how old data has been revised over time.
* Bugfix: Ensure we call the error logging hook when exceptions occur

### 1.5 (2015-09-02)

* Always use the primary cluster node for 'has_symbol()', it's safer

### 1.4 (2015-08-19)

* Bugfixes for timezone handling, now ensures use of non-naive datetimes
* Bugfix for tickstore read missing images

### 1.3 (2015-08-11)

* Improvements to command-line control scripts for users and libraries
* Bugfix for pickling top-level Arctic object

### 1.2 (2015-06-29)

* Allow snapshotting a range of versions in the VersionStore, and
snapshot all versions by default.

### 1.1 (2015-06-16)

* Bugfix for backwards-compatible unpickling of bson-encoded data
* Added switch for enabling parallel lz4 compression

### 1.0 (2015-06-14)

* Initial public release
