
A command-line tool for using the Anon AI web service.


Anon AI Toolbelt
================

The Anon AI Toolbelt is a command line interface (CLI) tool for managing
and anonymising data with the `Anon AI web service <https://anon.ai>`__.
It's developed in Python and the code is published under the `MIT
License <https://github.com/anon-ai/toolbelt/blob/master/LICENSE>`__ at
`github.com/anon-ai/toolbelt <https://github.com/anon-ai/toolbelt>`__.

*Caution: the toolbelt is under active development. Core functionality
works but commands and options are liable to change and some of the
features that are documented don't yet exist.*

Installation
------------

Install using ``pip`` into a Python 3 environment:

.. code:: bash

    pip install anon-ai-toolbelt

Note that the toolbelt only works with Python 3 and installs dependencies
including the `Python Cryptography
Toolkit <https://pypi-hypernode.com/pypi/pycrypto>`__.

Usage
-----

The primary workflow is for a data controller to ``push`` data into the
system and then for data processors to ``pull`` the data down in
anonymised form.

- `anon login <#login>`__
- `anon push INPUT\_FILE RESOURCE <#push>`__
- `anon pull RESOURCE OUTPUT\_FILE <#pull>`__
- `anon pipe URL OUTPUT\_FILE <#pipe>`__

.. raw:: html

    <!--

    - [anon locate RESOURCE](#locate)
    - [anon analyse RESOURCE](#analyse)
    - [anon inspect RESOURCE](#inspect)
    -->

Login
~~~~~

Log in with your API credentials (writes to
``~/.config/anon.ai/config.json``):

.. code:: bash

    anon login
    > key: ...
    > secret: ...

Push
~~~~

Push a data snapshot up to the service, which ingests and stores it:

.. code:: bash

    anon push foo.dump mydb

When ingesting structured data you should specify the data format:

.. code:: bash

    anon push foo.dump mydb --format postgres

In this example, ``mydb`` is an arbitrary resource name that you use to
identify this ingested data source. Subsequent pushes to the same name
are usually used to store a new snapshot of the same file or database.

The stored data is encrypted using AES-256 with a per-account encryption
key that lives in (and never leaves) a `secure
vault <https://www.vaultproject.io/>`__. You can also optionally provide
your own encryption key:

.. code:: bash

    anon push foo.dump mydb --encryption-key LONG_RANDOM_STRING

Note that:

1. your encryption key is **never persisted** in our system -- so you
   have to manage it and give it to any users that you want to share
   anonymised data with
2. there's no strict requirement on length or format for your encryption
   key value (we SHA-256 hash it along with your per-account encryption
   key) but we recommend at least 16 bytes of entropy
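
As a rough illustration of the two notes above, the following Python sketch generates a key with comfortably more than 16 bytes of entropy and shows why any length or format is acceptable once the value is hashed. The function names, the placeholder account key and the concatenation order are assumptions made for this example; the service's actual derivation is not documented here.

```python
import hashlib
import secrets


def generate_user_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes random bytes."""
    return secrets.token_urlsafe(num_bytes)


def derive_storage_key(account_key: bytes, user_key: str) -> bytes:
    """Hypothetical derivation: SHA-256 over the account key and the user key.

    The real concatenation order and encoding are assumptions.
    """
    return hashlib.sha256(account_key + user_key.encode("utf-8")).digest()


# Placeholder standing in for the per-account key held in the vault.
placeholder_account_key = b"\x00" * 32

user_key = generate_user_key()
storage_key = derive_storage_key(placeholder_account_key, user_key)

# A SHA-256 digest is always 32 bytes, the key size AES-256 expects,
# regardless of how long the user-supplied key is.
print(len(storage_key))
```

The string returned by ``generate_user_key`` is the kind of value you could pass as ``LONG_RANDOM_STRING`` to ``--encryption-key``; store it somewhere safe, since it is never persisted by the service.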

Pull
~~~~

Pull down an anonymised copy of an ingested data snapshot:

.. code:: bash

    anon pull mydb foo.dump

Optionally provide an encryption key (to decrypt the stored data with)
and / or configure how you'd like it anonymised:

.. code:: bash

    anon pull mydb foo.dump --config config.json --encryption-key ...

Pipe
~~~~

Pipe data through to anonymise it:

.. code:: bash

    anon pipe http://humanstxt.org/humans.txt /tmp/humans.anon.txt

This parses, analyses and anonymises the data on the fly, i.e. without
persisting it. The data source must currently be a URL.

.. raw:: html

    <!--

    ### Locate

    As an alternative to pulling down the data locally, you can get a temporary download URL:

    ```bash
    anon locate mydb
    ```

    This writes a temporary url to stdout. As with `pull`, you can optionally specify an encryption key and configure anonymisation:

    ```bash
    anon locate mydb --config config.json --encryption-key ...
    ```

    You can also control the timeout duration for the URL. This defaults to 30 minutes and can be a maximum of 24 hours:

    ```bash
    anon locate mydb --timeout 2h
    ```

    ### Analyse

    Analyse a snapshot to get our structural analysis of the data:

    ```bash
    anon analyse mydb > analysis.json
    ```

    ### Inspect

    Inspect a resource name to list the versions and see its status:

    ```bash
    anon inspect mydb
    ```

    -->

Versions
~~~~~~~~

You can ``pull`` specific snapshot versions by targeting them by name:

.. code:: bash

    anon pull mydb --snapshot someid

You can also ``push`` snapshots up with a specific name:

.. code:: bash

    anon push foo.sql mydb --snapshot someid
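
If you script regular snapshots, it can help to assemble the command line programmatically. The sketch below only builds argument lists from the options shown in this document (``--format``, ``--snapshot``); the helper names and the date-based snapshot name are hypothetical, and whether the service accepts a particular snapshot name format is an assumption.

```python
from datetime import date
from typing import List, Optional


def build_push_command(
    input_file: str,
    resource: str,
    snapshot: Optional[str] = None,
    data_format: Optional[str] = None,
) -> List[str]:
    """Build an ``anon push`` argument list without running it."""
    cmd = ["anon", "push", input_file, resource]
    if data_format:
        cmd += ["--format", data_format]
    if snapshot:
        cmd += ["--snapshot", snapshot]
    return cmd


def build_pull_command(
    resource: str,
    output_file: str,
    snapshot: Optional[str] = None,
) -> List[str]:
    """Build an ``anon pull`` argument list without running it."""
    cmd = ["anon", "pull", resource, output_file]
    if snapshot:
        cmd += ["--snapshot", snapshot]
    return cmd


# Hypothetical naming scheme: one snapshot per calendar day.
snapshot_name = date(2018, 1, 31).isoformat()

push_cmd = build_push_command("foo.sql", "mydb", snapshot_name, "postgres")
pull_cmd = build_pull_command("mydb", "foo.dump", snapshot_name)

print(" ".join(push_cmd))
print(" ".join(pull_cmd))
```

With the toolbelt installed and ``anon login`` completed, either list could be handed to ``subprocess.run(cmd, check=True)`` to execute the command.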

Tab completion
~~~~~~~~~~~~~~

Enable ``bash`` completion by adding the following to your ``.bashrc``:

.. code:: bash

    eval "$(_ANON_COMPLETE=source anon)"

If you use ``zsh``, you can emulate bash completion by first adding
``bashcompinit`` to your ``.zshrc``:

.. code:: bash

    autoload bashcompinit
    bashcompinit
    eval "$(_ANON_COMPLETE=source anon)"

For more information see `Anon AI <https://anon.ai>`__.


