Anonymity library for Python

Project description

PYTHON LIBRARY FOR ANONYMIZATION

This library supports the application of three classical anonymization techniques for tabular data: k-anonymity, l-diversity and t-closeness.
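To make the first two notions concrete, here is a minimal sketch that measures them with plain pandas. It does not use this library's API: the helper names `k_anonymity` and `l_diversity` and the toy table are assumptions made for illustration. A table is k-anonymous when every combination of quasi-identifier values is shared by at least k rows, and l-diverse when every such group contains at least l distinct sensitive values.

```python
import pandas as pd


def k_anonymity(df, quasi_identifiers):
    """Return the size of the smallest equivalence class (hypothetical helper)."""
    return int(df.groupby(quasi_identifiers).size().min())


def l_diversity(df, quasi_identifiers, sensitive):
    """Return the smallest count of distinct sensitive values per class (hypothetical helper)."""
    return int(df.groupby(quasi_identifiers)[sensitive].nunique().min())


# Toy, already generalized table (illustrative data only).
toy = pd.DataFrame(
    {
        "zip": ["3204*", "3204*", "3202*", "3202*"],
        "age": ["20-29", "20-29", "20-29", "20-29"],
        "crime": ["Murder", "Assault", "Theft", "Theft"],
    }
)

k = k_anonymity(toy, ["zip", "age"])
l = l_diversity(toy, ["zip", "age"], "crime")
print(f"k = {k}, l = {l}")
```

In this toy table each group has two rows, but one group holds a single distinct crime, so a k-anonymous table can still leak the sensitive value.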

Installation

We recommend using Python 3 with virtualenv:

> virtualenv .venv -p python3
> source .venv/bin/activate

Then run the following command to install the library and all its requirements:

pip install python-anonymity

Documentation

The python-anonymity documentation is hosted on Read the Docs.

Getting started

Example using the synthetic crime dataset:

> import pandas as pd
> import pycanon
> from anonymity import tools
> from anonymity.tools.utils_k_anon import utils_k_anonymity as utils
> 
> # Toy dataset: one row per individual.
> d = {
>     "name": ["Joe", "Jill", "Sue", "Abe", "Bob", "Amy"],
>     "marital stat": [
>         "Separated",
>         "Single",
>         "Widowed",
>         "Separated",
>         "Widowed",
>         "Single",
>     ],
>     "age": [29, 20, 24, 28, 25, 23],
>     "ZIP code": ["32042", "32021", "32024", "32046", "32045", "32027"],
>     "crime": ["Murder", "Theft", "Traffic", "Assault", "Piracy", "Indecency"],
> }
> data = pd.DataFrame(data=d)
> 
> # Column roles: identifiers, quasi-identifiers and sensitive attributes.
> ID = ["name"]
> QI = ["marital stat", "age", "ZIP code"]
> SA = ["crime"]
> 
> # Generalization hierarchies; age_hierarchy is passed to create_ranges below.
> age_hierarchy = {"age": [0, 2, 5, 10]}
> hierarchy = {
>     "marital stat": [
>         ["Single", "Not married", "*"],
>         ["Separated", "Not married", "*"],
>         ["Divorce", "Not married", "*"],
>         ["Widowed", "Not married", "*"],
>         ["Married", "Married", "*"],
>         ["Re-married", "Married", "*"],
>     ],
>     "ZIP code": [
>         ["32042", "3204*", "*"],
>         ["32021", "3202*", "*"],
>         ["32024", "3202*", "*"],
>         ["32046", "3204*", "*"],
>         ["32045", "3204*", "*"],
>         ["32027", "3202*", "*"],
>     ],
> }
> 
> # Merge the categorical hierarchies with the ranges generated for age.
> mix_hierarchy = dict(hierarchy, **utils.create_ranges(data, age_hierarchy))
> 
> # Anonymize with data_fly for k = 2 and a suppression threshold of 0.
> k = 2
> supp_threshold = 0
> new_data = tools.data_fly(data, ID, QI, k, supp_threshold, mix_hierarchy)
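To illustrate what a hierarchy such as the "ZIP code" one encodes, the following sketch applies one generalization level by hand with plain pandas. It is not how `tools.data_fly` works internally; the `generalize` helper is hypothetical, and it assumes each hierarchy row lists the raw value first, followed by increasingly general replacements.

```python
import pandas as pd

# Same ZIP code hierarchy as in the example above.
zip_hierarchy = [
    ["32042", "3204*", "*"],
    ["32021", "3202*", "*"],
    ["32024", "3202*", "*"],
    ["32046", "3204*", "*"],
    ["32045", "3204*", "*"],
    ["32027", "3202*", "*"],
]


def generalize(series, hierarchy_rows, level):
    """Map each raw value to its ancestor at `level` (hypothetical helper).

    Assumes row[0] is the raw value and row[level] its generalization.
    """
    mapping = {row[0]: row[level] for row in hierarchy_rows}
    return series.map(mapping)


zips = pd.Series(
    ["32042", "32021", "32024", "32046", "32045", "32027"], name="ZIP code"
)

# Level 1 keeps the first four digits and masks the last one.
level1 = generalize(zips, zip_hierarchy, 1)
class_sizes = level1.value_counts()
print(level1.tolist())
print(class_sizes.to_dict())
```

After one level of generalization the six distinct ZIP codes collapse into two groups of three rows each, which is the kind of coarsening a k-anonymity algorithm relies on to reach the requested k.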

License: Apache 2.0.

Note: the library is under heavy development and is intended for testing purposes only.

Download files

Source Distribution

python_anonymity-0.0.1.tar.gz (24.2 MB)

Uploaded Source

Built Distribution

python_anonymity-0.0.1-py3-none-any.whl (6.1 MB)

Uploaded Python 3

File details

Details for the file python_anonymity-0.0.1.tar.gz.

File metadata

  • Download URL: python_anonymity-0.0.1.tar.gz
  • Upload date:
  • Size: 24.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.8

File hashes

Hashes for python_anonymity-0.0.1.tar.gz
Algorithm Hash digest
SHA256 ecb51029b9e70e1904b3fa824f9adaf415b1af578532b587e86446174c3146ab
MD5 8060784df839edd233b9c487fdd24f25
BLAKE2b-256 9d783854951456121a3c932ca8bbe710b1dbca4a306ad24cadc3459187102414

File details

Details for the file python_anonymity-0.0.1-py3-none-any.whl.

File hashes

Hashes for python_anonymity-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 e62d6d47880d95292304fb4a10385ade935605bff4d195b537660cfa21c81fdd
MD5 8d9d0427b51cee88b98cf06c1a8a3610
BLAKE2b-256 01ad74fc255d138ad324b2f6488055c0af85b181937091e90eaed9b1347c8591
