Skip to main content

Data package representation for cell migration tracking data

Project description

This Python project aims to create a simple Python package to produce data packages of cell migration tracking files. The final goal is to have a uniform, standardized way to represent these data, as in Frictionless Data and Data Packages .

Steps to follow to use the package:

  • step 1 - Install the package (note it’s Python 3 only at the moment):

python setup.py install
  • step 2 - create a biotracks.ini configuration file and place it in the same directory as your tracking file. The file must be structured as follows:

[TOP_LEVEL_INFO]
author = the author of the dp
title = a title describing the dp
name = a name for the dp
author_institute = the insitute of the author
author_email = a valid email address

[TRACKING_DATA]
x_coord_cmso = the column name pointing to the x coordinate
y_coord_cmso = the column name pointing to the y coordinate
z_coord_cmso = the column name pointing to the z coordinate
frame_cmso = the column name pointing to the frame information
object_id_cmso = the object identifier
link_id_cmso = the link identifier
  • step 3 - move to the scripts directory and run:

python create_dpkg.py your_tracking_file

this will create a dp directory containing:

  • a csv file for the objects (i.e. cells)

  • a csv file for the links (i.e. a linear collection of objects)

  • and a dp.json file containing the json schemas of the two csv files.

This last file will look something like this:

{
    "resources": [{
        "name": "objects_table",
        "schema": {
            "primaryKey": "SPOT_ID",
            "fields": [{
                "name": "SPOT_ID",
                "title": "",
                "description": "",
                "constraints": {
                    "unique": true
                },
                "type": "integer",
                "format": "default"
            }, {
                "type": "integer",
                "name": "FRAME",
                "title": "",
                "format": "default",
                "description": ""
            }, {
                "type": "number",
                "name": "POSITION_X",
                "title": "",
                "format": "default",
                "description": ""
            }, {
                "type": "number",
                "name": "POSITION_Y",
                "title": "",
                "format": "default",
                "description": ""
            }]
        },
        "path": "objects.csv"
    }, {
        "name": "links_table",
        "schema": {
            "foreignKeys": [{
                "fields": "SPOT_ID",
                "reference": {
                    "resource": "objects_table",
                    "fields": "SPOT_ID",
                    "datapackage": ""
                }
            }],
            "fields": [{
                "type": "integer",
                "name": "LINK_ID",
                "title": "",
                "format": "default",
                "description": ""
            }, {
                "type": "integer",
                "name": "SPOT_ID",
                "title": "",
                "format": "default",
                "description": ""
            }]
        },
        "path": "links.csv"
    }],
    "name": "CMSO_tracks",
    "title": "A CMSO data package representation of cell tracking data",
    "author_email": "paola.masuzzo@email.com",
    "author_institute": "VIB",
    "author": "paola masuzzo"
}

Then, the datapackage is pushed to a pandas dataframe. At the moment, this dataframe is used to create simple visualizations of links and turning angles.

Project details


Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page