basic streaming text processing
pyin
====
[![Build Status](https://travis-ci.org/geowurster/pyin.svg?branch=master)](https://travis-ci.org/geowurster/pyin) [![Coverage Status](https://coveralls.io/repos/geowurster/pyin/badge.svg?branch=master)](https://coveralls.io/r/geowurster/pyin?branch=master)
Perform Python operations on every line read from `stdin`. Every line is
evaluated individually and available via a variable called `line`.
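The per-line evaluation described above can be approximated in a few lines of Python. This is a minimal sketch and not pyin's actual implementation: the function name `apply_expression` is hypothetical, and it assumes the expression is evaluated with `eval` with only `line` in scope.

```python
import io


def apply_expression(expression, lines):
    """Evaluate `expression` once per input line, exposing the line as `line`."""
    # Compile once so the expression is not re-parsed for every line.
    code = compile(expression, "<expression>", "eval")
    for line in lines:
        # `line` keeps its trailing newline, as it does when iterating a file.
        yield eval(code, {}, {"line": line})


# Hypothetical input standing in for stdin.
stdin = io.StringIO("alpha\nbeta\n")
results = list(apply_expression("line.upper()", stdin))
print(results)
```

Because each line is handled on its own, the expression cannot see the previous or next line.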
Installing
----------
Via pip:
$ pip install git+https://github.com/geowurster/pyin.git
From a local clone of the master branch:
$ git clone https://github.com/geowurster/pyin
$ cd pyin
$ pip install -e .
Examples
--------
Change the newline character in a CSV from `\n` to `\r\n`:
$ more sample-data/csv-with-header.csv | pyin "line.replace('\n', '\r\n')" > output.csv
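To see what that expression does to each line, the same call can be run in plain Python. The rows below are made up for illustration and only stand in for the sample CSV.

```python
# Hypothetical CSV rows standing in for sample-data/csv-with-header.csv.
rows = ["id,name\n", "1,alpha\n"]

# The same expression used in the command, applied to each line.
converted = [line.replace("\n", "\r\n") for line in rows]

print(repr("".join(converted)))
```

Note that `str.replace` is not aware of existing line endings: a line that already ends in `\r\n` would come out ending in `\r\r\n`.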
Extract a BigQuery schema from an existing table and pretty print it:
```console
$ bq show --format=json ${DATASET}.${TABLE} | pyin -m json -m pprint "pprint.pformat(json.loads(line)['schema']['fields'])"
[{u'mode': u'NULLABLE', u'name': u'mmsi', u'type': u'STRING'},
{u'mode': u'NULLABLE', u'name': u'longitude', u'type': u'FLOAT'},
{u'mode': u'NULLABLE', u'name': u'latitude', u'type': u'FLOAT'}
...]
```
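For reference, this is roughly what the expression in that pipeline does to a single line. The record below is hypothetical and only mimics the shape of the output shown above; it is not real `bq` output.

```python
import json
import pprint

# Hypothetical one-line JSON document shaped like the `bq show` output.
line = json.dumps({
    "schema": {
        "fields": [
            {"mode": "NULLABLE", "name": "mmsi", "type": "STRING"},
            {"mode": "NULLABLE", "name": "longitude", "type": "FLOAT"},
        ]
    }
})

# The same expression that is passed to pyin in the example.
formatted = pprint.pformat(json.loads(line)["schema"]["fields"])
print(formatted)
```

The whole JSON document has to arrive on one line for this to work, since `json.loads` is called on each line separately.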
Gotchas
-------
It's easy to completely replace the line content, newline included. A constant expression discards the input line, so the results run together:
$ pyin -i sample-data/csv-with-header.csv "'operation'"
operationoperationoperationoperationoperationoperation
Forgetting `-t`, which keeps only the lines for which the expression evaluates as `True`, prints the result of the expression instead of the line:
$ pyin -i LICENSE.txt "'are' in line"
FalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalse
$ pyin -i LICENSE.txt "'are' in line" -t
modification, are permitted provided that the following conditions are met:
derived from this software without specific prior written permission.
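The difference between the two runs can be sketched as follows. This is an assumption about the semantics inferred from the output shown above, not pyin's source: the function `evaluate` and its `only_true` parameter are hypothetical stand-ins for the tool and its `-t` flag.

```python
def evaluate(expression, lines, only_true=False):
    """Yield the text to write for each line (hypothetical model of `-t`)."""
    code = compile(expression, "<expression>", "eval")
    for line in lines:
        result = eval(code, {}, {"line": line})
        if only_true:
            # With the flag set, a truthy result passes the original line through.
            if result:
                yield line
        else:
            # Without it, the result itself is written and no newline is added.
            yield str(result)


# Hypothetical input lines.
lines = ["no match\n", "conditions are met\n"]
without_flag = "".join(evaluate("'are' in line", lines))
with_flag = "".join(evaluate("'are' in line", lines, only_true=True))
print(repr(without_flag))
print(repr(with_flag))
```

In this model the first call produces the run-together `True`/`False` text and the second produces only the matching line, which matches the two outputs above.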
Developing
----------
Install:
$ pip install virtualenv
$ git clone https://github.com/geowurster/pyin
$ cd pyin
$ virtualenv venv
$ source venv/bin/activate
$ pip install -r requirements-dev.txt
$ pip install -e .
Test:
$ nosetests
Coverage:
$ nosetests --with-coverage
Lint:
$ pep8 --max-line-length=120 pyin.py