Find and extract content in PDFs converted to XML

# PDFCutter

There are better ways than storing data in a PDF.
**pdfcutter** is for when you need to get it out again.

It works on the XML output of `pdftohtml`, which is part of `poppler-utils`.
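
To make the input format concrete, here is a minimal sketch of producing and reading that XML by hand, independent of pdfcutter. It assumes `pdftohtml` is on the `PATH` and accepts `-xml` (XML output) and `-i` (ignore images), and that each `<page>` holds `<text>` elements with `top`, `left`, `width` and `height` attributes. The helper names and `SAMPLE_XML` are hypothetical and only serve the illustration.

```python
import subprocess
import xml.etree.ElementTree as ET


def pdftohtml_command(pdf_path, out_prefix):
    """Build a pdftohtml call that is expected to write <out_prefix>.xml."""
    return ["pdftohtml", "-xml", "-i", pdf_path, out_prefix]


def convert(pdf_path, out_prefix):
    """Run pdftohtml; raises CalledProcessError if the tool reports a failure."""
    subprocess.run(pdftohtml_command(pdf_path, out_prefix), check=True)


def read_text_elements(xml_data):
    """Return one dict per <text> element, tagged with its page number."""
    root = ET.fromstring(xml_data)
    elements = []
    for page in root.iter("page"):
        for text in page.iter("text"):
            elements.append({
                "page": int(page.get("number")),
                "top": int(text.get("top")),
                "left": int(text.get("left")),
                "width": int(text.get("width")),
                "height": int(text.get("height")),
                # itertext() also collects text nested in inline tags like <b>.
                "text": "".join(text.itertext()),
            })
    return elements


# Hand-written sample in the assumed layout, not output from a real PDF.
SAMPLE_XML = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1263" width="892">
<text top="100" left="50" width="40" height="15" font="0">Name:</text>
<text top="100" left="120" width="90" height="15" font="0"><b>Ada Lovelace</b></text>
</page>
</pdf2xml>"""

for element in read_text_elements(SAMPLE_XML):
    print(element["page"], element["left"], element["top"], element["text"])
```

With a real file, calling `convert('./some.pdf', './some')` and passing the bytes of `./some.xml` to `read_text_elements` should give the same kind of list.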


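```python
import pdfcutter

# Open the PDF to extract from.
cutter = pdfcutter.PDFCutter(filename='./some.pdf')

# Find the "Name:" label on page 1, then read the text to its right.
name_label = cutter.filter(page=1, search='Name:')
name = cutter.filter(page=1).strictly_right_of(name_label).text()
```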
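
The idea behind selecting text "right of" a label can be sketched with plain bounding boxes. This is not pdfcutter's implementation, and the exact rule `strictly_right_of` applies is not stated here. The sketch assumes one plausible reading: keep boxes whose left edge is at or past the label's right edge and which overlap the label vertically. `Box` and `right_of_on_same_line` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A text box in pdftohtml-style coordinates (origin at the top-left)."""
    text: str
    top: int
    left: int
    width: int
    height: int

    @property
    def right(self):
        return self.left + self.width

    @property
    def bottom(self):
        return self.top + self.height


def right_of_on_same_line(boxes, label):
    """Keep boxes starting at or past the label's right edge that overlap it vertically."""
    hits = [
        box for box in boxes
        if box.left >= label.right
        and box.top < label.bottom
        and box.bottom > label.top
    ]
    # Order the matches left to right so joined text reads naturally.
    return sorted(hits, key=lambda box: box.left)


label = Box("Name:", top=100, left=50, width=40, height=15)
boxes = [
    label,
    Box("Ada Lovelace", top=100, left=120, width=90, height=15),
    Box("Address:", top=130, left=50, width=55, height=15),
]
print(" ".join(box.text for box in right_of_on_same_line(boxes, label)))
# prints: Ada Lovelace
```

The label itself and the box on the next line are both rejected, the first because it does not start past the label's right edge and the second because it sits below the label.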



Download files

Source Distribution

pdfcutter-0.0.1.tar.gz (7.5 kB)

Built Distribution

pdfcutter-0.0.1-py3-none-any.whl (8.3 kB)

File details

Details for the file pdfcutter-0.0.1.tar.gz.

File metadata

  • Download URL: pdfcutter-0.0.1.tar.gz
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.18.4 setuptools/39.2.0 requests-toolbelt/0.8.0 tqdm/4.23.4 CPython/3.6.5

File hashes

Hashes for pdfcutter-0.0.1.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 178fce2b6cf8f27bb9ec8b21207c41c38af481ca23f266d61ee29f8444f0ed19 |
| MD5 | 1bb680cf315e0983202d3c0463f5e9fb |
| BLAKE2b-256 | 29cb73d52fd296d38bd45846a497a0bfc62a43b5cc2119d12cbf397228e6b23d |

File details

Details for the file pdfcutter-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: pdfcutter-0.0.1-py3-none-any.whl
  • Size: 8.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.18.4 setuptools/39.2.0 requests-toolbelt/0.8.0 tqdm/4.23.4 CPython/3.6.5

File hashes

Hashes for pdfcutter-0.0.1-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 99447f47302afcdb2fa60fc7012e367c7721be975042c89be55f5ce8a6889bbe |
| MD5 | 2b3bb4ffbfcc66206ec040134b1e0221 |
| BLAKE2b-256 | f6d23f62e276c25f57dfabeed74b6245fab67ba33d9b188cabff501608ffce87 |
