Smart text extraction from PDF documents

Project description

EDS-PDF

EDS-PDF provides a modular framework for extracting text from PDF documents.

You can use it out-of-the-box, or extend it to fit your use-case.

Getting started

Install the library with pip:

$ pip install edspdf
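
To confirm that the install succeeded, one option is to query the package metadata from Python. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical and is not part of EDS-PDF. It assumes the distribution is registered under the name used in the pip command above, `edspdf`.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# "edspdf" is the distribution name taken from the pip command above.
version = installed_version("edspdf")
if version is None:
    print("edspdf is not installed in this environment")
else:
    print(f"edspdf {version} is installed")
```

Checking the metadata rather than importing the package keeps the check cheap and avoids loading any of the library's dependencies.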

Visit the documentation for more information!

Citation

If you use EDS-PDF, please cite us as below.

@software{edspdf,
  author  = {Dura, Basile and Wajsburt, Perceval and Calliger, Alice and Gérardin, Christel and Bey, Romain},
  doi     = {10.5281/zenodo.6902977},
  license = {BSD-3-Clause},
  title   = {{EDS-PDF: Smart text extraction from PDF documents}},
  url     = {https://github.com/aphp/edspdf}
}

Acknowledgement

We would like to thank Assistance Publique – Hôpitaux de Paris and the AP-HP Foundation for funding this project.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

edspdf-0.7.0.tar.gz (66.9 kB)

File details

Details for the file edspdf-0.7.0.tar.gz.

File metadata

  • Download URL: edspdf-0.7.0.tar.gz
  • Upload date:
  • Size: 66.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.17

File hashes

Hashes for edspdf-0.7.0.tar.gz

Algorithm     Hash digest
SHA256        698329b5221dc17cc25428655f3e284bf85c2b2caaab8c74ae38e419a9c58133
MD5           5e47202698e5d1a1c8e3908fd0a337f8
BLAKE2b-256   8b7dff5369d21663d9e0e644bea890df2ea7c9ee8ededb2a17dd6dd644908f10
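
A downloaded copy of the source distribution can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library (`hashlib`); the helper names `sha256_of` and `matches_published_hash` are hypothetical, and the local file path in the usage comment is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest published for edspdf-0.7.0.tar.gz (copied from the list above).
EXPECTED_SHA256 = "698329b5221dc17cc25428655f3e284bf85c2b2caaab8c74ae38e419a9c58133"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals the expected digest."""
    return sha256_of(path) == expected.lower()


# Usage (assumes the archive was saved to the current directory):
#   matches_published_hash("edspdf-0.7.0.tar.gz")
```

pip can perform the same check at install time when a requirements file pins the digest with `--hash=sha256:...` and is installed with `--require-hashes`.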
