Top Python PDF Libraries 2025

opendatalab/mineru 32K +1411

added 3 months ago

A high-quality tool for converting PDF to Markdown and JSON.

ocrmypdf/ocrmypdf 29K +178

added 3 months ago

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted.

docling-project/docling 28K +1246

added 3 months ago

Docling simplifies document processing, parsing diverse formats — including advanced PDF understanding — and providing seamless integrations with the gen AI ecosystem.

vikparuchuri/marker 24K +357

added 3 months ago

Marker PDF converts documents to markdown, JSON, and HTML quickly and accurately.

py-pdf/pypdf 9K +31

added 3 months ago

A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.

kozea/weasyprint 7K +46

added 3 months ago

WeasyPrint is a smart solution helping web developers to create PDF documents. It turns simple HTML pages into gorgeous statistical reports, invoices, tickets, etc.
