Top Python PDF Libraries 2025


opendatalab/mineru 27K

Added by sizovs 10 hours ago

A high-quality tool for converting PDF to Markdown and JSON.

ds4sd/docling 23K

Added by sizovs 9 hours ago

Docling simplifies document processing by parsing diverse formats, including advanced PDF understanding, and providing seamless integrations with the gen AI ecosystem.

vikparuchuri/marker 22K

Added by sizovs 9 hours ago

Marker PDF converts documents to Markdown, JSON, and HTML quickly and accurately.

ocrmypdf/ocrmypdf 20K

Added by sizovs 6 hours ago

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted.
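
As a rough illustration of adding a text layer to a scan, the sketch below drives the `ocrmypdf` command-line tool from Python. The helper names `build_ocrmypdf_command` and `run_ocr` are hypothetical, and the sketch assumes the `ocrmypdf` executable is installed and on the PATH. Only the `--language` and `--skip-text` options are used.

```python
import subprocess


def build_ocrmypdf_command(source, target, language="eng", skip_text=True):
    """Assemble an ocrmypdf command line (hypothetical helper, not part of OCRmyPDF)."""
    command = ["ocrmypdf", "--language", language]
    if skip_text:
        # Leave pages that already contain text untouched instead of failing on them.
        command.append("--skip-text")
    command += [source, target]
    return command


def run_ocr(source, target, **options):
    """Run the command; assumes the ocrmypdf executable is available on the PATH."""
    subprocess.run(build_ocrmypdf_command(source, target, **options), check=True)
```

Calling `run_ocr("scan.pdf", "searchable.pdf")` would then write a copy of the scan with a searchable text layer, provided the tool and its OCR language data are installed.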

py-pdf/pypdf 8K

Added by sizovs 4 hours ago

A pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files.
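
To make the splitting use case concrete, here is a minimal sketch that cuts a PDF into fixed-size chunks with pypdf's `PdfReader` and `PdfWriter`. The function names, the `part-001.pdf` naming scheme, and the chunking policy are assumptions made for illustration, not something the listing prescribes.

```python
from pathlib import Path


def chunk_page_ranges(total_pages: int, chunk_size: int) -> list[range]:
    """Return zero-based page ranges covering total_pages in chunks of chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        range(start, min(start + chunk_size, total_pages))
        for start in range(0, total_pages, chunk_size)
    ]


def split_pdf(source: str, out_dir: str, chunk_size: int) -> list[Path]:
    """Write each chunk of pages to its own file and return the paths written."""
    # Imported lazily so the pure helper above works without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(source)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    ranges = chunk_page_ranges(len(reader.pages), chunk_size)
    for index, pages in enumerate(ranges, start=1):
        writer = PdfWriter()
        for page_number in pages:
            writer.add_page(reader.pages[page_number])
        target = out / f"part-{index:03d}.pdf"  # hypothetical naming scheme
        with target.open("wb") as handle:
            writer.write(handle)
        written.append(target)
    return written
```

Merging is the same idea in reverse: pages from several readers are added to one writer before a single `write` call.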

kozea/weasyprint 7K

Added by sizovs 3 hours ago

WeasyPrint is a smart solution helping web developers to create PDF documents. It turns simple HTML pages into gorgeous statistical reports, invoices, tickets, etc.
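
The HTML-to-PDF workflow might look like the sketch below: build a small HTML invoice as a string, then hand it to WeasyPrint's `HTML(string=...).write_pdf(...)`. The invoice layout and both function names are made up for illustration.

```python
from html import escape


def build_invoice_html(customer: str, items: list[tuple[str, float]]) -> str:
    """Render a very small invoice as HTML (hypothetical layout)."""
    rows = "".join(
        f"<tr><td>{escape(name)}</td><td>{price:.2f}</td></tr>"
        for name, price in items
    )
    total = sum(price for _, price in items)
    return (
        "<html><body>"
        f"<h1>Invoice for {escape(customer)}</h1>"
        f"<table>{rows}</table>"
        f"<p>Total: {total:.2f}</p>"
        "</body></html>"
    )


def write_invoice_pdf(customer: str, items: list[tuple[str, float]], target: str) -> None:
    """Convert the generated HTML into a PDF file at target."""
    # Imported lazily so the HTML helper can be used without WeasyPrint installed.
    from weasyprint import HTML

    HTML(string=build_invoice_html(customer, items)).write_pdf(target)
```

Keeping the HTML generation separate from the PDF step means the markup can be previewed in a browser before it is rendered.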
