Top Python HTML Libraries 2025

psf/requests-html 13K +4

added 1 month ago

This library intends to make parsing HTML as simple and intuitive as possible.

lxml/lxml 2K +5

added 1 month ago

lxml is a feature-rich, easy-to-use library for processing XML and HTML in Python.
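
As a rough illustration of what "processing HTML" looks like with lxml, the sketch below parses a string and pulls link targets out with XPath. The sample markup and the `extract_links` helper are made up for this example; only `lxml.html.fromstring` and `.xpath()` come from the library.

```python
from lxml import html

# Hypothetical sample markup, used only for illustration.
SAMPLE = (
    "<html><body><h1>Docs</h1>"
    "<a href='/a'>A</a><a href='/b'>B</a>"
    "</body></html>"
)


def extract_links(markup: str) -> list[str]:
    """Return every href attribute found on <a> elements."""
    tree = html.fromstring(markup)   # parse the string into an element tree
    return tree.xpath("//a/@href")   # XPath query selecting the attribute values


if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

XPath is only one option here; lxml elements also expose methods such as `iterlinks()` for the same job.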

buriy/python-readability 2K +4

added 1 month ago

Given an HTML document, extract and clean up the main body text and title.

mozilla/bleach 2K +3

added 1 month ago

Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes.

gawel/pyquery 2K +4

added 1 month ago

A jQuery-like library for Python.

alir3z4/html2text 1K +36

added 1 month ago

Convert HTML to Markdown-formatted text.

scrapy/parsel 1K +4

added 1 month ago

Parsel lets you extract data from XML/HTML/JSON documents using XPath or CSS selectors.

html5lib/html5lib-python 1K +1

added 1 month ago

Standards-compliant library for parsing and serializing HTML documents and fragments in Python.
