The Principal Dev – Masterclass for Tech Leads

The Principal Dev – Masterclass for Tech LeadsJuly 17-18

Join

Parsel

Tests

Supported Python versions

PyPI Version

Coverage report

Parsel is a BSD-licensed Python library to extract data from HTML, JSON, and XML documents.

It supports:

Find the Parsel online documentation at https://parsel.readthedocs.org.

Example (open online demo):

>>> from parsel import Selector
>>> text = """
... <html>
...     <body>
...         <h1>Hello, Parsel!</h1>
...         <ul>
...             <li><a href="http://example.com">Link 1</a></li>
...             <li><a href="http://scrapy.org">Link 2</a></li>
...         </ul>
...         <script type="application/json">{"a": ["b", "c"]}</script>
...     </body>
... </html>"""
>>> selector = Selector(text=text)
>>> selector.css("h1::text").get()
'Hello, Parsel!'
>>> selector.xpath("//h1/text()").re(r"\w+")
['Hello', 'Parsel']
>>> for li in selector.css("ul > li"):
...     print(li.xpath(".//@href").get())
...
http://example.com
http://scrapy.org
>>> selector.css("script::text").jmespath("a").get()
'b'
>>> selector.css("script::text").jmespath("a").getall()
['b', 'c']

Join libs.tech

...and unlock some superpowers

GitHub

We won't share your data with anyone else.