A Python library for automating interaction with websites
Overview
A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It doesn’t do JavaScript.
MechanicalSoup was created by M Hickford, who was a fond user of the Mechanize library. Unfortunately, Mechanize was incompatible with Python 3 until 2019, and its development stalled for several years. MechanicalSoup provides a similar API, built on the Python giants Requests (for HTTP sessions) and BeautifulSoup (for document navigation). Since 2017 it has been actively maintained by a small team including @hemberger and @moy.
Installation
MechanicalSoup requires Python 3. PyPy3 is also supported (and tested against).
Download and install the latest released version from PyPI:
pip install MechanicalSoup
Download and install the development version from GitHub:
pip install git+https://github.com/MechanicalSoup/MechanicalSoup
Installing from source (installs the version in the current working directory):
python setup.py install
(In all cases, add --user to the install command to install in the current user’s home directory.)
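After any of the install commands above, a quick way to confirm which version ended up in the environment is to query the installed package metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this example.

```python
from importlib import metadata


def installed_version(distribution="MechanicalSoup"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("MechanicalSoup is not installed in this environment")
else:
    print("MechanicalSoup", version)
```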
Documentation
The full documentation is available at https://mechanicalsoup.readthedocs.io/. You may want to jump directly to the automatically generated API documentation.
Example
From examples/expl_qwant.py, code to get the results from a Qwant search:
"""Example usage of MechanicalSoup to get the results from the Qwant
search engine.
"""
import re
import urllib.parse

import mechanicalsoup
# Connect to Qwant
browser = mechanicalsoup.StatefulBrowser(user_agent='MechanicalSoup')
browser.open("https://lite.qwant.com/")
# Fill-in the search form
browser.select_form('#search-form')
browser["q"] = "MechanicalSoup"
browser.submit_selected()
# Display the results
for link in browser.page.select('.result a'):
    # Qwant shows redirection links, not the actual URL, so extract
    # the actual URL from the redirect link:
    href = link.attrs['href']
    m = re.match(r"^/redirect/[^/]*/(.*)$", href)
    if m:
        href = urllib.parse.unquote(m.group(1))
    print(link.text, '->', href)
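The redirect-unwrapping step in the loop above can be tried on its own, without contacting Qwant. The sketch below wraps the same regular expression and `urllib.parse.unquote` call in a helper; the function name `unwrap_redirect` and the sample href values are made up for illustration, and real Qwant links may look different.

```python
import re
import urllib.parse

# Same pattern as in the example: "/redirect/<token>/<percent-encoded URL>".
REDIRECT_RE = re.compile(r"^/redirect/[^/]*/(.*)$")


def unwrap_redirect(href):
    """Return the decoded target of a redirect link, or href unchanged."""
    m = REDIRECT_RE.match(href)
    if m:
        return urllib.parse.unquote(m.group(1))
    return href


# Hypothetical sample values.
samples = [
    "/redirect/abc123/https%3A%2F%2Fexample.com%2Fpage",
    "https://example.org/direct",
]
for sample in samples:
    print(sample, "->", unwrap_redirect(sample))
```

Links that do not match the redirect pattern pass through untouched, which mirrors the `if m:` guard in the original loop.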
More examples are available in examples/.
For an example with a more complex form (checkboxes, radio buttons and textareas), read tests/test_browser.py and tests/test_form.py.
Development
Instructions for building, testing and contributing to MechanicalSoup: see CONTRIBUTING.rst.
Common problems
Read the FAQ.
Download files
Source Distribution
Built Distribution
File details
Details for the file MechanicalSoup-1.2.0.tar.gz.
File metadata
- Download URL: MechanicalSoup-1.2.0.tar.gz
- Upload date:
- Size: 49.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.14
File hashes
Algorithm | Hash digest
---|---
SHA256 | f8593ab4474a82ceb6b7ddb3e35a8c69f8455a06adafd2b5258c1d852158d760
MD5 | 54039c0f140296e05e63882087054392
BLAKE2b-256 | 39220e9effb67fb2b360400193fe2e641b2362ce3d55e1aab06445235035cd7e
File details
Details for the file MechanicalSoup-1.2.0-py3-none-any.whl.
File metadata
- Download URL: MechanicalSoup-1.2.0-py3-none-any.whl
- Upload date:
- Size: 19.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.14
File hashes
Algorithm | Hash digest
---|---
SHA256 | e6161d5c715a7bc6fbdb2f1550c9df98e339ee67fed825d695a89da7ba461cac
MD5 | 9623d2bab671fa0066fde6dfa2ee95c4
BLAKE2b-256 | 2b146a600b81e5adda3252d82123c2f8e2e6d0bb78dfc392f867fbe50830745c
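The SHA256 digests listed above can be used to check a manually downloaded file. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_published_hash` are made up for this example, and it assumes the downloaded file keeps its original file name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "MechanicalSoup-1.2.0.tar.gz":
        "f8593ab4474a82ceb6b7ddb3e35a8c69f8455a06adafd2b5258c1d852158d760",
    "MechanicalSoup-1.2.0-py3-none-any.whl":
        "e6161d5c715a7bc6fbdb2f1550c9df98e339ee67fed825d695a89da7ba461cac",
}


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """Return True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

Calling `matches_published_hash("MechanicalSoup-1.2.0.tar.gz")` on a downloaded copy returns `False` if the file is corrupted, altered, or has an unrecognized name.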