Trio driver for Chrome DevTools Protocol (CDP)

Development Status
- 3 - Alpha
Intended Audience
- Developers
License
- OSI Approved :: MIT License
Programming Language
- Python :: 3.7
Topic
- Software Development :: Libraries

Project description

Trio CDP

This Python library performs remote control of any web browser that implements the Chrome DevTools Protocol. It is built using the type wrappers in python-chrome-devtools-protocol and implements I/O using Trio. This library handles the WebSocket negotiation and session management, allowing you to transparently multiplex commands, responses, and events over a single connection.

The following example demonstrates the library's main features.

async with open_cdp_connection(cdp_url) as conn:
    # Find the first available target (usually a browser tab).
    targets = await conn.execute(target.get_targets())
    target_id = targets[0].target_id

    # Create a new session with the chosen target.
    session = await conn.open_session(target_id)

    # Navigate to a website.
    await session.execute(page.enable())
    async with session.wait_for(page.LoadEventFired):
        await session.execute(page.navigate(target_url))

    # Extract the page title.
    root_node = await session.execute(dom.get_document())
    title_node_id = await session.execute(
        dom.query_selector(root_node.node_id, 'title'))
    html = await session.execute(dom.get_outer_html(title_node_id))
    print(html)

We'll go through this example bit by bit. First, it starts with a context manager:

async with open_cdp_connection(cdp_url) as conn:

This context manager opens a connection to the browser when the block is entered and closes the connection automatically when the block exits. Now we have a connection to the browser, but the browser has multiple targets that can be operated independently. For example, each browser tab is a separate target. In order to interact with one of them, we have to create a session for it.

targets = await conn.execute(target.get_targets())
target_id = targets[0].target_id

The first line here executes the get_targets() command in the browser. Note the form of the command: await conn.execute(...) will send a command to the browser, parse its response, and return a value (if any). The command is one of the methods in the PyCDP package. Trio CDP multiplexes commands and responses on a single connection, so we can send commands concurrently if we want, and the responses will be routed back to the correct task.
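To make the multiplexing idea concrete, here is a small synchronous sketch of routing responses by command ID. It is a toy, not Trio CDP's implementation: the ResponseRouter class and the task names are hypothetical, and the only fact it relies on is that each CDP command carries an "id" that the browser echoes back in the matching response.

```python
class ResponseRouter:
    """Toy router that matches responses to commands by their 'id' field.

    Hypothetical illustration only; this is not Trio CDP's internal code.
    """

    def __init__(self):
        self._next_id = 0
        self._pending = {}   # command id -> name of the task waiting on it
        self.delivered = {}  # task name -> result that task received

    def send(self, task_name, method):
        """Record a command as in flight and return the message to transmit."""
        self._next_id += 1
        self._pending[self._next_id] = task_name
        return {"id": self._next_id, "method": method}

    def receive(self, response):
        """Hand a response to whichever task sent the matching command."""
        task_name = self._pending.pop(response["id"])
        self.delivered[task_name] = response["result"]


router = ResponseRouter()
first = router.send("task-a", "Target.getTargets")
second = router.send("task-b", "Browser.getVersion")

# Responses arrive in the opposite order to the commands,
# yet each one still reaches the task that asked for it.
router.receive({"id": second["id"], "result": {"product": "Chrome"}})
router.receive({"id": first["id"], "result": {"targetInfos": []}})

print(router.delivered)
```

Because the lookup is keyed on the ID rather than on arrival order, two tasks can share one connection without seeing each other's responses.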

In this case, the command is target.get_targets(), which returns a list of TargetInfo objects. We grab the first object and extract its target_id.

session = await conn.open_session(target_id)

In order to connect to a target, we open a session based on the target ID.
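As a rough picture of what a session means on the wire, the sketch below builds the JSON text for a browser-level command and a session-scoped command. It assumes the protocol's flat session mode, in which a session-scoped message carries a top-level "sessionId" field. The build_command helper and the "SESSION-1" value are hypothetical, and Trio CDP may construct its messages differently.

```python
import itertools
import json

# Each command on a connection needs a unique integer ID.
_command_ids = itertools.count(1)


def build_command(method, params=None, session_id=None):
    """Serialize one CDP command as JSON text (hypothetical helper)."""
    message = {"id": next(_command_ids), "method": method}
    if params:
        message["params"] = params
    if session_id is not None:
        # Assumption: flat session mode, where the session is named
        # by a top-level "sessionId" field on the message.
        message["sessionId"] = session_id
    return json.dumps(message)


# A browser-level command has no session attached.
browser_cmd = build_command("Target.getTargets")

# A command for one tab names the session it belongs to.
tab_cmd = build_command(
    "Page.navigate",
    {"url": "https://example.com"},
    session_id="SESSION-1",  # placeholder value for illustration
)

print(browser_cmd)
print(tab_cmd)
```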

await session.execute(page.enable())
async with session.wait_for(page.LoadEventFired):
    await session.execute(page.navigate(target_url))

Here we use the session (remember, it corresponds to a tab in the browser) to navigate to the target URL. Just like the connection object, the session object has an execute(...) method that sends a command to the target, parses the response, and returns a value (if any).

This snippet also introduces another concept: events. When we ask the browser to navigate to a URL, it acknowledges our request with a response, then starts the navigation process. How do we know when the page is actually loaded, though? Easy: the browser can send us an event!

We first have to enable page-level events by calling page.enable(). Then we use session.wait_for(...) to wait for an event of the desired type. In this example, the script will suspend until it receives a page.LoadEventFired event. (After this block finishes executing, you can run page.disable() to turn off page-level events if you want to save some bandwidth and processing power, or you can use the context manager async with session.page_enable(): ... to automatically enable page-level events just for a specific block.)

Note that we wait for the event inside an async with block, and we do this before executing the command that will trigger this event. This order of operations may be surprising, but it avoids race conditions. If we executed a command and then tried to listen for an event, the browser might fire the event very quickly before we have had a chance to set up our event listener, and then we would miss it! The async with block sets up the listener before we run the command, so that no matter how fast the event fires, we are guaranteed to catch it.
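The ordering argument can be seen in a toy model with no browser involved. The sketch below assumes an event bus that delivers events only to listeners registered at the moment the event fires. The EventBus class and the navigate function are hypothetical stand-ins, not Trio CDP APIs.

```python
class EventBus:
    """Toy event bus: events go only to listeners that already exist."""

    def __init__(self):
        self._listeners = {}  # event name -> list of listener inboxes

    def listen(self, event_name):
        """Register a listener and return the inbox it will fill."""
        inbox = []
        self._listeners.setdefault(event_name, []).append(inbox)
        return inbox

    def emit(self, event_name, payload):
        """Deliver an event; nothing is buffered for later listeners."""
        for inbox in self._listeners.get(event_name, []):
            inbox.append(payload)


def navigate(bus):
    """Stand-in for a command whose event fires immediately."""
    bus.emit("Page.loadEventFired", {"timestamp": 1.0})


# Listener first, then command: the event is caught.
bus = EventBus()
early = bus.listen("Page.loadEventFired")
navigate(bus)

# Command first, then listener: the event has already gone by.
bus = EventBus()
navigate(bus)
late = bus.listen("Page.loadEventFired")

print(early)  # one event
print(late)   # empty
```

The second half of the sketch shows the failure mode the async with block is designed to rule out.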

root_node = await session.execute(dom.get_document())
title_node_id = await session.execute(
    dom.query_selector(root_node.node_id, 'title'))
html = await session.execute(dom.get_outer_html(title_node_id))
print(html)

The last part of the script navigates the DOM to find the <title> element. First we get the document's root node, then we query for a CSS selector, then we get the outer HTML of the node. This snippet shows some new APIs, but the mechanics of sending commands and getting responses are the same as the previous snippets.
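Since get_outer_html returns the element's markup, the printed value includes the <title> tags themselves. If only the text is wanted, one option is to post-process the string with the standard library. The sketch below is one way to do that; the TitleText class, the title_text helper, and the sample string are illustrative and not part of Trio CDP or PyCDP.

```python
from html.parser import HTMLParser


class TitleText(HTMLParser):
    """Collect the text found inside a <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.parts.append(data)


def title_text(outer_html):
    """Return the text content of a <title>...</title> string."""
    parser = TitleText()
    parser.feed(outer_html)
    parser.close()
    return "".join(parser.parts).strip()


# Sample value standing in for what the script prints.
sample = "<title>Example Domain</title>"
print(title_text(sample))
```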

A more complete version of this example can be found in examples/get_title.py. There is also a screenshot example in examples/screenshot.py. The unit tests in test/ also provide more examples.

To run the examples, you need a Chrome or Chromium binary on your system. One way to get a pinned Chromium snapshot is shown below.

FOR MAC

Terminal 1

This downloads a specific Chromium snapshot and runs it with remote debugging enabled on port 9000.

wget -O chrome-mac.zip 'https://www.googleapis.com/download/storage/v1/b/chromium-browser-snapshots/o/Mac%2F678035%2Fchrome-mac.zip?generation=1563322360871926&alt=media'
unzip chrome-mac.zip && rm chrome-mac.zip
./chrome-mac/Chromium.app/Contents/MacOS/Chromium --remote-debugging-port=9000
> DevTools listening on ws://127.0.0.1:9000/devtools/browser/<DEV_SESSION_GUID>

Terminal 2

This runs the example script against the browser started in Terminal 1. Replace <DEV_SESSION_GUID> with the value printed there.

python examples/get_title.py ws://127.0.0.1:9000/devtools/browser/<DEV_SESSION_GUID> https://hyperiongray.com

FOR LINUX

Terminal 1

This downloads a specific Chromium snapshot and runs it with remote debugging enabled on port 9000.

wget https://storage.googleapis.com/chromium-browser-snapshots/Linux_x64/678025/chrome-linux.zip
unzip chrome-linux.zip && rm chrome-linux.zip
./chrome-linux/chrome --remote-debugging-port=9000 
> DevTools listening on ws://127.0.0.1:9000/devtools/browser/<DEV_SESSION_GUID>

Terminal 2

This runs the example script against the browser started in Terminal 1. Replace <DEV_SESSION_GUID> with the value printed there.

python examples/get_title.py ws://127.0.0.1:9000/devtools/browser/<DEV_SESSION_GUID> https://hyperiongray.com
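Instead of copying the WebSocket URL from the terminal by hand, it can usually be read from the browser's HTTP debugging endpoint. When started with --remote-debugging-port, Chrome serves a small JSON document at /json/version that includes a webSocketDebuggerUrl field. The sketch below assumes the port 9000 used above. The helper names are hypothetical, and the sample payload is made up for illustration.

```python
import json
from urllib.request import urlopen


def parse_ws_url(version_json):
    """Extract the browser WebSocket URL from /json/version output."""
    return json.loads(version_json)["webSocketDebuggerUrl"]


def fetch_ws_url(host="127.0.0.1", port=9000):
    """Ask a running browser for its WebSocket debugger URL.

    Requires a browser already listening on the given port.
    """
    with urlopen(f"http://{host}:{port}/json/version") as response:
        return parse_ws_url(response.read().decode("utf-8"))


# Made-up payload so the parsing step can be tried without a browser.
sample = json.dumps({
    "Browser": "Chrome/77.0",
    "webSocketDebuggerUrl": "ws://127.0.0.1:9000/devtools/browser/abc",
})
print(parse_ws_url(sample))
```

With a browser running, the value returned by fetch_ws_url() can be passed to the example script in place of the hand-copied URL.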

Release history

- 0.6.0: Apr 14, 2020
- 0.5.0: Mar 13, 2020
- 0.4.0: Feb 13, 2020 (this version)
- 0.3.0: Dec 4, 2019
- 0.2.0: Sep 10, 2019
- 0.1.0: Aug 30, 2019

Download files

Source Distribution

trio-chrome-devtools-protocol-0.4.0.tar.gz (12.0 kB), uploaded Feb 13, 2020

Hashes for trio-chrome-devtools-protocol-0.4.0.tar.gz
Algorithm	Hash digest
SHA256	`4d953fe3d40d977653812b605987238a75c8c7ddea44c51d897696488275f716`
MD5	`40454abd681f33d0fc5bc848e82cf698`
BLAKE2b-256	`a81649d5e1d3c4d12d8c6a4ee920eea368508308637f692baea768724af68a8d`