
BrowserGym: a gym environment for web task automation in the Chromium browser


BrowserGym: a Gym Environment for Web Task Automation

[Setup][Usage][Demo]

This package provides browsergym, a gym environment for web task automation in the Chromium browser.

https://github.com/ServiceNow/BrowserGym/assets/26232819/e0bfc788-cc8e-44f1-b8c3-0d1114108b85

Example of a GPT-4V agent executing open-ended tasks (top row, interactive chat), as well as WebArena and WorkArena tasks (bottom row).

BrowserGym includes the following benchmarks by default: MiniWoB++, WebArena, and WorkArena.

Designing new web benchmarks with BrowserGym is easy: it only requires inheriting from the AbstractBrowserTask class.

Setup

To install browsergym, you can either install a single benchmark package (browsergym-miniwob, browsergym-webarena, or browsergym-workarena), or install browsergym itself, which includes all of them by default.

pip install browsergym

Then, as a required step, set up Playwright by running

playwright install

Finally, each benchmark has its own specific setup, which requires additional steps.

Development setup

To install browsergym locally for development, use the following commands:

git clone https://github.com/ServiceNow/BrowserGym.git

cd ~/PATH/TO/REPOSITORY/BrowserGym

make install

Usage

Open-ended task example

Boilerplate code to run an agent on an interactive, open-ended task:

import gymnasium as gym
import browsergym.core  # register the openended task as a gym environment

env = gym.make(
    "browsergym/openended", task_kwargs={"start_url": "https://www.google.com/"}, wait_for_user_message=True
)
obs, info = env.reset()
done = False
while not done:
    action = ...  # implement your agent here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated  # stop once the episode ends
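The loop above follows the Gymnasium convention in which step returns a five-element tuple and the episode ends when either terminated or truncated is true. The following is a minimal sketch of that loop wrapped in a reusable function, with a step cap as a safety net. StubEnv, run_episode, and the placeholder policy are hypothetical names introduced here so the example runs without a browser; they are not part of BrowserGym.

```python
from typing import Any, Callable


class StubEnv:
    """Hypothetical stand-in that mimics the Gymnasium reset/step contract."""

    def __init__(self, goal_step: int = 3) -> None:
        self.goal_step = goal_step
        self.steps = 0

    def reset(self) -> tuple[dict, dict]:
        self.steps = 0
        return {"step": 0}, {}

    def step(self, action: Any) -> tuple[dict, float, bool, bool, dict]:
        self.steps += 1
        terminated = self.steps >= self.goal_step  # pretend the task is solved
        reward = 1.0 if terminated else 0.0
        return {"step": self.steps}, reward, terminated, False, {}


def run_episode(env: Any, policy: Callable[[dict], Any], max_steps: int = 100) -> tuple[float, int]:
    """Run one episode and return (total reward, number of steps taken)."""
    obs, info = env.reset()
    total_reward, n_steps = 0.0, 0
    done = False
    while not done and n_steps < max_steps:
        action = policy(obs)  # the agent decides what to do
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        n_steps += 1
        done = terminated or truncated
    return total_reward, n_steps


# Placeholder policy: a real agent would map the observation to an action.
total, steps = run_episode(StubEnv(goal_step=3), policy=lambda obs: None)
print(total, steps)
```

In principle, the same function should work with an environment created by gym.make, since it only relies on reset and step.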

MiniWoB++ task example

Boilerplate code to run an agent on a MiniWoB++ task:

import gymnasium as gym
import browsergym.miniwob  # register miniwob tasks as gym environments

env = gym.make("browsergym/miniwob.choose-list")
obs, info = env.reset()
done = False
while not done:
    action = ...  # implement your agent here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated  # stop once the episode ends

List of all the available MiniWoB++ environments

env_ids = [env_id for env_id in gym.envs.registry.keys() if env_id.startswith("browsergym/miniwob")]
print("\n".join(env_ids))

WebArena task example

Boilerplate code to run an agent on a WebArena task:

import gymnasium as gym
import browsergym.webarena  # register webarena tasks as gym environments

env = gym.make("browsergym/webarena.310")
obs, info = env.reset()
done = False
while not done:
    action = ...  # implement your agent here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated  # stop once the episode ends

List of all the available WebArena environments

env_ids = [env_id for env_id in gym.envs.registry.keys() if env_id.startswith("browsergym/webarena")]
print("\n".join(env_ids))

WorkArena task example

Boilerplate code to run an agent on a WorkArena task:

import gymnasium as gym
import browsergym.workarena  # register workarena tasks as gym environments

env = gym.make("browsergym/workarena.servicenow.order-ipad-pro")
obs, info = env.reset()
done = False
while not done:
    action = ...  # implement your agent here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated  # stop once the episode ends

List of all the available WorkArena environments

env_ids = [env_id for env_id in gym.envs.registry.keys() if env_id.startswith("browsergym/workarena")]
print("\n".join(env_ids))
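The three listing snippets differ only in the prefix they filter on, so they can be folded into one helper. The sketch below assumes the registry keys are plain strings. filter_env_ids is a hypothetical helper name, and the hard-coded list stands in for gym.envs.registry.keys() so the example runs without any benchmark installed.

```python
from typing import Iterable


def filter_env_ids(env_ids: Iterable[str], prefix: str) -> list[str]:
    """Return the sorted IDs that start with the given prefix."""
    return sorted(env_id for env_id in env_ids if env_id.startswith(prefix))


# Stand-in for gym.envs.registry.keys(); these IDs are the ones used in the examples above.
registered = [
    "browsergym/openended",
    "browsergym/miniwob.choose-list",
    "browsergym/webarena.310",
    "browsergym/workarena.servicenow.order-ipad-pro",
]

for prefix in ("browsergym/miniwob", "browsergym/webarena", "browsergym/workarena"):
    print(prefix, filter_env_ids(registered, prefix))
```

With the benchmark packages imported, passing gym.envs.registry.keys() in place of the hard-coded list should give the same listings as the individual snippets.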

Demo

If you want to experiment with an agent in BrowserGym, follow these steps:

cd demo-agent
conda env create -f environment.yml; conda activate demo-agent
# or simply use `pip install -r requirements.txt`
playwright install

Optional: Set your OPENAI_API_KEY if you want to use a GPT agent.
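A quick way to confirm that the key is visible to Python before launching the demo is sketched below. has_api_key is a hypothetical helper, not part of the demo code, and it only checks that the variable is set and non-blank, not that the key is valid.

```python
import os
from typing import Mapping


def has_api_key(environ: Mapping[str, str] = os.environ, name: str = "OPENAI_API_KEY") -> bool:
    """Return True if the variable is set to a non-blank value."""
    return bool(environ.get(name, "").strip())


if not has_api_key():
    print("OPENAI_API_KEY is not set; a GPT agent will likely fail to authenticate.")
```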

Launch the demo on the open web:

python run_demo.py --task_name openended --start_url https://www.google.com

You can customize the demo by changing model_name to your preferred LLM, toggling chain-of-thought with use_thinking, adding screenshots for your VLMs with use_screenshot, and much more!

Citing This Work

Please use the following BibTeX to cite our work:

@misc{workarena2024,
      title={WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?}, 
      author={Alexandre Drouin and Maxime Gasse and Massimo Caccia and Issam H. Laradji and Manuel Del Verme and Tom Marty and Léo Boisvert and Megh Thakkar and Quentin Cappart and David Vazquez and Nicolas Chapados and Alexandre Lacoste},
      year={2024},
      eprint={2403.07718},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
