The game of Hex implemented for reinforcement learning in the OpenAI gym framework. Optimized for rollout speed.

Project description

MiniHex

An OpenAI gym environment that allows an agent to play the game of Hex. The aim of this environment is to be lean and have fast rollouts, as well as a variable board size. With random actions it currently achieves ~340 games per second on an 11x11 grid (the original size) on a single CPU (Intel Xeon E3-1230 @ 3.3GHz).
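To reproduce a games-per-second figure on your own hardware, a small timing helper is usually enough. The sketch below is not part of minihex: `games_per_second` is a hypothetical helper, and `play_one_game` stands for any zero-argument callable that plays one full game (for instance, a function wrapping a reset-and-step loop).

```python
import time


def games_per_second(play_one_game, n_games=100):
    """Call `play_one_game` n_games times and return the measured rate.

    `play_one_game` is a hypothetical zero-argument callable that plays
    a single complete game; it is not provided by minihex.
    """
    start = time.perf_counter()
    for _ in range(n_games):
        play_one_game()
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return n_games / max(elapsed, 1e-9)
```

Measured rates depend on board size, the policies involved, and the machine, so numbers from this helper are only comparable when those are held fixed.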

Hex is a two-player game and has to be converted into a "single agent environment" to fit into the gym framework. We achieve this by requiring an opponent_policy at creation time: a function that takes a board state as input and outputs an action. Each move of the agent is immediately followed by a move of the opponent.
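As an illustration of the idea, here is a minimal, self-contained sketch of an opponent policy and of the "agent moves, opponent replies" turn order. It does not use minihex and makes several assumptions: the board is a square list of lists, `0` marks an empty cell, actions are flat indices (`row * size + col`), and the policy is called as `(board, player, info)`, mirroring how `minihex.random_policy` is called in the example below. The real board encoding may differ, and win detection is left out entirely.

```python
EMPTY = 0  # assumed marker for an unoccupied cell (hypothetical encoding)


def empty_cells(board):
    """Return the flat indices (row * size + col) of all empty cells."""
    size = len(board)
    return [r * size + c
            for r in range(size)
            for c in range(size)
            if board[r][c] == EMPTY]


def first_free_policy(board, player, info):
    """Toy opponent policy: always pick the lowest-index empty cell."""
    return empty_cells(board)[0]


def play_turn(board, agent_action, opponent_policy, agent=1, opponent=2):
    """Place the agent's stone, then immediately let the opponent reply."""
    size = len(board)
    board[agent_action // size][agent_action % size] = agent
    if empty_cells(board):  # the opponent only moves if a cell is left
        reply = opponent_policy(board, opponent, {})
        board[reply // size][reply % size] = opponent
    return board
```

For example, on an empty 3x3 board, `play_turn(board, 4, first_free_policy)` places the agent's stone in the centre and the opponent's stone in the top-left corner.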

Installation

pip install minihex

Editable installation (if you wish to tweak the environment):

git clone https://github.com/FirefoxMetzger/minihex.git
pip install -e minihex/

Minimal Working Example

import gym
import minihex


env = gym.make("hex-v0",
               opponent_policy=minihex.random_policy,
               board_size=11)

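# reset() returns the initial state together with an info object
state, info = env.reset()
done = False
while not done:
    # the state is a (board, player) pair
    board, player = state
    # the agent plays random moves here; replace this with your own policy
    action = minihex.random_policy(board, player, info)
    # step() applies the agent's move, followed by the opponent's reply
    state, reward, done, info = env.step(action)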

env.render()

if reward == -1:
    print("Player (Black) Lost")
elif reward == 1:
    print("Player (Black) Won")
else:
    print("Draw")

Debug Mode

If the environment is instantiated with debug=True, each step checks whether the provided action is valid and raises an IndexError if it is not. This is very useful while writing agents, e.g., if the agent maintains its own belief over the environment and may request invalid actions. When evaluating or running at scale, however, this check can cause a significant slowdown. Hence, it is only performed if explicitly requested.
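The kind of check that debug mode performs can be pictured with the sketch below. This is not minihex's implementation: `check_action` is a hypothetical helper, and it assumes a square list-of-lists board where `0` marks an empty cell and actions are flat indices (`row * size + col`).

```python
def check_action(board, action, empty=0):
    """Raise IndexError unless `action` targets an empty cell on the board.

    Hypothetical helper for illustration; the board encoding is assumed.
    """
    size = len(board)
    if not 0 <= action < size * size:
        raise IndexError(f"action {action} is outside the {size}x{size} board")
    row, col = divmod(action, size)
    if board[row][col] != empty:
        raise IndexError(f"cell ({row}, {col}) is already occupied")
    return row, col
```

Running a check like this on every step costs time in each rollout, which is consistent with making it opt-in.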

Limitations

Currently the environment is missing the following features to reach version 1.0:

  • The swap action that is used to mitigate the disadvantage of playing second
  • An RGB rendering mode
  • Publishing the environment on PyPI
  • A surrender action

Bugs and Contributing

If you encounter problems, check the GitHub issue page or open a new issue there.
