PostgreSQL high-availability orchestrator and CLI
Project description
Patroni: A Template for PostgreSQL HA with ZooKeeper or etcd
Patroni was previously known as Governor.
There are many ways to run high availability with PostgreSQL; here we present a template for you to create your own custom-fit high-availability solution using Python and a distributed configuration store (such as ZooKeeper or etcd) for maximum accessibility.
Getting Started
To get started, do the following from different terminals:
> etcd --data-dir=data/etcd
> ./patroni.py postgres0.yml
> ./patroni.py postgres1.yml
From there, you will see a high-availability cluster start up. Test different settings in the YAML files to see how behavior changes. Kill some of the different components to see how the system behaves.
Add more postgres*.yml files to create an even larger cluster.
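Adding a third member can be sketched as a hypothetical postgres2.yml. Every name, port, and path below is an illustrative assumption chosen to avoid colliding with the two shipped examples; use postgres0.yml as the authoritative reference:

```yaml
# hypothetical postgres2.yml -- all names, ports, and paths are illustrative
ttl: 30
loop_wait: 10
restapi:
  listen: 127.0.0.1:8010
  connect_address: 127.0.0.1:8010
etcd:
  scope: batman            # must match the other members' scope
  ttl: 30
  host: 127.0.0.1:4001
postgresql:
  name: postgresql2        # must be unique within the cluster
  listen: 127.0.0.1:5434
  connect_address: 127.0.0.1:5434
  data_dir: data/postgresql2
  maximum_lag_on_failover: 1048576
  replication:
    username: replicator
    password: rep-pass
    network: 127.0.0.1/32
```

All members must share the same scope so they join the same cluster; each needs a unique name, port, and data directory.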
We provide a haproxy configuration, which will give your application a single endpoint for connecting to the cluster’s leader. To configure, run:
> haproxy -f haproxy.cfg
> psql --host 127.0.0.1 --port 5000 postgres
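The shipped haproxy.cfg is the authoritative version; purely as an illustration of the idea, a minimal configuration might look like the following sketch, which assumes Patroni's REST APIs answer health checks on ports 8008 and 8009:

```
global
    maxconn 100

defaults
    mode tcp
    timeout connect 4s
    timeout client 30m
    timeout server 30m

listen postgres
    bind *:5000
    option httpchk                      # only the current leader answers with HTTP 200
    server postgresql0 127.0.0.1:5432 check port 8008
    server postgresql1 127.0.0.1:5433 check port 8009
```

The health check against Patroni's REST API is what lets haproxy route connections to the leader only.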
How Patroni works
For a diagram of the high availability decision loop, see the included PDF: postgres-ha.pdf
YAML Configuration
For an example file, see postgres0.yml. Below is an explanation of settings:
ttl: the TTL to acquire the leader lock, in seconds. Think of it as the length of time before the automatic failover process is initiated.
loop_wait: the number of seconds the loop will sleep
restapi
listen: IP address and port that Patroni listens on to provide health-check information for haproxy.
connect_address: IP address and port through which the REST API is accessible.
etcd
scope: the relative path used on etcd’s HTTP API for this deployment, so you can run multiple HA deployments from a single etcd cluster
ttl: the TTL to acquire the leader lock, in seconds. Think of it as the length of time before the automatic failover process is initiated.
host: the host:port for the etcd endpoint
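Put together, the etcd section might look like this sketch (all values are illustrative):

```yaml
etcd:
  scope: batman          # illustrative deployment name
  ttl: 30                # seconds before automatic failover can begin
  host: 127.0.0.1:4001   # etcd endpoint
```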
zookeeper
scope: the path prefix used in ZooKeeper for this deployment, so you can run multiple HA deployments from a single ZooKeeper cluster
session_timeout: the TTL to acquire the leader lock. Think of it as the length of time before the automatic failover process is initiated.
reconnect_timeout: how long to try to reconnect to ZooKeeper after a connection loss. After this timeout, we assume we no longer hold the lock and restart in read-only mode.
hosts: list of ZooKeeper cluster members in the format: [ 'host1:port1', 'host2:port2', 'etc...' ]
exhibitor: if you are running your ZooKeeper cluster under Exhibitor supervision, the following section may be relevant to you
poll_interval: how often the list of ZooKeeper and Exhibitor nodes should be updated from Exhibitor
port: Exhibitor port
hosts: initial list of Exhibitor (ZooKeeper) nodes in the format: [ 'host1', 'host2', 'etc...' ]. This list will be updated automatically when the Exhibitor (ZooKeeper) cluster topology changes.
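Assembled, a zookeeper section might look like this sketch (all values are illustrative assumptions):

```yaml
zookeeper:
  scope: batman
  session_timeout: 30
  reconnect_timeout: 20
  hosts: ['127.0.0.1:2181']
  exhibitor:             # optional; when present, the node list is refreshed from Exhibitor
    poll_interval: 300
    port: 8181
    hosts: ['127.0.0.1']
```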
postgresql
name: the name of the Postgres host, must be unique for the cluster
listen: IP address and port on which Postgres listens. Must be accessible from other nodes in the cluster if using streaming replication.
connect_address: IP address and port through which Postgres is accessible from other nodes and applications.
data_dir: file path to initialize and store Postgres data files
maximum_lag_on_failover: the maximum number of bytes a follower may lag behind the leader before it is no longer eligible to become the leader
use_slots: whether or not to use replication slots. Must be False for PostgreSQL 9.3, and you should also comment out max_replication_slots, since that parameter does not exist before 9.4.
pg_hba: list of lines which should be added to pg_hba.conf
- host all all 0.0.0.0/0 md5
replication
username: replication username, user will be created during initialization
password: replication password, user will be created during initialization
network: network setting for replication in pg_hba.conf
callbacks: callback scripts to run on certain actions. Patroni will pass the current action, role, and cluster name. See scripts/aws.py for an example of how to write them.
on_start: a script to run when the cluster starts
on_stop: a script to run when the cluster stops
on_restart: a script to run when the cluster restarts
on_reload: a script to run when configuration reload is triggered
on_role_change: a script to run when the cluster is being promoted or demoted
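A callback receives the action, role, and cluster name. The following minimal sketch reacts to a promotion; the exact argument order is an assumption inferred from the description above, and scripts/aws.py remains the authoritative example:

```python
#!/usr/bin/env python
# Minimal callback sketch. Patroni is described as passing the current
# action, role, and cluster name; the argument order here is an assumption.
import sys


def handle(action, role, cluster):
    # React only to promotions; everything else is a no-op in this sketch.
    if action == 'on_role_change' and role == 'master':
        return 'promoted in cluster {0}'.format(cluster)
    return 'no-op'


if __name__ == '__main__':
    print(handle(*sys.argv[1:4]))
```

Point on_role_change at such a script (made executable) to, for example, move a virtual IP when a member is promoted.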
superuser
password: password for the postgres user; it will be set during initialization
admin:
username: admin username, user will be created during initialization. It will have CREATEDB and CREATEROLE privileges
password: admin password, user will be created during initialization.
recovery_conf: additional configuration settings written to recovery.conf when configuring follower
parameters: list of configuration settings for Postgres. Many of these are required for replication to work.
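Assembled, a postgresql section might look like the following sketch; every value is illustrative, and postgres0.yml remains the authoritative example:

```yaml
postgresql:
  name: postgresql0
  listen: 127.0.0.1:5432
  connect_address: 127.0.0.1:5432
  data_dir: data/postgresql0
  maximum_lag_on_failover: 1048576   # bytes
  use_slots: True                    # requires PostgreSQL 9.4+
  pg_hba:
    - host all all 0.0.0.0/0 md5
  replication:
    username: replicator
    password: rep-pass
    network: 127.0.0.1/32
  callbacks:
    on_role_change: /path/to/callback.sh
  superuser:
    password: secret-pg-pass
  admin:
    username: admin
    password: secret-admin-pass
  parameters:
    wal_level: hot_standby
    hot_standby: "on"
    max_wal_senders: 5
    max_replication_slots: 5
```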
Replication choices
Patroni uses Postgres’ streaming replication. By default, this replication is asynchronous. For more information, see the Postgres documentation on streaming replication.
Patroni’s asynchronous replication configuration allows a maximum_lag_on_failover setting. This setting ensures that failover will not occur if a follower is more than a certain number of bytes behind the leader. Increase or decrease it based on your business requirements.
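To make the byte-lag comparison concrete, here is a hypothetical sketch (not Patroni's actual code) that converts Postgres LSNs of the form 'X/Y' to byte positions and applies a maximum_lag_on_failover threshold:

```python
# Hypothetical illustration of the maximum_lag_on_failover check;
# this is not Patroni's implementation.

def lsn_to_bytes(lsn):
    """Convert a Postgres LSN like '0/3000060' to an absolute byte position."""
    high, low = lsn.split('/')
    return (int(high, 16) << 32) + int(low, 16)


def eligible_for_promotion(leader_lsn, follower_lsn, maximum_lag_on_failover=1048576):
    """A follower is eligible only if it lags the leader by at most the limit."""
    lag = lsn_to_bytes(leader_lsn) - lsn_to_bytes(follower_lsn)
    return lag <= maximum_lag_on_failover


# A follower exactly at the 1 MiB default limit is still eligible; 3 MiB behind is not.
print(eligible_for_promotion('0/200000', '0/100000'))  # lag 0x100000 = 1048576 bytes
print(eligible_for_promotion('0/400000', '0/100000'))  # lag 0x300000 = 3145728 bytes
```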
When asynchronous replication is not the best fit for your use case, investigate how Postgres’s synchronous replication works. Synchronous replication ensures consistency across a cluster by confirming that writes are written to a secondary before returning success to the connecting client. The cost of synchronous replication is reduced write throughput, which is largely determined by network performance. In hosted datacenter environments (like AWS, Rackspace, or any network you do not control), synchronous replication significantly increases the variability of write performance. If followers become inaccessible from the leader, the leader effectively becomes read-only.
To enable a simple synchronous replication test, add the following lines to the parameters section of your YAML configuration files.
synchronous_commit: "on"
synchronous_standby_names: "*"
When using synchronous replication, run at least three Postgres data nodes to ensure write availability if one host fails.
Choosing a replication scheme depends on your business requirements. Investigate both asynchronous and synchronous replication, as well as other HA solutions, to determine which is best for you.
Applications should not use superusers
When connecting from an application, always use a non-superuser. Patroni requires access to the database to function properly. If the application connects as a superuser, it can potentially exhaust the entire connection pool, including the connections reserved for superusers by the superuser_reserved_connections setting. If Patroni cannot access the primary because the connection pool is full, its behavior will be undesirable.
Requirements on a Mac
Run the following on a Mac to install requirements:
brew install postgresql etcd haproxy libyaml python
pip install psycopg2 pyyaml
Notice
There are many different ways to do HA with PostgreSQL, see the PostgreSQL documentation for a complete list.
We call this project a “template” because it is far from a one-size-fits-all, plug-and-play replication system. It has its own caveats. Use wisely.
Project details
Release history
Download files
Download the file for your platform.
Source Distribution
Built Distribution
File details
Details for the file patroni-0.2.tar.gz
File metadata
- Download URL: patroni-0.2.tar.gz
- Upload date:
- Size: 28.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 490afa4cbce360da9a2cd2fb94218c451d1803bc0f82788dc47d5a624b97b005
MD5 | 1ff9f96b3eb1d07426f29ac98132f65b
BLAKE2b-256 | 58236828de9cacc0efe203ec95e4221923aa55bed07fa40fe0dd292734ba7c21
File details
Details for the file patroni-0.2-py3-none-any.whl
File metadata
- Download URL: patroni-0.2-py3-none-any.whl
- Upload date:
- Size: 32.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 904408dae67bb2eb4c4b4fdc13237748f6550e74cded5f89c95f76a67724f805
MD5 | 2919f7ed94af2f82b972cac5290e42df
BLAKE2b-256 | a3c2fd0a3f0802e6021c8ca4a94bc7bb28c05de2735ddde1666ce9d8a0c7a235