Set Up a Highly Available PostgreSQL Release 14 Cluster on Rocky Linux 9

This guide will walk you through the steps to set up a fault-tolerant, scalable and highly available patroni cluster with postgres database release 14 on Rocky Linux 9. These steps can also be applied if you are running an earlier or later release of Rocky Linux, CentOS, Fedora, and RHEL in your environment.

The main purpose of this guide is to make PostgreSQL database highly available, and to avoid any single point of failure.

Patroni high availability cluster is comprised of the following components:

  • PostgreSQL database is a powerful, open source object-relational database system with over 35 years of active development that has earned it a strong reputation for reliability, feature robustness, and performance.
  • Patroni is a cluster manager used to customize and automate deployment and maintenance of PostgreSQL HA (High Availability) clusters.
  • PgBouncer is an open-source, lightweight, single-binary connection pooler for PostgreSQL. PgBouncer maintains a pool of connections for each unique user, database pair. It’s typically configured to hand out one of these connections to a new incoming client connection, and return it back in to the pool when the client disconnects.
  • etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. We will use etcd to store the state of the PostgreSQL cluster.
  • HAProxy is a free and open source software that provides a high availability load balancer and reverse proxy for TCP and HTTP-based applications that spreads requests across multiple servers.
  • Keepalived implements a set of health checkers to dynamically and adaptively maintain and manage load balanced server pools according to their health. When designing load balanced topologies, it is important to account for the availability of the load balancer itself as well as the real servers behind it. 

All of these patroni cluster components need to be installed on Linux servers. Patroni software shares the server with the PostgreSQL database.

 

Patroni Cluster Hardware Requirements

The following sections will help you to understand the least minimum system requirements for the various components of the patroni cluster.

Patroni/Postgres system requirements:
2 Core CPU
4 GB RAM
80 GB HDD

For extensive usage, it is recommended to assign 100 GB to PGDATA directory, or even more disk space according to your data retention policy.

ETCD system requirements:
2 Core CPU
4 GB RAM
80 GB  SSD

Since etcd is iops heavy, and writes data to disk, its performance entirely depends on disk performance. For this particular reason, it is highly recommended to use dedicated SSD storage to achieve best performance.

PgBouncer system requirements:
2 Core CPU
4 GB RAM
20 GB HDD

PgBouncer is just a connection pooler for postgres database, and it doesn't require much disk space.

HAProxy system requirements:
2 Core CPU
4 GB RAM
80 GB HDD

The above HAProxy system requirements is given based on an average of 1000 connections per second. For a larger number of connections, the memory and CPU have to be increased in the ratio of 1:2 (memory in GB = 2 x number of CPU cores). The CPU cores have to be as fast as possible for better HAProxy  performance.

 

Number of nodes requirements for patroni cluster

It is recommended to have an odd number of nodes in a patroni cluster. An odd-size cluster tolerates the same number of failures as an even-size cluster but with fewer nodes. The difference can be seen by comparing even and odd sized clusters:

Cluster SizeMajorityFailure Tolerance
110
220
321
431
532
642
743
853
954

The same odd number of nodes recommendation goes for etcd cluster as well. Safely, you can say 3-nodes cluster is better, but 5-nodes cluster is good to go.

 

Production environment for patroni cluster

You can take an example of the following recommended set up to build a highly available patroni cluster for your production use.

Solution 1:

  • Take 3-nodes for patroni cluster - install (postgres, patroni, pgbouncer, haproxy, keepalived) on these three machines.
  • Take 3-nodes for etcd cluster - install etcd on these three machines.

This will become a 6-nodes patroni cluster, which you can scale up in future as well by adding more nodes into your patroni and etcd cluster to meet your production needs.

Solution 2:

  • Take 3-nodes for patroni cluster - install (patroni+postgres) on these three machines.
  • Take 3-nodes for etcd cluster - install only (etcd) on these three machines.
  • Take 2-nodes for HAProxy - install (haproxy+keepalived+pgbouncer)on these two machines.

This will become a 8-nodes patroni cluster which you can scale up in future as well by adding more nodes into your patroni and etcd cluster to meet your production needs.

 

Basic environment for patroni cluster

For the demonstration purpose, we will start with the basic environment to set up a 3-node patroni cluster on three separate virtual machines:

HOSTNAMEIP ADDRESSSERVICESSHARED IP
patroni1192.168.10.1
Patroni, PostgreSQL, PgBouncer, Etcd, HAProxy, Keepalived
192.168.10.200
patroni2192.168.10.2
patroni3192.168.10.3

We will install all the patroni cluster components on these three virtual machines.

Now that you have understood patroni cluster components and its requirements, so lets begin with the following steps to build a highly available patroni cluster with postgres database version 14 on Rocky Linux release 9.


Prepare your Linux server to run patroni cluster

Log in to your Linux server using a non-root user with sudo privileges and perform the following steps.
 
Set correct timezone on each node:
 
sudo timedatectl set-timezone Asia/Karachi
Set hostname on each node:

sudo hostnamectl set-hostname patroni1
Edit /etc/hosts file on each node:
 
sudo nano /etc/hosts
Map IP Addresses against the each node's name:
 
192.168.10.1 patroni1
192.168.10.2 patroni2
192.168.10.3 patroni3
Save and close the editor when you are finished.

Make sure you repeat the same on each node before proceeding to next.

 

Disable SELINUX

If you are familiar with SELINUX configuration, you should keep it as is. If you do not have much expertise, it is recommended to change SELINUX=enforcing to SELINUX=disabled by editing /etc/selinux/config file.
 
Edit /etc/selinux/config file with any of your favorite text editor:
 
sudo nano /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled

SELINUX=disabled
Save and close the editor when you are finished.

Make sure you repeat the same on each node before proceeding to next. When you are finished, reboot each node to make the changes effect:
 
sudo shutdown -r now

Configure Firewalld

The ports required for operating PostgreSQL HA cluster using (patroni, pgbouncer, etcd, haproxy, keepalived) are the following:

  • 5432 Postgres database standard port.
  • 6432 PgBouncer standard port.
  • 8008 patroni rest api port required by HAProxy to check the nodes status.
  • 2379 etcd client port required by any client including patroni to communicate with etcd cluster.
  • 2380 etcd peer urls port required by the etcd cluster members communication.
  • 5000 HAProxy front-end listening port, required to establish connection to the back-end masater database server via pgbouncer port 6432.
  • 5001 HAProxy front-end listening port, required to establish connection to the back-end replica database servers via pgbouncer port 6432
  • 7000 HAProxy stats dashboard, required to access HAProxy web interface using HTTP.

You can allow these required ports from firewalld using the following command:

sudo firewall-cmd --zone=public --add-port=5432/tcp --permanent
sudo firewall-cmd --zone=public --add-port=6432/tcp --permanent
sudo firewall-cmd --zone=public --add-port=8008/tcp --permanent
sudo firewall-cmd --zone=public --add-port=2379/tcp --permanent
sudo firewall-cmd --zone=public --add-port=2380/tcp --permanent
sudo firewall-cmd --permanent --zone=public --add-service=http
sudo firewall-cmd --zone=public --add-port=5000/tcp --permanent
sudo firewall-cmd --zone=public --add-port=5001/tcp --permanent
sudo firewall-cmd --zone=public --add-port=7000/tcp --permanent
sudo firewall-cmd --add-rich-rule='rule protocol value="vrrp" accept' --permanent
sudo firewall-cmd --reload
Make sure you repeat the same on each node before proceeding to next.   
 

Install Required Repository

Type below command to install EPEL repo on your Rocky Linux servers:
 
sudo dnf install -y epel-release

sudo dnf install -y yum-utils
Make sure you repeat the same on each node before proceeding to next.

Install PostgreSQL

Type below command to install PostgreSQL database release 14 on Rocky Linux Servers: 
 
sudo dnf -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm

sudo dnf config-manager --enable pgdg14

sudo dnf module disable -y postgresql

sudo dnf -y install postgresql14-server postgresql14 postgresql14-devel

sudo ln -s /usr/pgsql-14/bin/* /usr/sbin/
Make sure you repeat the same on each node before proceeding to next.

Install etcd

The etcd package is not available in Rocky Linux default repositories. To install etcd using dnf package manager, first you need to create a repo like below:

sudo nano /etc/yum.repos.d/etcd.repo
Add following:
 
[etcd]
name=PostgreSQL common RPMs for RHEL / Rocky $releasever - $basearch
baseurl=http://ftp.postgresql.org/pub/repos/yum/common/pgdg-rhel9-extras/redhat/rhel-$releasever-$basearch
enabled=1
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-PGDG
repo_gpgcheck = 1
Save and close the editor when you are finished.
 
Type following command to install etcd on your Rocky Linux servers:
 
sudo dnf makecache

sudo dnf install -y etcd
Make sure you repeat the same on each node before proceeding to next.
 

Configure etcd Cluster

Edit /etc/etcd/etcd.conf file on your first node (patroni1) in our case, to make the required changes:

sudo mv /etc/etcd/etcd.conf /etc/etcd/etcd.conf.orig

sudo nano /etc/etcd/etcd.conf
Add following configuration: 
 
ETCD_NAME=patroni1
ETCD_DATA_DIR="/var/lib/etcd/patroni1"
ETCD_LISTEN_PEER_URLS="http://192.168.10.1:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.10.1:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.10.1:2380"
ETCD_INITIAL_CLUSTER="patroni1=http://192.168.10.1:2380,patroni2=http://192.168.10.2:2380,patroni3=http://192.168.10.3:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.10.1:2379"
ETCD_ENABLE_V2="true"
Do not forget to replace highlighted text with yours. Save and close the editor when you are finished.
 
Edit /etc/etcd/etcd.conf file on your second node (patroni2) in our case, to make the required changes:

sudo mv /etc/etcd/etcd.conf /etc/etcd/etcd.conf.orig

sudo nano /etc/etcd/etcd.conf
Add following configuration:
 
ETCD_NAME=patroni2
ETCD_DATA_DIR="/var/lib/etcd/patroni2"
ETCD_LISTEN_PEER_URLS="http://192.168.10.2:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.10.2:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.10.2:2380"
ETCD_INITIAL_CLUSTER="patroni1=http://192.168.10.1:2380,patroni2=http://192.168.10.2:2380,patroni3=http://192.168.10.3:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.10.2:2379"
ETCD_ENABLE_V2="true"
Do not forget to replace highlighted text with yours. Save and close the editor when you are finished.
 
Edit /etc/etcd/etcd.conf file on your third node (patroni3) in our case, to make the required changes:

sudo mv /etc/etcd/etcd.conf /etc/etcd/etcd.conf.orig

sudo nano /etc/etcd/etcd.conf
Add following configuration:
 
ETCD_NAME=patroni3
ETCD_DATA_DIR="/var/lib/etcd/patroni3"
ETCD_LISTEN_PEER_URLS="http://192.168.10.3:2380"
ETCD_LISTEN_CLIENT_URLS="http://192.168.10.3:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.10.3:2380"
ETCD_INITIAL_CLUSTER="patroni1=http://192.168.10.1:2380,patroni2=http://192.168.10.2:2380,patroni3=http://192.168.10.3:2380"
ETCD_INITIAL_CLUSTER_STATE="new"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="http://192.168.10.3:2379"
ETCD_ENABLE_V2="true"
Do not forget to replace highlighted text with yours. Save and close the editor when you are finished. 
 
Edit .bash_profile on each node:
 
cd ~
nano .bash_profile
Add etcd environment variables at the end of the file:
 
export PGDATA="/var/lib/pgsql/14/data"
export ETCDCTL_API="3"
export PATRONI_ETCD_URL="http://127.0.0.1:2379"
export PATRONI_SCOPE="pg_cluster"
patroni1=192.168.10.1
patroni2=192.168.10.2
patroni3=192.168.10.3
ENDPOINTS=$patroni1:2379,$patroni2:2379,$patroni3:2379

Do not forget to replace highlighted text with yours. Save and close the editor when you are finished.

Make sure you repeat the same on each node before proceeding to next.

 

Start etcd Cluster

Type below command simultaneously on each node (patroni1, patroni2, patroni3) to start etcd cluster:

sudo systemctl start etcd
Verify etcd cluster from any of your nodes using the following command:
 
source ~/.bash_profile

etcdctl member list

etcdctl endpoint status --write-out=table --endpoints=$ENDPOINTS
You will see the output similar to like as shown in image below:
 

Make sure you etcd cluster is up and running as described before proceeding to next step.

 

Install Patroni

Type below command to install patroni on your first node (patroni1) in our case:

sudo dnf -y install python3 python3-devel python3-pip gcc libpq-devel

sudo pip3 install --upgrade testresources --upgrade setuptools psycopg2 python-etcd

sudo dnf -y install patroni patroni-etcd watchdog

Make sure you repeat the same on each node before proceeding to next.

 

Configure Patroni Cluster

By default Patroni configures PostgreSQL database for asynchronous replication using PostgreSQL streaming replication method. Choosing your replication schema is dependent on your production environment.

Synchronous vs Asynchronous replication:
PostgreSQL supports both synchronous and asynchronous replication for high availability and disaster recovery.
  • Synchronous replication means that a write operation is only considered complete once it has been confirmed as written to the master server as well as one or more synchronous standby servers. This provides the highest level of data durability and consistency, as clients are guaranteed that their writes have been replicated to at least one other server before the write is acknowledged as successful. However, the tradeoff is that the performance of the master server is impacted by the time it takes to confirm the write to the synchronous standby servers, which can be a bottleneck in high-write environments. 
  • Asynchronous replication, on the other hand, means that a write operation is acknowledged as successful as soon as it is written to the master server, without waiting for any confirmations from standby servers. This provides better performance as the master server is not blocked by the replication process, but the tradeoff is that there is a risk of data loss if the master server fails before the write is replicated to the standby servers.
PostgreSQL allows you to configure one or more synchronous replicas and any number of asynchronous replicas, so you can balance the performance and data durability trade-offs according to your needs.

For more information about async, and sync replications, see the Postgres documentation as well as Patroni documentation to determine which replication solution is best for your production need.

For this guide, we will use default asynchronous replication mode.

To configure patroni cluster, you need to create a patroni.yml file in /etc/patroni location on your first node (patroni1) in our case:

sudo mkdir -p /etc/patroni

sudo nano /etc/patroni/patroni.yml
Add following configuration:
 
scope: pg_cluster
namespace: /service/
name: patroni1

restapi:
listen: 192.168.10.1:8008
connect_address: 192.168.10.1:8008

etcd:
hosts: 192.168.10.1:2379,192.168.10.2:2379,192.168.10.3:2379

bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
use_slots: true
parameters:

initdb:
- encoding: UTF8
- data-checksums

pg_hba:
- host replication replicator 127.0.0.1/32 md5
- host replication replicator 192.168.10.1/32 md5
- host replication replicator 192.168.10.2/32 md5
- host replication replicator 192.168.10.3/32 md5
- host all all 0.0.0.0/0 md5

users:
admin:
password: admin
options:
- createrole
- createdb

postgresql:
listen: 192.168.10.1:5432
connect_address: 192.168.10.1:5432
data_dir: /var/lib/pgsql/14/data
bin_dir: /usr/pgsql-14/bin
pgpass: /tmp/pgpass
authentication:
replication:
username: replicator
password: replicator
superuser:
username: postgres
password: postgres

watchdog:
  mode: required
  device: /dev/watchdog
  safety_margin: 5

tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false

Make sure, you replace the highlighted text with yours. Save and close the editor when you are finished.

Next, create a patroni.yml file on your second node (patroni2) in our case:

sudo mkdir -p /etc/patroni

sudo nano /etc/patroni/patroni.yml

 Add following configuration:

scope: pg_cluster
namespace: /service/
name: patroni2

restapi:
listen: 192.168.10.2:8008
connect_address: 192.168.10.2:8008

etcd:
hosts: 192.168.10.1:2379,192.168.10.2:2379,192.168.10.3:2379

bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
use_slots: true
parameters:

initdb:
- encoding: UTF8
- data-checksums

pg_hba:
- host replication replicator 127.0.0.1/32 md5
- host replication replicator 192.168.10.1/32 md5
- host replication replicator 192.168.10.2/32 md5
- host replication replicator 192.168.10.3/32 md5
- host all all 0.0.0.0/0 md5

users:
admin:
password: admin
options:
- createrole
- createdb

postgresql:
listen: 192.168.10.2:5432
connect_address: 192.168.10.2:5432
data_dir: /var/lib/pgsql/14/data
bin_dir: /usr/pgsql-14/bin
pgpass: /tmp/pgpass
authentication:
replication:
username: replicator
password: replicator
superuser:
username: postgres
password: postgres

watchdog:
  mode: required
  device: /dev/watchdog
  safety_margin: 5

tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false
Make sure, you replace the highlighted text with yours. Save and close the editor when you are finished.

Next, create a patroni.yml file on your third node (patroni3) in our case:

sudo mkdir -p /etc/patroni

sudo nano /etc/patroni/patroni.yml

Add following configuration:

scope: pg_cluster
namespace: /service/
name: patroni3

restapi:
listen: 192.168.10.3:8008
connect_address: 192.168.10.3:8008

etcd:
hosts: 192.168.10.1:2379,192.168.10.2:2379,192.168.10.3:2379

bootstrap:
dcs:
ttl: 30
loop_wait: 10
retry_timeout: 10
maximum_lag_on_failover: 1048576
postgresql:
use_pg_rewind: true
use_slots: true
parameters:

initdb:
- encoding: UTF8
- data-checksums

pg_hba:
- host replication replicator 127.0.0.1/32 md5
- host replication replicator 192.168.10.1/32 md5
- host replication replicator 192.168.10.2/32 md5
- host replication replicator 192.168.10.3/32 md5
- host all all 0.0.0.0/0 md5

users:
admin:
password: admin
options:
- createrole
- createdb

postgresql:
listen: 192.168.10.3:5432
connect_address: 192.168.10.3:5432
data_dir: /var/lib/pgsql/14/data
bin_dir: /usr/pgsql-14/bin
pgpass: /tmp/pgpass
authentication:
replication:
username: replicator
password: replicator
superuser:
username: postgres
password: postgres

watchdog:
  mode: required
  device: /dev/watchdog
  safety_margin: 5

tags:
nofailover: false
noloadbalance: false
clonefrom: false
nosync: false

Make sure, you replace the highlighted text with yours. Save and close the editor when you are finished.

 

Enable Software Watchdog

Watchdog devices are software or hardware mechanisms that will reset the whole system when they do not get a heartbeat within a specified timeframe. This adds an additional layer of fail safe in case usual Patroni split-brain protection mechanisms fail.

Patroni will try to activate the watchdog before promoting PostgreSQL to primary. Currently watchdogs are only supported using Linux watchdog device interface. For more information about watchdog, see the Patroni documentation.

Default Patroni configuration will try to use /dev/watchdog on Linux if it is accessible to Patroni. For most use cases using software watchdog built into the Linux kernel is secure enough.
 
Edit /etc/watchdog.conf to enable software watchdog:
 
sudo nano /etc/watchdog.conf
Locate, and uncomment following line:
 
watchdog-device = /dev/watchdog
Save and close the editor when you are finished.

Execute following commands to activate software watchdog:
 
sudo mknod /dev/watchdog c 10 130

sudo modprobe softdog

sudo chown postgres /dev/watchdog
Make sure you repeat the same on each node before proceeding to next.
 

Start Patroni Cluster

Type below command on your (patroni1) to start your first patroni cluster node:
 
sudo systemctl start patroni
Verify patroni status using the following command:
 
sudo systemctl status patroni

If you look carefully at the bottom of the patroni status output, you will see that the (patroni1) is acting as leader node in the cluster:

Next, start patroni on subsequent nodes, (patroni2) for example, you will see (patroni2) is acting as secondary node in the cluster:

Start patroni on (patroni3), and it will also act as the secondary node in the cluster:


 

Install PgBouncer

Type following command on your Rocky Linux servers to install PgBouncer:

sudo dnf install -y pgbouncer

Make sure you repeat the same on each node before proceeding to next.

 

Configure PgBouncer Authentication

PgBouncer uses /etc/pgbouncer/userlist.txt file to authenticate database clients. You can write database credentials in userlist.txt file manually using the information from the pg_shadow catalog table, or you can create a function in database to allow a specific user to query for the current password of the database users.

Direct access to pg_shadow requires admin rights. It’s preferable to use a non-superuser that calls a SECURITY DEFINER function instead.

From your (patroni1), connect to the Postgres database as superuser, and create a security definer function:

psql -h patroni1 -p 5432 -U postgres

Execute following at postgres=# prompt:

CREATE ROLE pgbouncer LOGIN with encrypted password "Type_Your_Password_Here";

CREATE FUNCTION public.lookup (
INOUT p_user name,
OUT p_password text
) RETURNS record
LANGUAGE sql SECURITY DEFINER SET search_path = pg_catalog AS
$$SELECT usename, passwd FROM pg_shadow WHERE usename = p_user$$;

Make sure you  replace the highlighted text with yours.

Next, copy encrypted password of pgbouncer from pg_shadow catalog table:

select * from pg_shadow;

Type \q to exit from postgres=# prompt:

\q

Edit /etc/pgbouncer/userlist.txt file:

sudo nano /etc/pgbouncer/userlist.txt

Add pgbouncer credential like below:

"pgbouncer" "Type_Encrypted_Password_Here"

Save and close the editor when you are finished. 

Edit /etc/pgbouncer/pgbouncer.ini file, to make the required changes:
 
sudo cp -p /etc/pgbouncer/pgbouncer.ini /etc/pgbouncer/pgbouncer.ini.orig

sudo nano /etc/pgbouncer/pgbouncer.ini
Add your database in [databases] section like below:
 
* = host=192.168.10.1 port=5432 dbname=postgres
and change listen_addr=localhost to listen_addr=*
 
listen_addr = *

In the [pgbouncer] section, add following, but below to auth_file = /etc/pgbouncer/userlist.txt line:

auth_user = pgbouncer
auth_query = SELECT p_user, p_password FROM public.lookup($1)

Save and close the editor when you are finished.

From (patroni2), edit /etc/pgbouncer/userlist.txt file:

sudo nano /etc/pgbouncer/userlist.txt

Add pgbouncer credential like below:

"pgbouncer" "Type_Encrypted_Password_Here"

Save and close the editor when you are finished.

From (patroni2), edit /etc/pgbouncer/pgbouncer.ini file, to make the required changes:
 
sudo cp -p /etc/pgbouncer/pgbouncer.ini /etc/pgbouncer/pgbouncer.ini.orig

sudo nano /etc/pgbouncer/pgbouncer.ini
Add your database in [databases] section like below:
 
* = host=192.168.10.2 port=5432 dbname=postgres
and change listen_addr=localhost to listen_addr=*
 
listen_addr = *

In the [pgbouncer] section, add following, but below to auth_file = /etc/pgbouncer/userlist.txt line:

auth_user = pgbouncer
auth_query = SELECT p_user, p_password FROM public.lookup($1)

Save and close the editor when you are finished.

From (patroni3), edit /etc/pgbouncer/userlist.txt file:

sudo nano /etc/pgbouncer/userlist.txt

Add pgbouncer credential like below:

"pgbouncer" "Type_Encrypted_Password_Here"

Save and close the editor when you are finished.

From (patroni3), edit /etc/pgbouncer/pgbouncer.ini file, to make the required changes:
 
sudo cp -p /etc/pgbouncer/pgbouncer.ini /etc/pgbouncer/pgbouncer.ini.orig

sudo nano /etc/pgbouncer/pgbouncer.ini
Add your database in [databases] section like below:
 
* = host=192.168.10.3 port=5432 dbname=postgres
and change listen_addr=localhost to listen_addr=*
 
listen_addr = *

In the [pgbouncer] section, add following, but below to auth_file = /etc/pgbouncer/userlist.txt line:

auth_user = pgbouncer
auth_query = SELECT p_user, p_password FROM public.lookup($1)

Save and close the editor when you are finished.

When you are finished with pgbouncer configuration, execute below command to start pgbouncer on each node:

sudo systemctl start pgbouncer
 

Test PgBouncer Authentication

Make a connection to database using psql command via pgbouncer on port 6432:
 
psql -h patroni1 -p 6432 -U pgbouncer -d postgres

This should connect you to your Postgres database if everything was configured correctly as described above.

 

Install HAProxy

With patroni, you need a method to connect to the leader node regardless of which of the node in the cluster is the leader. HAProxy forwards the connection to whichever node is currently the leader. It does this using a REST endpoint that Patroni provides. 
 
Patroni ensures that, at any given time, only the leader node will appear as online, forcing HAProxy to connect to the correct node. Users or applications, (psql) for example, will connect to haproxy, and haproxy will make sure connecting to the back-end database leader node in the cluster.
 
Type following command to install haproxy on your Rocky Linux servers:
 
sudo dnf install -y haproxy
Make sure you repeat the same on each node before proceeding to next.
 

Configure HAProxy

Edit haproxy.cfg file on your first node (patroni1) in our case:
 
sudo mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.orig
sudo nano /etc/haproxy/haproxy.cfg

Add following configuration:

global
log 127.0.0.1 local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 1000
user haproxy
group haproxy
daemon
stats socket /var/lib/haproxy/stats

defaults
mode tcp
log global
option tcplog
retries 3
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 10s
maxconn 900


listen stats
mode http
bind *:7000
stats enable
stats uri /

listen primary
bind 192.168.10.200:5000
option httpchk OPTIONS /master
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server patroni1 192.168.10.1:6432 maxconn 100 check port 8008
server patroni2 192.168.10.2:6432 maxconn 100 check port 8008
server patroni3 192.168.10.3:6432 maxconn 100 check port 8008

listen standby
bind 192.168.10.200:5001
balance roundrobin
option httpchk OPTIONS /replica
http-check expect status 200
default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
server patroni1 192.168.10.1:6432 maxconn 100 check port 8008
server patroni2 192.168.10.2:6432 maxconn 100 check port 8008
server patroni3 192.168.10.3:6432 maxconn 100 check port 8008

Make sure you replace the highlighted text with yours. Save and close the editor when you are finished.
 
There are two important listen sections in haproxy configuration: 
  1. primary using port 5000 for reads/writes requests to back-end database leader node.
  2. standby using port 5001 only for reads-requests to back-end database replica nodes.
All three nodes are included in both sections: that is because all the database servers are potential candidates to be either primary or replica. Patroni provides a built-in REST API support for health check monitoring that works perfectly with HAProxy. HAProxy will send an HTTP request to port 8008 of patroni to know which role each node currently has.

The haproxy configuration will remain same on each node, so make sure you repeat the same configuration on your remaining nodes before proceeding to next.

 

Install Keepalived

Type following command to install keepalived on your Rocky Linux servers:

sudo dnf install -y keepalived
Make sure you repeat the same on each node before proceeding to next.
 

Configure Keepalived

Edit the /etc/sysctl.conf file on your (patroni1) in our case, to allow the server to bind to the virtual IP Address.
 
sudo nano /etc/sysctl.conf
Add following at the end of the file:
 
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
Save and close the editor when you are finished.

Type following command on your Linux server to reload settings from config file without rebooting:

sudo sysctl --system
sudo sysctl -p
Make sure you repeat the same on each node before proceeding to next.

Create /etc/keepalived/keepalived.conf file on your first node (patroni1) in our case, to add the required configuration:

sudo nano /etc/keepalived/keepalived.conf 
Add following configuration:
 
vrrp_script chk_haproxy {
script "pkill -0 haproxy"
interval 5
weight -4
fall 2
rise 1
}

vrrp_script chk_lb {
script "pkill -0 keepalived"
interval 1
weight 6
fall 2
rise 1
}

vrrp_script chk_servers {
script "echo 'GET /are-you-ok' | nc 127.0.0.1 7000 | grep -q '200 OK'"
interval 2
weight 2
fall 2
rise 2
}

vrrp_instance vrrp_1 {
interface enp0s3
state MASTER
virtual_router_id 51
priority 101
virtual_ipaddress_excluded {
192.168.10.200
}
track_interface {
enp0s3 weight -2
}
track_script {
chk_haproxy
chk_lb
}
}

Do not forget to replace the highlighted text with yours. Save and close the editor when you are finished. 

Create /etc/keepalived/keepalived.conf file on your second node (patroni2) in our case:
 
sudo nano /etc/keepalived/keepalived.conf 
Add following configuration:
 
vrrp_script chk_haproxy {
script "pkill -0 haproxy"
interval 5
weight -4
fall 2
rise 1
}

vrrp_script chk_lb {
script "pkill -0 keepalived"
interval 1
weight 6
fall 2
rise 1
}

vrrp_script chk_servers {
script "echo 'GET /are-you-ok' | nc 127.0.0.1 7000 | grep -q '200 OK'"
interval 2
weight 2
fall 2
rise 2
}

vrrp_instance vrrp_1 {
interface enp0s3
state BACKUP
virtual_router_id 51
priority 100
virtual_ipaddress_excluded {
192.168.10.200
}
track_interface {
enp0s3 weight -2
}
track_script {
chk_haproxy
chk_lb
}
}

Do not forget to replace the highlighted text with yours. Save and close the editor when you are finished.
 
Create /etc/keepalived/keepalived.conf file on your third node (patroni3) in our case:
 
sudo nano /etc/keepalived/keepalived.conf 
Add following configuration:
 
vrrp_script chk_haproxy {
script "pkill -0 haproxy"
interval 5
weight -4
fall 2
rise 1
}

vrrp_script chk_lb {
script "pkill -0 keepalived"
interval 1
weight 6
fall 2
rise 1
}

vrrp_script chk_servers {
script "echo 'GET /are-you-ok' | nc 127.0.0.1 7000 | grep -q '200 OK'"
interval 2
weight 2
fall 2
rise 2
}

vrrp_instance vrrp_1 {
interface enp0s3
state BACKUP
virtual_router_id 51
priority 99
virtual_ipaddress_excluded {
192.168.10.200
}
track_interface {
enp0s3 weight -2
}
track_script {
chk_haproxy
chk_lb
}
}

Do not forget to replace the highlighted text with yours. Save and close the editor when you are finished.
 
Type below command on your first node (patroni1) to start keepalived:
 
sudo systemctl start keepalived
Check on your (MASTER) node to see if your (enp0s3) network interface has configured with an additional shared IP (192.168.10.200):
 
ip addr show enp0s3
You will see the output similar to like below:
 

With this keepalived configuration, if MASTER haproxy node goes down for any reason, shared IP Address will automatically float to a BACKUP haproxy node, and connectivity to database clients will remain available.

Type following command to start HAProxy on each node: 

sudo systemctl start haproxy
Verify HAProxy status on each node:
 
sudo systemctl status haproxy
We have already taken care of HAProxy high availability with a shared IP Address using keepalived configuration earlier.

You can manually test HAProxy failover scenario by killing haproxy on your MASTER  node with sudo systemctl stop haproxy command, you will see, within few seconds of delay, shared IP Address (192.168.10.200) will automatically float to your BACKUP haproxy node.

 

Test Patroni Cluster

You can test your Patroni cluster by initiating a database connection request from any of your applications (psql) for example using your (shared_ip:port):
 
psql -h 192.168.10.200 -p 5000 -U postgres
If everything was configured properly as described in the tutorial, this will connect you to your master database as you can see in the image below.
 

Next, execute two reads-requests to verify HAProxy round-robin load balancing mechanism is working as intended:
 
psql -h 192.168.10.200 -p 5001 -U postgres -t -c "select inet_server_addr()"
This should return your replica node IP as shown in image below:
 
 
Execute the same read-request second time:
 
psql -h 192.168.10.200 -p 5001 -U postgres -t -c "select inet_server_addr()"
This should return your other replica node IP as shown in image below:
 

Next, execute the same command again but this time using port 5000:
 
psql -h 192.168.10.200 -p 5000 -U postgres -t -c "select inet_server_addr()"
This should return your primary (leader) node IP as shown in image below:


You can access HAProxy dashboard by typing http://shared-ip:7000/ in your browser address bar.
 

As you can see, in the primary section (patroni1) row is highlighted in green. This indicates that 192.168.10.1 is currently a leader node in the cluster.
 
In the standby section, the (patroni2, patroni3) row is highlighted as green. This indicates that both nodes are replica node in the cluster.
 

If you kill the leader node using (sudo systemctl stop patroni) or by completely shutting down the server, the dashboard will look similar to like below:

As you can see, in the primary section (patroni2) row is now highlighted in green. This indicates that 192.168.10.2 is currently a leader node in the cluster.


Please note that, in this particular scenario, it just so happens that the second node in the cluster is promoted to leader. This might not always be the case and it is equally likely that the 3rd node may be promoted to leader.
 

Test Postgres Database Replication

We will create a test database to see if it is replicated to other nodes in the cluster. For this guide, we will use (psql) to connect to database via haproxy like below: 

psql -h 192.168.10.200 -p 5000 -U postgres
From the Postgres prompt, create a test database like below:
 
create database testdb;
create user testuser with encrypted password 'mystrongpass';
grant all privileges on database testdb to testuser;

\q
Update your userlist.txt file on each node for testuser as explained in PgBouncer section.
 
Stop patroni on leader node (patroni1) in our case with below command:
 
sudo systemctl stop patroni
Connect to database using psql, and this time haproxy will automatically make connection to  whichever node is currently leader in the cluster: 

psql -h 192.168.10.200 -p 5000 testuser -d testdb
As you can see in the output below, connection to testdb was successful via haproxy:
 

Now bring up your first node with (sudo systemctl start patroni), and it will automatically rejoin the cluster as secondary and automatically synchronize with the leader.
 

Patroni Cluster Failover

With patronictl command you can administer, manage and troubleshoot your patroni cluster.
 
Type following command to list the options and commands you can use with patronictl: 

patronictl --help
This will show you the options and commands you can use with patronictl.

Options:
-c, --config-file TEXT Configuration file
-d, --dcs TEXT Use this DCS
-k, --insecure Allow connections to SSL sites without certs
--help Show this message and exit.

Commands:
configure Create configuration file
dsn Generate a dsn for the provided member, defaults to a dsn of...
edit-config Edit cluster configuration
failover Failover to a replica
flush Discard scheduled events (restarts only currently)
history Show the history of failovers/switchovers
list List the Patroni members for a given Patroni
pause Disable auto failover
query Query a Patroni PostgreSQL member
reinit Reinitialize cluster member
reload Reload cluster member configuration
remove Remove cluster from DCS
restart Restart cluster member
resume Resume auto failover
scaffold Create a structure for the cluster in DCS
show-config Show cluster configuration
switchover Switchover to a replica
version Output version of patronictl command or a running Patroni
You can check patroni cluster member nodes using the following command:
 
patronictl -c /etc/patroni/patroni.yml list


You can check patroni cluster failover and switchover history using the following command:

patronictl -c /etc/patroni/patroni.yml history

In patroni cluster, failover is executed automatically when a leader node getting unavailable for unplanned reason. If you want to test your patroni cluster failover manually, you can initiate failover to a replica node using the following command:

patronictl -c /etc/patroni/patroni.yml failover

 

Disable Patroni Cluster Auto Failover

In some cases, it is necessary to perform maintenance task on a single node, such as applying patches or release updates. When you disable auto failover, patroni won’t change the state of the cluster.

You can disable patroni cluster auto failover using the following command:

patronictl -c /etc/patroni/patroni.yml pause

 

Patroni Cluster Switchover

There are two possibilities to run a switchover, either in scheduled mode or immediately. At the given time, the switchover will take place, and you will see an entry of switchover activity in the log file.

You can test patroni cluster switchover using the following command:

patronictl -c /etc/patroni/patroni.yml switchover --master your_leader_node --candidate your_replica_node

If you go with [now] option, switchover will take place immediately.

 

Simulate Patroni Cluster Failure Scenario

To simulate failure scenarios in production environment, we will execute continuous reads and writes to the database using a simple Python script as we are interested in observing the state of the cluster upon a server failure.

You should have a workstation (virtual of physical) installed with any of your favorite Linux distribution such as CentOS, RHEL, Fedora, Rocky, etc with PostgreSQL client, and PostgreSQL driver for Python installed.

Type following command to install PostgreSQL client, and driver for python on your Linux workstation:

sudo dnf install -y postgresql python3-psycopg2
Download HAtester.py script on your Linux workstation:

cd ~

git clone https://github.com/manwerjalil/pgscripts.git

chmod +x ~/pgscripts/pgsqlhatest.py
Edit pgsqlhatest.py and replace database credentials with yours:

nano ~/pgscripts/pgsqlhatest.py
Replace following highlighted text with yours:
 
# CONNECTION DETAILS
host = "Type_HAProxy_IP_Here"
dbname = "postgres"
user = "postgres"
password = "Type_postgres_password_Here"
Save and close the editor when you are finished.
 
You need to create a target table "PGSQLHATEST" in your database:
 
psql -h 192.168.10.1 -p 5432 -U postgres -c "CREATE TABLE PGSQLHATEST (TM TIMESTAMP);"

psql -h 192.168.10.1 -p 5432 -U postgres -c "CREATE UNIQUE INDEX idx_pgsqlhatext ON pgsqlhatest (tm desc);"

To get the best result, we will use tmux terminal for multiple tabs to monitor state of the patroni cluster all together in real time:
 
 
On the left side of the screen, we have one ssh session open for each of the 3 nodes, continuously monitoring patroni cluster state:

On the right side of the screen, we are running python script sending writes through port 5000, and reads through port 5001 from our workstation:
 
 
To observe what happens with database traffic when the environment experiences a failure, we will manually stop patroni on our leader node using the following command:
 
sudo systemctl stop patroni
When we stopped patroni on leader node, within few seconds a replica node become leader, and continues writing to the database stopped for few seconds, then reconnected automatically.
 
There is no disconnection happens to continues reading to the database because replica node was available throughout.
 
 
Let's bring back our patroni node using following command:
 
sudo systemctl start patroni
The node has automatically rejoined the cluster as replica:
 
 

Test Environment Failure Scenarios

We leave it up to you to test and experiment with your patroni cluster to see what happens when environment experiences a failure such as:

  • Loss of network connectivity
  • Power breakdown

When simulating these tests, you should continuously monitor how the patroni cluster re-adjusts itself and how it affects read and write traffic for each failure scenario.
 

Conclusion

I hope this guide was helpful to set up a highly available PostgreSQL release 14 cluster on Rocky Linux 9 for your production use. We highly appreciate if you tell us about this tutorial in the comment section below.

4 comments:

  1. Nice documentation, but unfortunately it does not work. I am attempting to follow these instructions using RHEL 9.1 and etcd 3.5.7 with go1.17.13. Using the instructions provided I am able to start the etcd cluster, but am unable to get it's status. The error indicates that it is not listening on localhost:2379. I have confirmed this with netstat. This looks like it's been a historical problem with etcd going back to 2014, but haven't found anything how specifically how to correct the issue. Any help would be appreciated!

    ReplyDelete
    Replies
    1. When you are reporting specific problems related your set up, you must share error, logs, screenshots, etc to make it more understandable and to provide you workaround. Coming to your question that you not getting etcd status, you should check:
      1. Selinux
      2. Firewalld

      Delete
  2. It was wonderful and very practical

    ReplyDelete
  3. hi , after i cluster my postgresql , i create a zabbix database in my postgre and give schema to it , but now i can't connect my zabbix server to this cluster , however i can connect to my zabbix database by psql command , have you any solution?

    ReplyDelete

Powered by Blogger.