Gentoo - PostgreSQL replication (native)
By: John McFarlane
<john.mcfarlane@rockfloat.com>
This document was originally created on 02/14/2007
Last updated:
02/14/2007 @ 10:00
Abstract:
This document will go thru a step by step implementation of native PostgreSQL
replication.
Table of Contents:
- Introduction
- Create a directory to hold the configuration
- Installation
- Create the master/slave databases
- Configure the master for replication
- Configure the slave(s) for replication
- Final magic to bring it all together
- Startup replication
- Test to make sure it's really working
- How to replicate accross a network
- Changelog
- Conventions used
1. Introduction
Database replication is awesome. I've heard some describe
replication as a nightmare that's only used when the database
wasn't designed properly in the first place. I think that's
rubbish. Before you move on I really recommend that you read up
on
replication
and more specifically
database replication
.
Be sure to take note that replication is very different than
clustering. Replication is a master-slave
technology while true clustering is multi-master. For the time
being, if you really need clustering you should prolly look towards
Oracle Rac.
I'm finished with this step
This tutorial assumes that you want to explore PostgreSQL's native implementation of replication, which is probably one of the easier to understand and implement. I belive it was a project named "DBMirror" that was moved into the official PostgreSQL Contrib beginning with 7.3.1. There are all sorts of more sophisticated replication packages out there, if you need something more feature rich:
- Slony-1
- PGreplication
- PGCluster
- PGPool
- DBBalancer
- Open Minds (Commercial)
2. Create a directory to hold the configuration
This howto will have you create a few files. Some are only used and
then can be deleted, a few of them are actually used as config files
for running the replication.
I'm finished with this step
user# mkdir cluster # Make this wherever you want it
user# cd cluster
3. Installation
Because the replication features are actually part of PostgreSQL
all you really need is to make sure you have postgresql
and the PostgreSQL Perl module pgperl. For us Gentoo cats this
means:
I'm finished with this step
root# emerge --sync
root# emerge -a postgresql #Answer "N" if you already have the latest
root# emerge -a pgperl
It's important to know the location of a few files as we'll need them
later. Here's a list of what you'll need:
user# equery files postgresql | grep \
-e DBMirror \
-e MirrorSetup \
-e AddTrigger \
-e slaveDatabase \
-e pending.so
# Should return a list like this:
/usr/bin/DBMirror.pl
/usr/lib/postgresql/pending.so # This is what provides the love
/usr/share/postgresql/contrib/AddTrigger.sql
/usr/share/postgresql/contrib/MirrorSetup.sql
/usr/share/postgresql/contrib/slaveDatabase.conf
I will always use full paths when referring to the above files, but it
helps to know that the postgresql install actually installed them.
4. Create the master/slave databases
First you need to create two databases, then you need to populate
their schemas.
I'm finished with this step
-
user# createdb -U postgres -E UTF-8 -O postgres cluster-master user# createdb -U postgres -E UTF-8 -O postgres cluster-slave01 user# createdb -U postgres -E UTF-8 -O postgres cluster-slave02 -
Create a simple schema file for use with populating all three:
user# nano -w cluster.sql # Make the file have this content: CREATE TABLE car( uid SERIAL PRIMARY KEY, name VARCHAR, make VARCHAR, model VARCHAR ); -
Populate the schema for all of the clustered db(s):
user# psql -U postgres cluster-master < cluster.sql user# psql -U postgres cluster-slave01 < cluster.sql user# psql -U postgres cluster-slave02 < cluster.sql
5. Configure the master for replication
Now we need to update the schema of the master so it can understand
the types of questions the slaves will be asking of it:
I'm finished with this step
user# psql -U postgres cluster-master < /usr/share/postgresql/contrib/MirrorSetup.sql
The master also needs a trigger that does the work of exposing
the desired tables to the slaves. This means you need one
trigger per table that you want replicated. This makes it easy
to only replicate the certain tables. This makes it hard if you
want to replicate everything, deal with it :)
user# cp /usr/share/postgresql/contrib/AddTrigger.sql AddTrigger.sql
user# nano -w AddTrigger.sql
# Make it look like this
-- Adjust this setting to control where the objects get created.
SET search_path = public;
CREATE TRIGGER "mirror_car"
AFTER INSERT OR DELETE OR UPDATE ON "car"
FOR EACH ROW EXECUTE PROCEDURE "recordchange" ();
--Uncomment this if you also want to replicate the "boat" table:
--CREATE TRIGGER "mirror_boat"
--AFTER INSERT OR DELETE OR UPDATE ON "boat"
--FOR EACH ROW EXECUTE PROCEDURE "recordchange" ();
Now we need to actually create the trigger(s), this step is similar
to when we populated the schema;
user# psql -U postgres cluster-master < AddTrigger.sql
From a schema perspective, the master is now ready to replicate.
6. Configure the slave(s) for replication
The slave configuration is a bit different. You don't change the
schema, you simply create a text file per slave. We will make one
file per slave:
I'm finished with this step
user# cp /usr/share/postgresql/contrib/slaveDatabase.conf slave01.conf
user# cp /usr/share/postgresql/contrib/slaveDatabase.conf slave02.conf
You'll now need to edit both files to specify the details of each
slave. Here's what slave01 should look like, you can model slave02.conf
after it:
#########################################################################
# Config file for DBMirror.pl
# This file contains a sample configuration file for DBMirror.pl
# It contains configuration information to mirror data from
# the master database to a single slave system.
#
# $PostgreSQL: pgsql/contrib/dbmirror/slaveDatabase.conf,v 1.3 2004/09/10 04:31:06 neilc Exp $
#######################################################################
$masterHost = "localhost";
$masterDb = "cluster-master";
$masterUser = "postgres";
$masterPassword = "postgrespassword";
# Where to email Error messages to
# $errorEmailAddr = "me@mydomain.com";
$slaveInfo->{"slaveName"} = "db-slave01";
$slaveInfo->{"slaveHost"} = "localhost";
$slaveInfo->{"slaveDb"} = "cluster-slave01";
$slaveInfo->{"slavePort"} = 5432;
$slaveInfo->{"slaveUser"} = "postgres";
$slaveInfo->{"slavePassword"} = "postgrespassword";
# If uncommented then text files with SQL statements are generated instead
# of connecting to the slave database directly.
# slaveDb should then be commented out.
# $slaveInfo{"TransactionFileDirectory"} = '/tmp';
#
# The number of seconds dbmirror should sleep for between checking to see
# if more data is ready to be mirrored.
$sleepInterval = 60;
7. Final magic to bring it all together
Currently the master and slave(s) all support replication, yet the
master doesn't know who it's supposed to replicate to. We
need to essentially tell the master who her children are:
I'm finished with this step
user# psql -U postgres cluster-master
-- We're now in the psql subsystem
cluster-master=# INSERT INTO dbmirror_mirrorhost VALUES (1, 'db-slave01');
INSERT 0 1
cluster-master=# INSERT INTO dbmirror_mirrorhost VALUES (2, 'db-slave02');
INSERT 0 1
# Exit the psql subsystem:
cluster-master=# \q
8. Startup replication
Ok, here we go. We are going to fire up two replication services,
one for each slave:
I'm finished with this step
user# /usr/bin/DBMirror.pl slave01.conf >/dev/null 2>/dev/null &
user# /usr/bin/DBMirror.pl slave02.conf >/dev/null 2>/dev/null &
Now the slaves will check every 60 seconds to see if there is anything
they should know about. You can update this time in slave0x.conf.
9. Test to make sure it's really working
We will perform all of this testing by doing database queries. Since
we'll need to jump around from db to db, I'll do things in an awkward
way that happens to lend it self to cutting and pasting. You do like
to cut and paste don't you?
I'm finished with this step
# Validate all three db's are in fact empty:
user# psql -U postgres cluster-master -c 'SELECT * FROM car;'
uid | name | make | model
-----+------+------+-------
(0 rows)
user# psql -U postgres cluster-slave01 -c 'SELECT * FROM car;'
uid | name | make | model
-----+------+------+-------
(0 rows)
user# psql -U postgres cluster-slave02 -c 'SELECT * FROM car;'
uid | name | make | model
-----+------+------+-------
(0 rows)
# Insert a new car into our master database:
user# psql -U postgres cluster-master -c "INSERT INTO car(name, make, model) VALUES('foo', 'Honda', 'Accord');"
INSERT 0 1
# See what's in the first slave:
user# psql -U postgres cluster-slave01 -c 'SELECT * FROM car;'
uid | name | make | model
-----+------+-------+--------
1 | foo | Honda | Accord
(1 row)
# Sweet mother!, is it in slave02 too?:
user# psql -U postgres cluster-slave02 -c 'SELECT * FROM car;'
uid | name | make | model
-----+------+-------+--------
1 | foo | Honda | Accord
(1 row)
Well, there you have it. One master database that's replicating to
two cute little children. Good job, get yourself a cookie.
10. How to replicate accross a network
This is pretty easy, just update the slave0x.conf files to specify
the actual hostname on the network (not localhost) and make the
corresponding change(s) when you run the INSERT statements. You also
need to run the slave commands on... the slaves :)
I'm finished with this step
When I get time I'll flesh this out in more detail
This document was originally created on 02/14/2007
Conventions and tips for this howto document:
- This howto is intended for >=Postgres-7.3.1
- This howto assumes you have portage-utils installed
- This howto assumes your db superuser is: postgres
- This howto assumes you have postgres configured to support TCP/IP
Disclaimer:
This page is not endorsed by gentoo.org or any other cool
cats. Any information provided in this document is to be used
at your own risk.