Fast failover configuration with drbd and heartbeat

Partitions configuration

As the picture below shows, both nodes use the same partition layout:

  • /dev/sda1 for system partition (/)
  • /dev/sda2 for our drbd partition mounted as /srv
  • /dev/sda3 for linux swap.

We also need IP addresses for the hosts:

  • 172.22.2.61 - node1 will be our master host
  • 172.22.2.62 - node2 will be our slave host
  • 172.22.2.63 - floating address between node1 and node2

Preparations

To make the commands easier to follow, each one is prefixed with the node it should be run on, in round brackets.
First of all we need to install the DRBD utilities:

(both nodes): apt-get install drbd8-utils

Then install heartbeat, together with the set of tools that comes with it:

(both nodes): apt-get install heartbeat

DRBD configuration

Leave only this line in the /etc/drbd.conf file on both nodes:

include "drbd.d/global_common.conf";

Edit the /etc/drbd.d/global_common.conf file on both nodes. Notice that node1 and node2 are the host names of the two machines, and each "on" section must match the name that the uname -n command prints on that host.

global {
  usage-count no;
}
common {
  protocol C;
  syncer {
    rate 10M;
    al-extents 257;
  }
}
resource srv {
  startup {
    become-primary-on node1;
  }
  disk {
    on-io-error   detach;
  }
  net {
    cram-hmac-alg sha1;
    shared-secret "EdlerWuvyidhemorsob4";
  }
  on node1 {
    device    /dev/drbd0;
    disk      /dev/sda2;
    address   172.22.2.61:7788;
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd0;
    disk      /dev/sda2;
    address   172.22.2.62:7788;
    meta-disk internal;
  }
}
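
Since DRBD matches the "on" sections against the node name reported by uname -n, a quick check before starting the service can save some head scratching. The following is only a minimal sketch: drbd_hosts and check_node are hypothetical helper names, and the awk pattern assumes the "on <name> {" layout used in the config above.

```shell
#!/bin/bash
# Sketch: list the host names declared in "on <name> {" sections of a DRBD
# config file and check whether a given node name is among them.
# drbd_hosts and check_node are hypothetical helpers, not part of drbd8-utils.

drbd_hosts() {
    # $1: path to the DRBD configuration file
    # Lines like "on-io-error detach;" are skipped because their first
    # field is not exactly "on".
    awk '$1 == "on" && $3 == "{" { print $2 }' "$1"
}

check_node() {
    # $1: config file, $2: node name (normally the output of uname -n)
    drbd_hosts "$1" | grep -qxF -- "$2"
}

# Typical use on each node:
#   check_node /etc/drbd.d/global_common.conf "$(uname -n)" \
#       || echo "no 'on' section matches $(uname -n)"
```

If the check fails, either the host name or the "on" section name has to be changed so that the two agree.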

Load the drbd module and start the service with the commands below:

(both nodes): modprobe drbd
(both nodes): /etc/init.d/drbd start

Now create the metadata for the new drbd resource and synchronize it with the second node:

(both nodes) drbdadm create-md srv
(both nodes) drbdadm attach srv
(both nodes) drbdadm connect srv
(node1)        drbdadm -- --overwrite-data-of-peer primary srv

Wait until the drbd partition has synchronized with the second node:

(node1) cat /proc/drbd

When synchronization has finished, you should see the roles Primary/Secondary and both disks with status UpToDate:

version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757 
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r----
    ns:164 nr:12 dw:176 dr:589 al:2 bm:2 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
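
Instead of re-running cat by hand, the wait can be scripted. This is a rough sketch that assumes the DRBD 8.3 layout of /proc/drbd shown above (a line starting with the minor number followed by key:value fields); drbd_field and wait_for_sync are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: read /proc/drbd-style text and extract one field (cs, ro or ds)
# for a given minor number. Assumes the 8.3 output format shown above.

drbd_field() {
    # $1: minor number (0 for /dev/drbd0), $2: field name; text on stdin
    awk -v minor="$1:" -v key="$2" '
        $1 == minor {
            for (i = 2; i <= NF; i++)
                if (index($i, key ":") == 1) {
                    print substr($i, length(key) + 2)
                    exit
                }
        }'
}

wait_for_sync() {
    # Poll every 5 seconds until both disks report UpToDate.
    until [ "$(drbd_field 0 ds < /proc/drbd)" = "UpToDate/UpToDate" ]; do
        sleep 5
    done
}

# Typical use on node1:
#   wait_for_sync && echo "srv is in sync"
```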

Now it's time to create a filesystem:

(node1) mkfs.ext4 /dev/drbd0

To test that the cluster is running correctly we will use Apache Tomcat:

(node1) mount -t ext4 /dev/drbd0 /srv
(node1) cd /srv
(node1) wget http://www.apache.net.pl/tomcat/tomcat-7/v7.0.6/bin/apache-tomcat-7.0.6.tar.gz
(node1) tar zxvf apache-tomcat-7.0.6.tar.gz
(node1) ln -s apache-tomcat-7.0.6 tomcat
(node1) cd && umount /srv

Heartbeat configuration

The following files have to be created in the /etc/ha.d directory on both nodes:

  • ha.cf
mcast eth0 239.0.0.6 694 1 0
keepalive 2
warntime 10
deadtime 15
initdead 120
node node1
node node2
respawn hacluster /usr/lib/heartbeat/ipfail
apiauth default uid=nobody gid=haclient
apiauth ipfail uid=hacluster
apiauth ping gid=nogroup uid=nobody,hacluster
auto_failback on
ping 172.22.2.1
deadping 15
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility     local0
  • authkeys - don't forget to change file mode to 600 (chmod 600 authkeys)
auth 1
1 sha1 NafWoavmedapFurGhas3
  • haresources
node1 IPaddr::172.22.2.63/24/eth0 drbddisk::srv Filesystem::/dev/drbd0::/srv::ext4 tomcat.sh
The fields of the haresources line are:

  • node1 - name of the host which normally works as master
  • IPaddr::172.22.2.63/24/eth0 - floating IP address (with netmask) that heartbeat brings up on the eth0 interface of the active node
  • drbddisk::srv - name of the drbd resource defined in global_common.conf (the one we initialized with the drbdadm create-md command)
  • Filesystem::/dev/drbd0::/srv::ext4 - device name, mount path and type of filesystem
  • tomcat.sh - script that will be executed when heartbeat starts and stops the resources (the script has to implement start and stop actions)
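
To see how such a line breaks down, here is a small sketch that splits it into the preferred node and the resource scripts with their "::"-separated arguments. show_haresources is a hypothetical helper written only for illustration; heartbeat itself does not ship it. Heartbeat starts the resources from left to right and stops them in the reverse order, which is why the IP address, the drbd disk and the filesystem come before tomcat.sh.

```shell
#!/bin/bash
# Sketch: print the parts of a haresources line read from stdin.
# show_haresources is a hypothetical helper for illustration only.

show_haresources() {
    awk '{
        print "node: " $1
        for (i = 2; i <= NF; i++) {
            # Each resource is "script" or "script::arg1::arg2..."
            n = split($i, part, "::")
            line = "script: " part[1]
            for (j = 2; j <= n; j++)
                line = line " arg" (j - 1) "=" part[j]
            print line
        }
    }'
}

# Typical use:
#   show_haresources < /etc/ha.d/haresources
```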

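Example tomcat.sh script (to run Tomcat you need a java package, which can be installed by running: apt-get install sun-java6-jdk). Heartbeat looks for resource scripts in /etc/ha.d/resource.d/ and /etc/init.d/, so place the script in one of them on both nodes and make it executable.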

#!/bin/bash
#
# tomcat
#
# chkconfig:
# description:  Start up the Tomcat servlet engine.

# Source function library.
#. /etc/init.d/functions
. /lib/lsb/init-functions

# Exit status reported back to heartbeat; updated by start() and stop()
RETVAL=0
CATALINA_HOME="/srv/tomcat"

function start(){
 if [ -f "$CATALINA_HOME/bin/startup.sh" ];
 then
  echo $"Starting Tomcat"
  "$CATALINA_HOME/bin/startup.sh"
  RETVAL=$?
 else
  echo $"Cannot find startup.sh script"
  RETVAL=1
 fi
}

function stop(){
 if [ -f "$CATALINA_HOME/bin/shutdown.sh" ];
 then
  echo $"Stopping Tomcat"
  "$CATALINA_HOME/bin/shutdown.sh"
  RETVAL=$?
 else
  # Nothing to stop when /srv is not mounted, so RETVAL is left unchanged
  echo $"Cannot find shutdown.sh script"
 fi
}

case "$1" in
 start)
        start
        ;;
 stop)
        stop
        ;;
 restart)
        stop
        start
        ;;
 *)
        echo $"Usage: $0 {start|stop|restart}"
        exit 1
        ;;
esac

exit $RETVAL
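
The script above handles only start, stop and restart. As far as I know, heartbeat may also call resource scripts with a status action and look for the word "running" or "OK" in the output, so it can be worth adding one. The sketch below is an assumption, not part of the original script: tomcat_status is a hypothetical function, and it assumes that Tomcat runs with -Dcatalina.home=/srv/tomcat on its command line, as in a default startup.sh run.

```shell
#!/bin/bash
# Sketch of a status action for tomcat.sh (hypothetical addition).
# Assumes the java process carries -Dcatalina.home=$CATALINA_HOME
# on its command line.
CATALINA_HOME="/srv/tomcat"

tomcat_status() {
    if pgrep -f -- "-Dcatalina.home=$CATALINA_HOME" > /dev/null; then
        echo "running"
        return 0
    fi
    echo "stopped"
    return 3    # LSB exit code for "program is not running"
}

# To wire it into the case statement of tomcat.sh, a branch like this
# could be added before the "*)" branch:
#   status)
#          tomcat_status
#          RETVAL=$?
#          ;;
```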

It's time to run the whole configuration. It can take around 120 seconds to start up both heartbeats (see the initdead value in ha.cf), so be patient. Watch the log files for any problems.

(both nodes) /etc/init.d/heartbeat start

If everything is fine, the /srv partition should be mounted and the tomcat process should be running on node1.

The command below shows the current drbd roles. The first value is the role of the node on which the command is run, the second one is the role of its peer:

(node1) drbdadm role srv
Primary/Secondary
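
In scripts it is handy to split that output into the local and the peer role. A minimal sketch using plain shell parameter expansion; local_role and peer_role are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: split "Local/Peer" role output of "drbdadm role" into its halves.
# local_role and peer_role are hypothetical helpers.

local_role() {
    # Everything before the first "/"
    printf '%s\n' "${1%%/*}"
}

peer_role() {
    # Everything after the first "/"
    printf '%s\n' "${1#*/}"
}

# Typical use on either node:
#   roles=$(drbdadm role srv)
#   [ "$(local_role "$roles")" = "Primary" ] && echo "this node is primary"
```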

Let's make some tests now

Before we simulate a node1 failure, check that the configuration is working properly.

First: if the /srv partition is mounted:

(node1) mount  | grep srv
/dev/drbd0 on /srv type ext4 (rw)

Second: if IP address is assigned to proper interface:

(node1) ip a s | grep eth0
2: eth0:  mtu 1500 qdisc pfifo_fast state UP qlen 1000
    inet 172.22.2.61/24 brd 172.22.2.255 scope global eth0
    inet 172.22.2.63/24 brd 172.22.2.255 scope global secondary eth0:0
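
The same check can be done without reading the output by eye. This is only a sketch: has_address is a hypothetical helper that looks for an exact "inet <address>/<prefix>" entry in ip output passed on stdin.

```shell
#!/bin/bash
# Sketch: succeed if the given address/prefix appears as an "inet" entry
# in "ip addr show" output read from stdin. has_address is hypothetical.

has_address() {
    # $1: address with prefix length, e.g. 172.22.2.63/24
    awk -v want="$1" '
        $1 == "inet" && $2 == want { found = 1 }
        END { exit !found }'
}

# Typical use on either node:
#   ip addr show eth0 | has_address 172.22.2.63/24 && echo "floating IP is here"
```

Comparing the whole second field avoids false matches such as 172.22.2.6 matching inside 172.22.2.63.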

Third: if tomcat is running:

(node1) ps aux | grep tomcat
root   2896  0.9 37.7 258892 99116 ?        Sl   Jan21  14:29 /usr/bin/java -Djava.util.logging.config.file=/srv/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -noverify -Djava.awt.headless=true -Xms64m -Xmx96m -Djava.endorsed.dirs=/srv/tomcat/endorsed -classpath :/srv/tomcat/bin/bootstrap.jar -Dcatalina.base=/srv/tomcat -Dcatalina.home=/srv/tomcat -Djava.io.tmpdir=/srv/tomcat/temp org.apache.catalina.startup.Bootstrap start

We are now ready to go. Shut down node1 or simply stop heartbeat on it, and observe the logs on node2. Within a few seconds the IP address will be assigned to the eth0 interface, the /srv partition will be mounted and tomcat will start. Watch pings to the 172.22.2.63 IP address during the switch. In the run below a few ICMP packets were lost (icmp_seq 7 and 8):

ping 172.22.2.63
PING 172.22.2.63 (172.22.2.63) 56(84) bytes of data.
64 bytes from 172.22.2.63: icmp_seq=1 ttl=64 time=0.204 ms
64 bytes from 172.22.2.63: icmp_seq=2 ttl=64 time=1.51 ms
64 bytes from 172.22.2.63: icmp_seq=3 ttl=64 time=0.155 ms
64 bytes from 172.22.2.63: icmp_seq=4 ttl=64 time=0.160 ms
64 bytes from 172.22.2.63: icmp_seq=5 ttl=64 time=0.251 ms
64 bytes from 172.22.2.63: icmp_seq=6 ttl=64 time=0.600 ms
64 bytes from 172.22.2.63: icmp_seq=9 ttl=64 time=0.267 ms
64 bytes from 172.22.2.63: icmp_seq=10 ttl=64 time=0.234 ms
64 bytes from 172.22.2.63: icmp_seq=11 ttl=64 time=0.416 ms
64 bytes from 172.22.2.63: icmp_seq=12 ttl=64 time=0.424 ms
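
To measure the gap instead of spotting it by eye, the missing sequence numbers can be pulled out of the ping output. A minimal sketch; missing_seqs is a hypothetical helper and it only sees gaps between the first and the last reply.

```shell
#!/bin/bash
# Sketch: print the icmp_seq numbers that are missing from ping output
# read on stdin. missing_seqs is a hypothetical helper.

missing_seqs() {
    # Keep only the sequence numbers, then print every number skipped
    # between two consecutive replies.
    sed -n 's/.*icmp_seq=\([0-9][0-9]*\).*/\1/p' |
    awk 'NR > 1 { for (s = prev + 1; s < $1; s++) print s }
         { prev = $1 }'
}

# Typical use while node1 is being stopped:
#   ping -c 30 172.22.2.63 | missing_seqs
```

With one ping per second, the number of lines printed is roughly the failover time in seconds as seen by a client.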

Start heartbeat on node1 again and the configuration should return to its previous state, because auto_failback is set to on.

A few hints for problems which can occur

Problem.1. The create-md operation has been refused because the partition already contains another file system.

drbdadm create-md srv
md_offset 10997067776
al_offset 10997035008
bm_offset 10996699136

Found ext3 filesystem which uses 10739328 kB
current configuration leaves usable 10738964 kB

Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
You need to either
   * use external meta data (recommended)
   * shrink that filesystem first
   * zero out the device (destroy the filesystem)
Operation refused.

Command 'drbdmeta /dev/drbd0 v08 /dev/md1 internal create-md' terminated with exit code 40
drbdadm aborting

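Zero out the beginning of the partition to destroy the old file system (all data on /dev/sda2 will be lost), then run drbdadm create-md srv again: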

dd if=/dev/zero of=/dev/sda2 bs=1M count=128

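Problem.2. Slow transfer during synchronization between nodes.

Raise the syncer rate in /etc/drbd.d/global_common.conf on both nodes, then apply the changed configuration: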

drbdadm adjust srv
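
To get a feeling for what a given rate means, the best-case duration of a full synchronization can be estimated from the device size. This is a rough sketch under two assumptions: the size is given in KiB (as in the drbdadm output above) and the rate is treated as MiB per second. sync_seconds is a hypothetical helper, and the real transfer can be slower because the rate is only an upper limit.

```shell
#!/bin/bash
# Sketch: estimate the minimum time of a full sync in seconds.
# sync_seconds is a hypothetical helper; integer arithmetic only.

sync_seconds() {
    # $1: device size in KiB, $2: syncer rate in MiB/s (assumed unit)
    echo $(( $1 / ($2 * 1024) ))
}

# Typical use for the partition from Problem.1 at "rate 10M":
#   sync_seconds 10738964 10
```

For the partition from Problem.1 at rate 10M this gives a bit more than 17 minutes, so on a link that can carry more it is worth raising the value.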