DRBD and NFS

Hey all. Sorry for the delay in posting this tutorial; I’ve been pretty busy but finally found some time to finish it. Enjoy :).

Well, as you may know, in previous posts (Post 1, Post 2) I’ve shown you how to install and configure DRBD in an active/passive configuration with automatic failover using Heartbeat. Now, I’m going to show you how to use that configuration to export the data stored on the DRBD device (make it available to other servers on the network) using NFS.

So, the first thing to do is to install NFS on both drbd1 and drbd2, stop the daemon, and remove it from the startup process (this means we prevent the NFS daemon from starting during the boot-up process, since Heartbeat will be the one starting it). We do this as follows on both servers:

:~$ sudo apt-get install nfs-kernel-server nfs-common
:~$ sudo /etc/init.d/nfs-kernel-server stop
:~$ sudo update-rc.d -f nfs-kernel-server remove
:~$ sudo update-rc.d -f nfs-common remove

Now, you may wonder how NFS is going to work. The NFS daemon will run only on the active (or primary) server, and this is going to be controlled by Heartbeat. But since NFS stores its state in /var/lib/nfs on each server, we have to make sure both servers have the same information there. If drbd1 goes down, drbd2 will take over, but its information in /var/lib/nfs will be different from drbd1’s, and this will stop NFS from working. So, to make both servers share the same /var/lib/nfs, we are going to move it onto the DRBD device and create a symbolic link in its place, so that the information gets stored on the DRBD device and not on the local disk. This way, both servers will see the same information. To do this, we do as follows on the primary server (drbd1):

sudo mv /var/lib/nfs/ /data/
sudo ln -s /data/nfs/ /var/lib/nfs
sudo mkdir /data/export
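
We also have to tell NFS what to export. /etc/exports lives in /etc (not on the DRBD device), so edit it on both servers so they behave the same after a failover. A minimal entry could look like this (I’m assuming the clients live on the same 172.16.0.0/24 network as the VIP; adjust the subnet and options to your setup):

/data/export 172.16.0.0/24(rw,sync,no_subtree_check)

The nfs-kernel-server init script exports whatever is listed there when Heartbeat starts it, so no extra step is needed.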

After that, since we have already moved the NFS lib files to the DRBD device on the primary server (drbd1), we have to remove them from the secondary server (drbd2) and create the link there as well:

sudo rm -rf /var/lib/nfs/
sudo ln -s /data/nfs/ /var/lib/nfs
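
If you want to double-check, /var/lib/nfs should now show up as a symbolic link pointing to /data/nfs/ on both servers:

ls -ld /var/lib/nfs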

Now, since Heartbeat is going to control the NFS daemon, we have to tell it to start nfs-kernel-server whenever it takes control of the resources. We do this in /etc/ha.d/haresources by adding nfs-kernel-server at the end. The file should look like this:

drbd1 IPaddr::172.16.0.135/24/eth0 drbddisk::testing Filesystem::/dev/drbd0::/data::ext3 nfs-kernel-server

Now that we’ve configured everything, we have to power off both servers, first the secondary and then the primary. Then we start the primary server, and during the boot-up process we’ll see a message that requires us to type “yes” (this is the same message shown during the installation of DRBD in my first post). After confirming, and if you have stonith configured, it is likely that drbd1 won’t promote its DRBD device, so it will remain secondary and won’t be able to mount it. This is because stonith is waiting for us to confirm that the other node really is down before it lets the service start (to see whether stonith is the problem, take a look at /var/log/ha-log). So, to confirm this, we do as follows on the primary server (drbd1):

sudo meatclient -c drbd2

After doing this, we have to confirm. After the confirmation, Heartbeat will take control, promote the DRBD device to primary, and start NFS. Then we can boot up the secondary server (drbd2). Enjoy :-).

Note: I made the choice of powering off both servers. You could just restart them, one at a time, and see what happens :).
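
Once NFS is running on the active node, any other server on the network can mount the export through the VIP. Something like this should work from a client (assuming nfs-common is installed there and that /mnt/data is just an arbitrary mount point):

sudo mount -t nfs 172.16.0.135:/data/export /mnt/data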

Installing DRBD on Hardy Part 2.

As you know, in a previous post I showed you how to install DRBD on Hardy Heron in an active/passive configuration. Now, I’m going to show you how to install and configure Heartbeat to automatically monitor this active/passive configuration and provide high availability. This means I’ll show you how to integrate DRBD into a simple Heartbeat V1 configuration, and as a plus, I’ll show you how to use the meatware software provided by STONITH.

To do this, we have to install Heartbeat and make changes in three files, which are /etc/ha.d/ha.cf, /etc/ha.d/haresources and /etc/ha.d/authkeys. First of all we install Heartbeat as follows, in both nodes:

sudo apt-get install heartbeat-2

After the installation is completed, the first file we need to configure, in both nodes, is /etc/ha.d/ha.cf as follows:

logfile /var/log/ha-log
keepalive 2
deadtime 30
udpport 695
bcast eth0
auto_failback off
node drbd1 drbd2

Note: Notice that the auto_failback option is set to off. This means that if drbd1 fails, drbd2 will take control over the service, and if drbd1 comes back online, the service will not fail back to it; drbd2 will remain the active node.

Now, as you know, this is an active/passive configuration, so we have to decide which node is going to be the primary and which the secondary for the Heartbeat configuration. (If you have followed my previous post, drbd1 is going to be our primary node, and drbd2 will be our secondary node.) We also have to decide where we are going to mount the DRBD resource in our filesystem, and which IP address is going to be used as the VIP (the Virtual IP is going to be used to access a service, or a DRBD resource, over the network, since we are going to use DRBD for NFS and/or MySQL).

So, assuming drbd1 is the primary node, the VIP is 172.16.0.135 and we are going to mount the DRBD resource in /data (so create the directory in both nodes), we edit /etc/ha.d/haresources as follows, in both nodes:

drbd1  IPaddr::172.16.0.135/24/eth0 drbddisk::testing Filesystem::/dev/drbd0::/data::ext3

Note: Notice that we are specifying the DRBD resource with the drbddisk option.

Then, we have to edit the /etc/ha.d/authkeys file, which is going to be used by Heartbeat to authenticate with the other node. So, we edit it as follows in both nodes:

auth 3
3 md5 DesiredPassword

Finally, we change the permissions of this last file as follows:

sudo chmod 600 /etc/ha.d/authkeys

Now that we have both nodes configured, I recommend powering off both nodes and booting the node we want as primary; in our case, that is drbd1. After booting up this node, we need to verify that Heartbeat has started the DRBD resource (we can see this with cat /proc/drbd) and mounted it on /data. If it has, start the secondary node and verify that it comes up as the secondary one.

If everything has gone right, try the failover process by powering off the primary node. You will notice that the node that had the DRBD resource as secondary is now the primary one and has taken control of the service. Also verify that the VIP address is working (it should appear as eth0:0 when issuing ifconfig).
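
In case it helps, these are the checks I mean, run on the node that should now be active:

cat /proc/drbd       # the resource should show as Primary on this node
mount | grep /data   # /dev/drbd0 should be mounted on /data
ifconfig eth0:0      # the VIP (172.16.0.135) should be configured here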

After verifying everything is working as expected, it is always advisable to have a fencing device to ensure data integrity. The fencing device will prevent a split-brain condition. A well-known fencing mechanism is STONITH (Shoot The Other Node In The Head). This mechanism will basically power off or reset a node that is supposed to be dead. This means that if drbd1 is supposed to be dead, drbd2 will take control of the service or, in this case, the DRBD resource. But if drbd1 is not actually dead and drbd2 tries to take control over the shared DRBD resource, a split-brain condition will occur. So STONITH ensures that drbd1 has really been reset or powered off before drbd2 takes control of the DRBD resource.

To do this, there is a stonith package that is used to work with STONITH/fencing devices in Heartbeat. But since we don’t have a real fencing device, we will use meatware. Meatware is a piece of software provided by STONITH that simulates a STONITH/fencing device by not allowing the secondary node (drbd2) to take control over the shared resources until there has been a confirmation that the primary node (drbd1) has been powered off or rebooted. This requires operator intervention. So, to integrate the meatware software with Heartbeat, we do as follows:

sudo apt-get install stonith

Then, we have to modify /etc/ha.d/ha.cf like this (in both nodes):

... [Output omitted]
auto_failback off
stonith_host drbd1 meatware drbd2
stonith_host drbd2 meatware drbd1

node drbd1 drbd2

Then we power off the primary node (in this case drbd1) and reboot the secondary node (drbd2). After the reboot, take a look at /var/log/ha-log until you see something like this:

[Screenshot: /var/log/ha-log output]

As you can see, STONITH logs a message saying that we need to confirm that drbd1 has been rebooted before drbd2 can take control of the service. If you take a look at /proc/drbd, you will see that the DRBD resource is still Secondary. So, we do as follows:

sudo meatclient -c drbd1

And we will see something like this:

[Screenshot: meatclient confirmation prompt]

Now, we confirm that it has been rebooted, and we can see that drbd2 takes control of the resource by promoting the DRBD resource to primary (cat /proc/drbd).

So we can now start drbd1 again. Every time the primary node fails, the secondary node will log a message in /var/log/ha-log asking us to confirm that the primary has been rebooted, so that the secondary can take control of the service and become the new primary.

In my next post I’ll cover how to make NFS use DRBD. Any comments, suggestions, etc., you know where to find me :).