DRBD

From UGCS

Revision as of 07:18, 10 June 2008

DRBD handles disk replication between the nodes. It works like RAID 1, but mirrored over a network.

Sometimes the nodes refuse to connect. If this happens, check the output of 'dmesg' for the message

drbd1: Split-Brain detected, dropping connection!
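
One way to look for that message is to filter the kernel log. This is only a minimal sketch: the helper name has_split_brain is made up for illustration, and the exact wording of the message may differ between DRBD versions.

```shell
# Hypothetical helper: succeeds if the text on stdin contains a DRBD
# split-brain message (case-insensitive match).
has_split_brain() {
    grep -qi 'split-brain detected'
}

# Typical use on a node (reading the kernel log may need root):
#   dmesg | has_split_brain && echo "split-brain reported"
```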

This means that at some point both nodes thought they were primary. Since that can corrupt the filesystem, DRBD drops the connection and waits for a human to sort it out. The usual fix is to run

drbdadm invalidate <resource>

on the host that doesn't have the data you want (usually the backup one). That host's copy will be overwritten. You will then be able to reconnect the nodes, and they will resync.
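
The whole sequence can be sketched as below. This is an illustration under assumptions, not a tested procedure: the helper sb_recovery_cmds and the resource name r0 are made up, and the helper only prints the commands so they can be reviewed and then run by hand on the host whose data is being thrown away.

```shell
# Hypothetical helper: print the recovery commands for one resource.
# Nothing is executed; run the printed commands yourself on the host
# whose copy you want to discard.
sb_recovery_cmds() {
    res="$1"
    echo "drbdadm invalidate $res"   # mark the local copy as inconsistent
    echo "drbdadm connect $res"      # reconnect; a resync from the peer follows
    echo "cat /proc/drbd"            # check connection state and resync progress
}

sb_recovery_cmds r0   # "r0" is a placeholder resource name
```

If the other node has also dropped to standalone, it may need its own 'drbdadm connect <resource>' before the two will talk again.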

If the DRBD connection is on the same link as the heartbeat, you will always get a split-brain when the network cable is pulled: the backup loses the heartbeat and takes over as primary while the original primary is still running. This is why we have automatic split-brain recovery enabled. The node that was most recently primary is considered authoritative. You can specify this with

net {
    after-sb-0pri discard-older-primary;
    after-sb-1pri consensus;
}

in drbd.conf.

After you change the config file, you can apply the changes to the running node with

drbdadm adjust <resource>

You can then check that it worked (or see the other options, if you're curious) with the following, replacing # with the device number:

drbdsetup /dev/drbd# show

Here are my preferred network options:

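net {
    after-sb-0pri discard-older-primary;  # no primary now: keep the most recent primary's data
    after-sb-1pri consensus;              # one primary now: discard the secondary only if the 0pri policy agrees
    always-asbp;                          # apply these policies whenever the data sets are related
    timeout 30;                           # tenths of a second, so 3 seconds
    connect-int 5;                        # seconds between connection attempts
    ping-int 5;                           # seconds between keep-alive pings
}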

Jdhutchin@ugcs.caltech.edu 22:08, 7 June 2008 (PDT)
