Pacemaker

Pacemaker does not fail over

  • March 10, 2017
node $id="10" db10 \
   attributes standby="off"
node $id="9" db09 \
   attributes standby="off"
primitive drbd_jenkins ocf:linbit:drbd \
   params drbd_resource="r0" \
   op start interval="0s" timeout="60s" \
   op stop interval="0s" timeout="60s"
primitive jenkins lsb:jenkins \
   op monitor interval="15s" \
   op start interval="0s" timeout="90s"
primitive mount_jenkins ocf:heartbeat:Filesystem \
   params device="/dev/drbd0" directory="/var/lib/jenkins/" fstype="ext4" \
   op start timeout="20s" interval="0" \
   op stop timeout="20s" interval="0"
primitive vip-158 ocf:heartbeat:IPaddr2 \
   params ip="x.x.x.158" nic="eth0" cidr_netmask="28" \
   op start interval="0s" timeout="60s" \
   op monitor interval="5s" timeout="20s" \
   op stop interval="0s" timeout="60s" \
   meta target-role="Started"
group jenkins_group jenkins vip-158 mount_jenkins
ms ms_drbd_jenkins drbd_jenkins \
   meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false" target-role="Master"
colocation drbd_mount inf: ms_drbd_jenkins:Master jenkins_group
order mount_after_drbd inf: ms_drbd_jenkins:promote jenkins_group:start
property $id="cib-bootstrap-options" \
   dc-version="1.1.10-42f2063" \
   cluster-infrastructure="corosync" \
   stonith-enabled="false" \
   last-lrm-refresh="1489005751"
rsc_defaults $id="rsc-options" \
   resource-stickiness="0"

When Pacemaker starts, everything works fine:

root@db09:~# crm status
Last updated: Wed Mar  8 21:20:33 2017
Last change: Wed Mar  8 21:15:15 2017 via crm_resource on db10
Stack: corosync
Current DC: db10 (10) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
5 Resources configured



Online: [ db09 db10 ]

Master/Slave Set: ms_drbd_jenkins [drbd_jenkins]
    Masters: [ db09 ]
    Slaves: [ db10 ]
Resource Group: jenkins_group
    jenkins    (lsb:jenkins):  Started db09 
    vip-158    (ocf::heartbeat:IPaddr2):   Started db09 
    mount_jenkins  (ocf::heartbeat:Filesystem):    Started db09

But I cannot move the master to db10, with either:

crm_resource --resource ms_drbd_jenkins --move --node db10

or

crm resource migrate ms_drbd_jenkins db10

Worst of all, if I put node db09 into standby, both nodes become slaves:

root@db09:~# crm node standby db09
root@db09:~# crm status
Last updated: Wed Mar  8 21:27:26 2017
Last change: Wed Mar  8 21:27:24 2017 via crm_attribute on db09
Stack: corosync
Current DC: db10 (10) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
5 Resources configured


Node db09 (9): standby
Online: [ db10 ]

Master/Slave Set: ms_drbd_jenkins [drbd_jenkins]
    Slaves: [ db09 db10 ]

If db10 goes into standby, DRBD is stopped there, which is expected:

root@db09:~# crm node standby db10
root@db09:~# crm status
Last updated: Wed Mar  8 21:28:45 2017
Last change: Wed Mar  8 21:28:44 2017 via crm_attribute on db09
Stack: corosync
Current DC: db10 (10) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
5 Resources configured


Node db10 (10): standby
Online: [ db09 ]

Master/Slave Set: ms_drbd_jenkins [drbd_jenkins]
    Masters: [ db09 ]
    Stopped: [ db10 ]
Resource Group: jenkins_group
    jenkins    (lsb:jenkins):  Started db09 
    vip-158    (ocf::heartbeat:IPaddr2):   Started db09 
    mount_jenkins  (ocf::heartbeat:Filesystem):    Started db09 

What am I doing wrong here?

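Your colocation constraint is reversed. You are telling the cluster that DRBD must be Master wherever jenkins_group is started. In crm shell syntax the first resource named after the score is the dependent one and is placed relative to the second, so jenkins_group should come first.

Use the following constraints instead: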

colocation cl_jenkins-with-drbd inf: jenkins_group ms_drbd_jenkins:Master
order o_drbd-before-jenkins inf: ms_drbd_jenkins:promote jenkins_group:start

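Pro tip: note the "language" in the constraint names: cl____-with-____, o____-before-____. It matches the order of the resource names after the score (inf:). If you follow the with/before naming convention in your constraint names, they become much easier to read, manage, and troubleshoot.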
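
One possible way to apply the change from the command line is sketched below. It is not part of the original answer. It assumes the constraint IDs from the question (drbd_mount, mount_after_drbd) are still in place and that the earlier move/migrate attempts may have left a location constraint behind. The run helper and the DRY_RUN variable are hypothetical conveniences introduced here; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: swap the reversed constraints for the corrected ones.
# DRY_RUN is a hypothetical switch for this example. It defaults to 1, so the
# commands are only printed; set DRY_RUN=0 on a cluster node to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command, then execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

apply_fix() {
    # Remove the two original constraints from the question.
    run crm configure delete drbd_mount mount_after_drbd
    # Colocation: the dependent resource (jenkins_group) comes first.
    run crm configure colocation cl_jenkins-with-drbd inf: jenkins_group ms_drbd_jenkins:Master
    # Order: promote DRBD before starting the group.
    run crm configure order o_drbd-before-jenkins inf: ms_drbd_jenkins:promote jenkins_group:start
    # Clear any location constraint left by earlier move/migrate attempts.
    run crm resource unmigrate ms_drbd_jenkins
}

apply_fix
```

Each crm configure call here commits on its own, so the cluster briefly runs without the constraints. On a live cluster it may be safer to make the same changes in a single crm configure edit session so they are committed together. Afterwards, crm node standby db09 followed by crm status should show whether the Master and jenkins_group now move to db10.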

Source: https://unix.stackexchange.com/questions/350094