Pacemaker
Pacemaker does not fail over
node $id="10" db10 \
        attributes standby="off"
node $id="9" db09 \
        attributes standby="off"
primitive drbd_jenkins ocf:linbit:drbd \
        params drbd_resource="r0" \
        op start interval="0s" timeout="60s" \
        op stop interval="0s" timeout="60s"
primitive jenkins lsb:jenkins \
        op monitor interval="15s" \
        op start interval="0s" timeout="90s"
primitive mount_jenkins ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/var/lib/jenkins/" fstype="ext4" \
        op start timeout="20s" interval="0" \
        op stop timeout="20s" interval="0"
primitive vip-158 ocf:heartbeat:IPaddr2 \
        params ip="x.x.x.158" nic="eth0" cidr_netmask="28" \
        op start interval="0s" timeout="60s" \
        op monitor interval="5s" timeout="20s" \
        op stop interval="0s" timeout="60s" \
        meta target-role="Started"
group jenkins_group jenkins vip-158 mount_jenkins
ms ms_drbd_jenkins drbd_jenkins \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false" target-role="Master"
colocation drbd_mount inf: ms_drbd_jenkins:Master jenkins_group
order mount_after_drbd inf: ms_drbd_jenkins:promote jenkins_group:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.10-42f2063" \
        cluster-infrastructure="corosync" \
        stonith-enabled="false" \
        last-lrm-refresh="1489005751"
rsc_defaults $id="rsc-options" \
        resource-stickiness="0"
When Pacemaker starts, everything works fine:
root@db09:~# crm status
Last updated: Wed Mar 8 21:20:33 2017
Last change: Wed Mar 8 21:15:15 2017 via crm_resource on db10
Stack: corosync
Current DC: db10 (10) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
5 Resources configured
Online: [ db09 db10 ]
 Master/Slave Set: ms_drbd_jenkins [drbd_jenkins]
     Masters: [ db09 ]
     Slaves: [ db10 ]
 Resource Group: jenkins_group
     jenkins (lsb:jenkins): Started db09
     vip-158 (ocf::heartbeat:IPaddr2): Started db09
     mount_jenkins (ocf::heartbeat:Filesystem): Started db09
But I cannot move the master to db10, with either:
crm_resource --resource ms_drbd_jenkins --move --node db10
or
crm resource migrate ms_drbd_jenkins db10
Worst of all, if I put node db09 into standby, both nodes become slaves:
root@db09:~# crm node standby db09
root@db09:~# crm status
Last updated: Wed Mar 8 21:27:26 2017
Last change: Wed Mar 8 21:27:24 2017 via crm_attribute on db09
Stack: corosync
Current DC: db10 (10) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
5 Resources configured
Node db09 (9): standby
Online: [ db10 ]
 Master/Slave Set: ms_drbd_jenkins [drbd_jenkins]
     Slaves: [ db09 db10 ]
If I put db10 into standby instead, it is stopped there, which is expected:
root@db09:~# crm node standby db10
root@db09:~# crm status
Last updated: Wed Mar 8 21:28:45 2017
Last change: Wed Mar 8 21:28:44 2017 via crm_attribute on db09
Stack: corosync
Current DC: db10 (10) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
5 Resources configured
Node db10 (10): standby
Online: [ db09 ]
 Master/Slave Set: ms_drbd_jenkins [drbd_jenkins]
     Masters: [ db09 ]
     Stopped: [ db10 ]
 Resource Group: jenkins_group
     jenkins (lsb:jenkins): Started db09
     vip-158 (ocf::heartbeat:IPaddr2): Started db09
     mount_jenkins (ocf::heartbeat:Filesystem): Started db09
What am I doing wrong here?
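Your colocation constraint is wrong. You are telling the cluster that DRBD must be Master wherever jenkins_group is started. With the resources in that order, the DRBD Master role depends on where the group runs, instead of the group following the DRBD Master.
Use the following constraints instead: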
colocation cl_jenkins-with-drbd inf: jenkins_group ms_drbd_jenkins:Master
order o_drbd-before-jenkins inf: ms_drbd_jenkins:promote jenkins_group:start
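One way to apply the change from the command line might look like the sketch below. It assumes the old constraint IDs from the question (drbd_mount and mount_after_drbd) and that the earlier migrate attempts may have left a location constraint behind. The run wrapper, the apply_fix function, and the DRY_RUN variable are illustrative additions of mine, not crmsh features: by default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
# "run", "apply_fix" and DRY_RUN are illustrative helpers, not part of crmsh.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_fix() {
    # Remove the two constraints from the question's configuration.
    run crm configure delete drbd_mount mount_after_drbd
    # Recreate them with the group placed relative to the DRBD Master.
    run crm configure colocation cl_jenkins-with-drbd inf: jenkins_group ms_drbd_jenkins:Master
    run crm configure order o_drbd-before-jenkins inf: ms_drbd_jenkins:promote jenkins_group:start
    # Assumption: the migrate attempts left a location constraint behind.
    # If none exists, this last command may simply report an error.
    run crm resource unmigrate ms_drbd_jenkins
}

apply_fix
```

To check the result, repeat the standby test from the question (crm node standby db09, then crm status) and bring the node back afterwards with crm node online db09.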
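Pro tip: notice the "language" in the constraint names: cl____-with-____ and o____-before-____. It mirrors the order of the resource names after the inf: score. If you follow the with and before naming convention in your constraint names, they become much easier to read, manage, and troubleshoot.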
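To make the convention concrete, here is a small sketch that builds constraint lines whose IDs read the same way as their resource order. The colocation_line and order_line functions are hypothetical helpers of my own, not crmsh commands; they only print text that could then be pasted into crm configure.

```shell
#!/bin/sh
# Hypothetical helpers (not part of crmsh): print constraint lines whose IDs
# follow the cl_<a>-with-<b> and o_<a>-before-<b> convention.
# Arguments: short name A, short name B, resource spec A, resource spec B.
colocation_line() {
    printf 'colocation cl_%s-with-%s inf: %s %s\n' "$1" "$2" "$3" "$4"
}

order_line() {
    printf 'order o_%s-before-%s inf: %s %s\n' "$1" "$2" "$3" "$4"
}

# Reproduces the two constraints suggested in this answer.
colocation_line jenkins drbd jenkins_group ms_drbd_jenkins:Master
order_line drbd jenkins ms_drbd_jenkins:promote jenkins_group:start
```

Because the short names and the resource specs are passed in the same order, the ID and the constraint body cannot drift apart.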