Pacemaker

Last updated: 2021-03-24

Introduction

Pacemaker 1.1 with Corosync 2.x

 

Table of Contents

  • Pacemaker Architecture
  • Install Pacemaker
  • Prepare Cluster Members
  • Config Pacemaker Auth
  • Form the Cluster
  • Two Node Cluster
  • CLI - pcs
  • Resource(Service)
  • Failover Test
  • Stickiness
  • List Resource Providers(Agent)
  • Resource Monitoring Operations
  • Resource - IPaddr2
  • Pacemaker with Apache
  • Move Resources Manually
  • Active-Passive drbd (clone)
  • MySQL with Pacemaker
  • Update Cluster Setting
  • Resource in manage mode
  • Failcount
  • Active-Active drbd (incomplete)
  • Clone the IP address
  • Tools: crm_mon
  • Sync configs under /etc
  • Stonithd

 


Pacemaker Architecture

 

Pacemaker itself is composed of five key components:

  • Cluster Information Base (CIB)
  • Cluster Resource Management daemon (CRMd)
  • Local Resource Management daemon (LRMd)
  • Policy Engine (PEngine or PE)
  • Fencing daemon (STONITHd)

Diagram

       PE
       |  /-> STONITHd
CIB - CRMd -> LRMd
----------------------
     Corosync

Corosync

Corosync Cluster Engine = Group Communication System

Features

  • A simple availability manager that restarts the application process when it has failed
  • A quorum system that notifies applications when quorum is achieved or lost
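
For example, once a cluster is up (set up later in this note), the quorum state Corosync tracks can be checked directly:

# Show votequorum information (nodes, expected votes, quorum state)

corosync-quorumtool -s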

CIB(XML)

The CIB uses XML to represent both the cluster’s configuration and current state of all resources in the cluster.

The contents of the CIB are automatically kept in sync across the entire cluster and

  are used by the PEngine to compute the ideal state of the cluster and how it should be achieved.
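
Once the cluster is running, the raw XML can be dumped directly (the same command appears later under "CLI - pcs"):

# Dump the live CIB as XML

pcs cluster cib | head -n 15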

---

This list of instructions is then fed to the Designated Controller (DC).

Pacemaker centralizes all cluster decision making by electing one of the CRMd instances to act as a master.

Should the elected CRMd process (or the node it is on) fail, a new one is quickly established.

---

The DC carries out the PEngine’s instructions in the required order

  by passing them to either the Local Resource Management daemon (LRMd) or

CRMd peers on other nodes via the cluster messaging infrastructure (which in turn passes them on to their LRMd process).

---

The peer nodes all report the results of their operations back to the DC and, based on the expected and actual results,

will either execute any actions that needed to wait for the previous one to complete,

or abort processing and ask the PEngine to recalculate the ideal cluster state based on the unexpected results.

---

In some cases, it may be necessary to power off nodes in order to protect shared data or complete resource recovery.

For this, Pacemaker comes with STONITHd.

STONITH

Shoot-The-Other-Node-In-The-Head (usually implemented with a remote power switch)

Types of Pacemaker Clusters

  • Active/Active (e.g. GFS2)
  • Active/Passive (e.g. DRBD)

Doc

# Clusters_from_Scratch

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/

#redhat doc

https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_overview/index

# a reference document

http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/

# pacemaker_remote

https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Remote/

 


Install Pacemaker

 

CentOS 7

yum install pacemaker

# pcs for cluster management. Alternatives: crmsh

# pcs is a corosync and pacemaker configuration tool.

# In the dark past, configuring Pacemaker required the administrator to read and write XML

# ruby daemon program (pcs <->pcsd)

yum install pcs

pcs --version

0.9.169

# Other Tools

yum install psmisc -y

Start pcsd Service

systemctl start pcsd

systemctl enable pcsd

ps aux | grep pcsd

Do Not Auto-Start Pacemaker Services at Boot

We are not enabling the corosync and pacemaker services to auto start at boot

=> Gives you the opportunity to do a post-mortem investigation
      of a node failure before returning it to the cluster.

Remark: if a cluster node fails or is rebooted, start the cluster on it manually

# On the node that failed or was rebooted

pcs cluster start

Log

  • /var/log/pacemaker.log
  • /var/log/cluster/corosync.log
  • /var/log/messages

 


Prepare Cluster Members

 

[1] Convenient SSH between nodes

/etc/hosts

# MySetting
192.168.88.31 cs1
192.168.88.32 cs2

[2] Passwordless SSH keys

ssh-keygen -t dsa -f ~/.ssh/id_dsa -N ""

cp ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys

scp -r ~/.ssh cs2:

ssh cs2 -- uname -n

[3] Firewall

# 2224/TCP (pcsd), 3121/TCP (pacemaker_remote), 21064/TCP (dlm), and 5405/UDP (corosync)

firewall-cmd --permanent --add-service=high-availability

firewall-cmd --reload

 


Config Pacemaker Auth

 

Config user hacluster password

# The installed packages will create a hacluster user with a disabled password.

# (While this is fine for running pcs commands locally)

# The account needs a login password in order to perform such tasks as

# syncing the corosync configuration, or starting and stopping the cluster on other nodes.

passwd hacluster # on cs1

ssh cs2 -- 'echo mysupersecretpassword | passwd --stdin hacluster'

# Authenticate pcs to pcsd on nodes specified

# (authorization tokens are stored in /var/lib/pcsd/tokens for root)

pcs cluster auth cs1 cs2

Username: hacluster
Password:
cs1: Authorized
Error: Unable to communicate with cs2   # firewall / service issue
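
If a node cannot be reached here, the usual cause is that pcsd is not running or port 2224/TCP is blocked on that node; both fixes were covered above:

# On cs2

systemctl start pcsd

firewall-cmd --permanent --add-service=high-availability && firewall-cmd --reload

pcs cluster auth cs1 cs2      # then retry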

 


Form the Cluster

 

pcs cluster setup --name mycluster cs1 cs2

Destroying cluster on nodes: cs1, cs2...
cs1: Stopping Cluster (pacemaker)...
cs2: Stopping Cluster (pacemaker)...
cs1: Successfully destroyed cluster
cs2: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'cs1', 'cs2'
cs1: successful distribution of the file 'pacemaker_remote authkey'
cs2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
cs1: Succeeded
cs2: Succeeded

Synchronizing pcsd certificates on nodes cs1, cs2...
cs1: Success
cs2: Success
Restarting pcsd on the nodes in order to reload the certificates...
cs1: Success
cs2: Success

pcs status

Error: cluster is not currently running on this node

pcs cluster start --all

cs1: Starting Cluster (corosync)...
cs2: Starting Cluster (corosync)...
cs1: Starting Cluster (pacemaker)...
cs2: Starting Cluster (pacemaker)...

Remark: start on a single node

# On cs2

# Equivalent to systemctl start corosync.service

# followed by systemctl start pacemaker.service

pcs cluster start

# Verify Corosync Installation

pcs status

Cluster name: mycluster

WARNINGS:
No stonith devices and stonith-enabled is not false

Stack: corosync
Current DC: cs2 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Wed Mar 24 03:53:27 2021
Last change: Wed Mar 24 03:50:53 2021 by hacluster via crmd on cs2

2 nodes configured
0 resource instances configured

Online: [ cs1 cs2 ]

No resources


Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

# Check the communication

corosync-cfgtool -s

Printing ring status.
Local node ID 1
RING ID 0
        id      = 192.168.88.31
        status  = ring 0 active with no faults

# Check the membership

pcs status corosync

Membership information
----------------------
    Nodeid      Votes Name
         1          1 cs1 (local)
         2          1 cs2

corosync-cmapctl | grep members

runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.88.31)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.88.32)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined

# Check the stack

ps axf

 2114 ?        Ssl    0:00 /usr/bin/ruby /usr/lib/pcsd/pcsd
 2210 ?        SLsl   0:04 corosync
 2231 ?        Ss     0:00 /usr/sbin/pacemakerd -f
 2232 ?        Ss     0:00  \_ /usr/libexec/pacemaker/cib
 2233 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd
 2234 ?        Ss     0:00  \_ /usr/libexec/pacemaker/lrmd
 2235 ?        Ss     0:00  \_ /usr/libexec/pacemaker/attrd
 2236 ?        Ss     0:00  \_ /usr/libexec/pacemaker/pengine
 2237 ?        Ss     0:00  \_ /usr/libexec/pacemaker/crmd

 


Two Node Cluster

 

To disable STONITH

  * disabling it is completely inappropriate for a production cluster

In order to guarantee the safety of your data,

  fencing (also called STONITH) is enabled by default.

Resource start-up disabled since no STONITH resources have been defined

# It tells the cluster to simply pretend that failed nodes are safely powered off

pcs property set stonith-enabled=false

After shutting down both nodes

# On any node

pcs cluster start --all

P.S.

"pcs cluster start" 係 start 唔到 service 的, 因為 "partition WITHOUT quorum"

"partition WITHOUT quorum"

partition:

If a cluster splits into two (or more) groups of nodes that can no longer communicate with each other

crm_mon

Stack: corosync
Current DC: drbd-a (version 1.1.23-1.el7_9.1-9acf116022) - partition WITHOUT quorum
Last updated: Sun Feb 28 23:11:27 2021
Last change: Sun Feb 28 23:04:21 2021 by root via cibadmin on drbd-a

2 nodes configured
5 resource instances configured

Online: [ drbd-a ]
OFFLINE: [ drbd-b ]

No active resources

Pacemaker's default behavior is to stop all resources if the cluster does not have quorum.

A partition has quorum when total_nodes < 2 * active_nodes; here 2 < 2*1 does not hold, so quorum is lost.

Two-node clusters are a special case.

corosync has the ability to treat two-node clusters as if only one node is required for quorum.

The "pcs cluster setup" command will automatically configure "two_node: 1" in corosync.conf,

    so a two-node cluster will "just work".
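
For reference, the quorum section that "pcs cluster setup" writes into /etc/corosync/corosync.conf for two nodes typically looks like this (exact contents vary by version):

quorum {
    provider: corosync_votequorum
    two_node: 1
}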

# In particular, we can tell the cluster to simply ignore quorum altogether.

crm configure property no-quorum-policy=ignore

crm configure show
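
The same setting with pcs (the tool used elsewhere in this note):

pcs property set no-quorum-policy=ignore

pcs property list      # verify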

 


CLI - pcs

 

# help

pcs status help

# Check which features pacemakerd supports

pacemakerd --features

Pacemaker 1.1.23-1.el7_9.1 (Build: 9acf116022)
 Supporting v3.0.14:  generated-manpages agent-manpages ncurses libqb-logging ...

# dump raw cluster configuration xml

pcs cluster cib

# validity of the configuration

crm_verify -L -V

 


Resource(Service)

 

Add a Resource

e.g. ClusterIP

pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
    ip=192.168.88.33 cidr_netmask=24

pcs status

Cluster name: mycluster
Stack: corosync
Current DC: cs2 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Wed Mar 24 04:18:18 2021
Last change: Wed Mar 24 04:18:11 2021 by root via cibadmin on cs1

2 nodes configured
1 resource instance configured

Online: [ cs1 cs2 ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started cs1

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

Remove a resource

pcs resource delete ClusterIP

Attempting to stop: ClusterIP... Stopped

show

pcs resource show

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started cs1

pcs resource show ClusterIP

 Resource: ClusterIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: cidr_netmask=24 ip=192.168.88.33
  Operations: monitor interval=10s timeout=20s (ClusterIP-monitor-interval-10s)
              start interval=0s timeout=20s (ClusterIP-start-interval-0s)
              stop interval=0s timeout=20s (ClusterIP-stop-interval-0s)

Other

describe [<standard>:[<provider>:]]<type> [--full]

enable <resource id>... [--wait[=n]]

disable <resource id>... [--safe [--no-strict]] [--simulate [--brief]] [--wait[=n]]

safe-disable <resource id>... [--no-strict] [--simulate [--brief]] [--wait[=n]] [--force]

Attempt to stop the resources if they are running and forbid the cluster from starting them again.

restart <resource id> [node] [--wait=n]

Restart the resource specified.

If a node is specified and if the resource is a clone or master/slave it will be restarted only on the node specified.

service httpd restart               # Wrong !!

pcs resource restart WebSite   # Correct

 


Failover Test

 

Test by stopping the cluster service on a node

# can be run from any node in the cluster

pcs cluster stop cs2

pcs status

Cluster name: mycluster
Stack: corosync
Current DC: cs1 (version 1.1.23-1.el7_9.1-9acf116022) - partition with quorum
Last updated: Wed Mar 24 04:22:01 2021
Last change: Wed Mar 24 04:18:11 2021 by root via cibadmin on cs1

2 nodes configured
1 resource instance configured

Online: [ cs1 ]
OFFLINE: [ cs2 ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started cs1

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

Notice

cs2 is OFFLINE for cluster purposes, but its pcsd is still active (ps aux | grep pcsd),

so it can still receive pcs commands, even though it is no longer participating in the cluster

 

Test by putting the node into standby mode

Standby mode:

the node continues to run corosync and pacemaker but is not allowed to run resources

pcs cluster standby cs1

pcs status

pcs cluster unstandby cs1

Test Failover - arping

arping 192.168.88.33

ARPING 192.168.88.33
60 bytes from 52:54:31:34:88:31 (192.168.88.33): index=0 time=30.763 usec

 


Stickiness

 

Prevent Resources from Moving after Recovery

By default, Pacemaker assumes there is zero cost associated with moving resources and will do so to achieve "optimal" resource placement.

pcs resource defaults

No defaults set

pcs resource defaults resource-stickiness=100

Remove stickiness

e.g. disabling stickiness for the IP address resource (active-active)

pcs resource meta ClusterIP resource-stickiness=0

 


List Resource Providers(Agent)

 

Every primitive resource has a resource agent.

The cluster doesn’t need to understand how the resource works
 because it relies on the resource agent to do the right thing
 when given a start, stop or monitor command.

# To obtain a list of the available resource standards

pcs resource standards

lsb       # Linux Standard Base (/etc/init.d)
ocf       # Open Cluster Framework
service
systemd

# To obtain a list of the available OCF resource providers

pcs resource providers

heartbeat
linbit
openstack
pacemaker

# all the resource agents available for a specific OCF provider

pcs resource agents ocf:heartbeat

...
IPaddr
IPaddr2
...

See what settings an agent supports

describe [<standard>:[<provider>:]]<type> [--full]

e.g.

pcs resource describe ocf:heartbeat:mysql | less

 


Resource Monitoring Operations

 

Properties of an Operation

on-fail

  • restart: Stop the resource and start it again (possibly on a different node).
  • ignore: Pretend the resource did not fail.
  • stop: Stop the resource and do not start it elsewhere.
  • fence: STONITH the node on which the resource failed.
  • standby: Move all resources away from the node on which the resource failed.
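
A hedged example with pcs (using the WebSite resource configured later in this note): on-fail is set as an option of the monitor operation, e.g. to fence the node when the monitor fails:

pcs resource update WebSite op monitor interval=1min on-fail=fence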

Disabling a Monitor Operation

cibadmin --modify --xml-text '<op id="public-ip-check" enabled="false"/>'

 


Resource - IPaddr2

 

pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
    ip=192.168.88.33 cidr_netmask=32 nic=eth0 op monitor interval=30s

ocf:heartbeat:IPaddr2

  • The first field is the standard to which the resource script conforms and where to find it.
  • The second field is standard-specific; for OCF resources, it tells the cluster which OCF namespace the resource script is in.
  • The third field (IPaddr2 in this case) is the name of the resource script.

op monitor interval=30s

op = Operation

monitor = action

tells the cluster to check whether the resource is running every 30 seconds

If the monitor action fails for a resource whose interval is set to a nonzero value and whose role is set to Master,

the resource will be demoted but will not be fully stopped.

 


Pacemaker with Apache

 

# Install

yum install -y httpd wget

# Firewall for http service

firewall-cmd --permanent --add-service=http

firewall-cmd --reload

# Testing Page

cat <<-END >/var/www/html/index.html
 <html>
 <body>My Test Site - $(hostname)</body>
 </html>
END

# Enable the Apache status URL (required)

cat <<-END >/etc/httpd/conf.d/status.conf
 <Location /server-status>
    SetHandler server-status
    Require local
 </Location>
END

# Configure the Cluster

pcs resource create WebSite ocf:heartbeat:apache  \
      configfile=/etc/httpd/conf/httpd.conf \
      statusurl="http://localhost/server-status" \
      op monitor interval=1min

# Start/stop timeout for the service

By default, the operation timeout for all resources' start, stop, and monitor operations is 20 seconds.

pcs resource op defaults

No defaults set

# Allow a longer timeout

pcs resource op defaults timeout=240s

# Checking Apache status

# On cs1 & cs2

wget -O - http://localhost/server-status

 * If the WebSite resource fails to start, you've likely not enabled the status URL correctly

Colocation Constraint

# Ensure Resources Run on the Same Host

We use a colocation constraint that indicates it is mandatory for WebSite to run on the same node as ClusterIP.

The "mandatory" part of the colocation constraint is indicated by using a score of INFINITY.

The INFINITY score also means that if ClusterIP is not active anywhere, WebSite will not be permitted to run.

Colocation constraints are "directional": here WebSite is placed relative to ClusterIP (WebSite -> ClusterIP).

pcs resource show

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started cs1
 WebSite        (ocf::heartbeat:apache):        Started cs2

pcs constraint colocation add WebSite with ClusterIP INFINITY

pcs constraint

Location Constraints:
Ordering Constraints:
Colocation Constraints:
  WebSite with ClusterIP (score:INFINITY)
Ticket Constraints:

Ensure Resources Start and Stop in Order

pcs constraint order ClusterIP then WebSite

Adding ClusterIP WebSite (kind: Mandatory) (Options: first-action=start then-action=start)

pcs constraint

Location Constraints:
Ordering Constraints:
  start ClusterIP then start WebSite (kind:Mandatory)
Colocation Constraints:
  WebSite with ClusterIP (score:INFINITY)
Ticket Constraints:

Prefer One Node Over Another

pcs constraint location WebSite prefers cs1=50

pcs constraint

Location Constraints:
  Resource: WebSite
    Enabled on: cs1 (score:50)
Ordering Constraints:
  start ClusterIP then start WebSite (kind:Mandatory)
Colocation Constraints:
  WebSite with ClusterIP (score:INFINITY)
Ticket Constraints:

  * if the prefers score < stickiness, the resource will not move

# To see the current placement scores

crm_simulate -sL

Current cluster status:
Online: [ cs1 cs2 ]

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started cs1
 WebSite        (ocf::heartbeat:apache):        Started cs1

Allocation scores:
pcmk__native_allocate: ClusterIP allocation score on cs1: 50
pcmk__native_allocate: ClusterIP allocation score on cs2: 0
pcmk__native_allocate: WebSite allocation score on cs1: 50
pcmk__native_allocate: WebSite allocation score on cs2: -INFINITY

Transition Summary:

# To remove the constraint with the score of 50

1) get the constraint’s ID

pcs constraint --full

2) remove by ID

pcs constraint remove <ID>

 


Move Resources Manually

 

# Move to the other node

pcs resource move WebSite

OR

pcs resource move ClusterIP

Explanation

move <resource id> [destination node] [--master] [lifetime=<lifetime>] [--wait[=n]]

# by creating a temporary -INFINITY location constraint to ban the node.

# If destination node is specified the resource will be moved to that node by

# creating an INFINITY location constraint to prefer the destination node.

# After the resource has successfully moved to cs2

pcs resource clear WebSite

Explanation

clear <resource id> [node] [--master] [--expired] [--wait[=n]]

Removes all temporary constraints previously created by "pcs resource move" or "pcs resource ban"

on the specified resource (and node if specified).

 


Active-Passive drbd (clone)

 

1) Set up the DRBD resource r0

2) Let Pacemaker manage DRBD start / stop

systemctl disable drbd

echo drbd >/etc/modules-load.d/drbd.conf

reboot

lsmod | grep drbd    # output should contain drbd

3) Configure the Cluster for the DRBD device

pcs has the ability to queue up several changes into a file and commit those changes all at once.

To do this, start by populating the file with the current raw XML config from the CIB.

# This creates drbd_cfg in the current folder

pcs cluster cib drbd_cfg

# Check what was changed

pcs -f drbd_cfg resource show

pcs -f drbd_cfg constraint

# Commit

pcs cluster cib-push drbd_cfg --config

4) Configure the DRBD resource

pcs -f drbd_cfg resource create WebData ocf:linbit:drbd \
         drbd_resource=r0 op monitor interval=60s

# create a master/slave (multi-state) clone so the resource can run on both nodes at the same time

pcs -f drbd_cfg resource master WebDataClone WebData

Or

pcs -f drbd_cfg resource master WebDataClone WebData \
         master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 \
         notify=true

Explanation

  • master
    master name-of-the-clone previously-created-resource
    Configure a resource or group as a multi-state (master/slave) resource.
  • master-max
    How many copies of the resource can be promoted to master status
    Default: 1
  • master-node-max
    How many copies of the resource can be promoted to master status on a single node
    Default 1
  • clone-max
    How many copies of the resource to start
    Default: number of nodes in cluster
  • clone-node-max
    How many copies of the resource can be started on a single node
    Default: 1
  • notify
    When stopping or starting a copy of the clone,
    tell all the other copies beforehand and again when the action was successful.
    Setting 'notify=true' usually indicates the resource is a master/slave instance
    Default: false

After you are satisfied with all the changes, you can commit them all at once

    by pushing the drbd_cfg file into the live CIB.

# --config is the same as scope=configuration. Use of --config is recommended.

# Do not specify a scope if you need to push the whole CIB or be warned in the case of outdated CIB.

pcs cluster cib-push drbd_cfg --config

CIB updated

pcs status

5) Configure the mount point

mkdir /home/cluster

touch /home/cluster/mountpoint

pcs cluster cib fs_cfg

# Configure WebFS (mount point)

pcs -f fs_cfg resource create WebFS Filesystem \
        device="/dev/drbd0" directory="/home/cluster/default" fstype="xfs"

pcs -f fs_cfg constraint colocation add \
        WebFS with WebDataClone INFINITY with-rsc-role=Master

Explanation

  • with-rsc-role
    Multi-state Constraints, specifies the role that with-rsc must be in
    Allowed values: Started(Default), Master, Slave

pcs -f fs_cfg constraint order \
        promote WebDataClone then start WebFS

Explanation

  • promote
    Promote the resource from a slave resource to a master resource

# WebFS before WebSite

pcs -f fs_cfg constraint colocation add WebSite with WebFS INFINITY

pcs -f fs_cfg constraint order WebFS then WebSite

# Show

pcs -f fs_cfg constraint

# Commit

pcs cluster cib-push fs_cfg --config

# Check

pcs resource show

 


MySQL with Pacemaker

 

pcs cluster cib db_cfg

pcs -f db_cfg resource create MySQL ocf:heartbeat:mysql \
  datadir="/home/cluster/mysql"

pcs -f db_cfg constraint colocation add MySQL with WebFS INFINITY

pcs -f db_cfg constraint order WebFS then MySQL

pcs -f db_cfg constraint

pcs cluster cib-push db_cfg --config

 


Update Cluster Setting

 

pcs resource disable WebSite
pcs resource disable WebFS

pcs resource update WebFS Filesystem directory="/home/cluster"

pcs resource enable WebFS
pcs resource enable WebSite

 


Resource in manage mode

 

unmanaged mode:

the resource stays in the configuration, but Pacemaker does not manage it.

pcs resource unmanage resource1  [resource2] ...

pcs resource manage resource1  [resource2] ...

 


Failcount

 

pcs resource failcount show

 

pcs resource failcount reset
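
Both commands also accept a resource id (and optionally a node), e.g. for the WebSite resource used earlier:

pcs resource failcount show WebSite

pcs resource failcount reset WebSite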

 


Active-Active drbd

 

Configure the Cluster for the DLM

pcs cluster cib dlm_cfg

pcs -f dlm_cfg resource create dlm \
        ocf:pacemaker:controld op monitor interval=60s

pcs -f dlm_cfg resource clone dlm clone-max=2 clone-node-max=1

pcs -f dlm_cfg resource show

pcs cluster cib-push dlm_cfg --config

Reconfigure the Cluster for GFS2

# fstype option needs to be updated

pcs resource show WebFS

pcs resource update WebFS fstype=gfs2

# GFS2 requires that DLM be running

pcs constraint colocation add WebFS with dlm-clone INFINITY

pcs constraint order dlm-clone then WebFS

drbd

DRBD runs as a background service in a Pacemaker cluster, so auto-promote is used

common {
  options {
    # DRBD automatically sets itself Primary when needed.
    auto-promote yes;
    ...
  }
}

Using resource-level fencing in Pacemaker clusters

By the Cluster Information Base

If the DRBD replication link becomes disconnected, the crm-fence-peer.9.sh script contacts the cluster manager,

determines the Pacemaker Master/Slave resource associated with this DRBD resource,

and ensures that the Master/Slave resource no longer gets promoted on any node other than the currently active one.

resource <resource> {
  net {
    fencing resource-only;
    ...
  }
  handlers {
    fence-peer "/usr/lib/drbd/crm-fence-peer.9.sh";
    unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
    ...
  }
  ...
}

By Resource-level fencing with dopd(DRBD outdate-peer daemon)

/etc/ha.d/ha.cf

respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster

/etc/init.d/heartbeat reload

ps ax | grep dopd

Once dopd is running, add these items to your DRBD resource configuration:

resource <resource> {
    handlers {
        fence-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
        ...
    }
    net {
        fencing resource-only;
        ...
    }
    ...
}

Testing dopd functionality

unplugging the network link

After this, you will be able to observe the resource connection state change from Connected to Connecting.
Allow a few seconds to pass, and you should see the disk state become Outdated/DUnknown.
That is what dopd is responsible for.

 


Clone the IP address

 

It will utilize a multicast MAC address to have the local switch send the relevant packets to all nodes in the cluster,

together with iptables clusterip rules on the nodes so that any given packet will be grabbed by exactly one node.
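
A sketch of how this is typically set up for the ClusterIP resource created earlier (clusterip_hash is a parameter of the IPaddr2 agent; check pcs resource describe ocf:heartbeat:IPaddr2 before relying on it):

# Hash client packets by source IP, then run two globally unique clone instances

pcs resource update ClusterIP clusterip_hash=sourceip

pcs resource clone ClusterIP clone-max=2 clone-node-max=2 globally-unique=true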

 


Tools: crm_mon

 

crm_mon - Provides a summary of the cluster's current state.

-1, --one-shot            # Display the cluster status once on the console and exit

-R, --show-detail        # Show more details (node IDs, individual clone instances)

-b, --brief                  # Brief output

-s, --simple-status       # Display the cluster status once as a simple one line output (suitable for nagios)

-d, --daemonize           # Run in the background as a daemon

-p, --pid-file=value      # (Advanced) Daemon pid file location
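
Typical usage, e.g. a one-shot status with extra detail:

crm_mon -1 -R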

 


Sync configs under /etc

 

e.g.

# On cs1

rsync -av /etc/httpd cs2:/etc

rsync -av /etc/ssh cs2:/etc          # avoid host-key warnings when ssh'ing to the ClusterIP

 


Stonithd

 

STONITH (Shoot The Other Node In The Head aka. fencing)

The only way to be 100% sure that your data is safe is to
ensure that the node is truly offline before allowing the data to be accessed from another node.

If the device does not know how to fence nodes based on their uname,
you may also need to set the special pcmk_host_map parameter.

If the device does not support the list command,
you may also need to set the special pcmk_host_list and/or pcmk_host_check parameters.

If the device does not expect the victim to be specified with the port parameter,
you may also need to set the special pcmk_host_argument parameter.
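
A hypothetical example combining these parameters for a fence_xvm device, where the libvirt domain name (vm-cs1) differs from the cluster node name (cs1); the names here are illustrative only:

# pcmk_host_map maps "nodename:portname"; port is the libvirt domain to fence

pcs stonith create fence_cs1 fence_xvm port="vm-cs1" pcmk_host_map="cs1:vm-cs1" pcmk_host_list="cs1"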

Help:

man stonithd

Package:

yum search fence-

# HW

  • fence-agents-ipmilan.x86_64 : Fence agent for devices with IPMI interface
  • fence-agents-ilo-ssh.x86_64 : Fence agent for HP iLO devices via SSH

# VM (libvirt)

  • fence-virt    # fence-virt(VM) -> fence-virtd(Hypervisor)[multicast/tcp] -> fence-virtd-libvirt
  • fence-agents-vmware-rest.x86_64 : Fence agent for VMWare with REST API

# Power(UPS)

  • fence-agents-apc.x86_64 : Fence agent for APC devices
  • fence-agents-apc-snmp.x86_64 : Fence agent for APC devices (SNMP)

# Cloud

  • fence-agents-aws.x86_64 : Fence agent for Amazon AWS
  • fence-agents-gce.x86_64 : Fence agent for GCE (Google Cloud Engine)
  • fence-agents-aliyun.x86_64 : Fence agent for Alibaba Cloud (Aliyun)
  • fence-agents-azure-arm.x86_64 : Fence agent for Azure Resource Manager

Fence_virtd

VM(fence_xvm) -> Host(fence_virtd)

Fence_virtd is a daemon which runs on physical hosts of the cluster hosting the virtual cluster.

It listens on a port for multicast traffic from virtual cluster(s), and takes actions.

Operation of a node hosting a VM which needs to be fenced:

  (a) Receive multicast packet

  (b) Authenticate multicast packet

  (c) Open connection to host contained within multicast packet.

  (d) Authenticate server.

  (e) Carry out fencing operation

       (e.g. call libvirt to destroy or reboot the VM; there is no "on" method at this point).

  (f) If operation succeeds, send success response.

  * Security considerations

   - a DoS of fence_virtd if enough multicast packets are sent

   - An attacker with access to the shared key(s) can easily fence any known VM,
     even if they are not on a cluster node.

# Configuration

Fence Device Server

# Install Package

yum -y install fence-virtd fence-virtd-libvirt fence-virtd-multicast

# Generate a random key file (512 bytes)

mkdir /etc/cluster

dd if=/dev/random of=/etc/cluster/fence_xvm.key bs=512 count=1

# Configure and start the service

fence_virtd -c
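
fence_virtd -c asks for the listener, backend, key file, multicast address/port and interface, then writes /etc/fence_virt.conf; a typical result looks roughly like this (interface and address depend on the environment):

fence_virtd {
  listener = "multicast";
  backend = "libvirt";
}

listeners {
  multicast {
    key_file = "/etc/cluster/fence_xvm.key";
    address = "225.0.0.12";
    port = "1229";
    interface = "virbr0";
  }
}

backends {
  libvirt {
    uri = "qemu:///system";
  }
}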

systemctl start fence_virtd && systemctl enable fence_virtd

# Firewall

firewall-cmd --permanent --add-port=1229/tcp

firewall-cmd --permanent --add-port=1229/udp

firewall-cmd --reload

VM node

1) Install the STONITH agent(s)

# 找

yum search fence-

# 安

yum -y install fence-virt

# 看

pcs stonith list

pcs stonith describe agent_name

# firewall

firewall-cmd --permanent --add-port=1229/tcp


firewall-cmd --reload

# 設

pcs cluster cib stonith_cfg

# Any flags that do not take arguments, such as --ssl, should be passed as ssl=1.

pcs -f stonith_cfg stonith create stonith_id stonith_device_type [stonith_device_options]

pcs -f stonith_cfg property set stonith-enabled=true

pcs -f stonith_cfg property

pcs cluster cib-push stonith_cfg --config

# Once the STONITH resource is running, test it

Status & Test

# On drbd-b

pcs cluster stop drbd-a

stonith_admin --reboot drbd-a

fence_xvm -o list

fence_xvm -a 225.0.0.12 -k /etc/cluster/fence_xvm.key -H guest2 -o status

fence_xvm -a 225.0.0.12 -k /etc/cluster/fence_xvm.key -H guest2 -o reboot

# power-off

fence_xvm -H pcmk2 -o off && echo $?

# power-on

fence_xvm -H pcmk2 -o on && echo $?

Configuring fence_xvm

pcs stonith create fence_pcmk1_xvm fence_xvm port="pcmk1" pcmk_host_list="pcmk1"

pcs stonith create fence_pcmk2_xvm fence_xvm port="pcmk2" pcmk_host_list="pcmk2"

pcs status

pcs property set stonith-enabled=true

pcs config show

Test

# stonith pcmk2

pcs stonith fence pcmk2

# Simulate a crash on one of the VMs:

echo c > /proc/sysrq-trigger

 


 

 

 

 

 
