100-002685-A
COURSE DEVELOPERS
Bilge Gerrits
Steve Hoffer
Siobhan Seeger
Pete Toemmes
Graeme Gofton
Sean Nockles
Brad Willer
TECHNICAL
CONTRIBUTORS AND
REVIEWERS
Geoff Bergren
Kelli Cameron
Tomer Gurantz
Anthony Herr
James Kenney
Gene Henriksen
Bob Lucas
Paul Johnston
Rod Pixley
Clifford Barcliff
Danny Yonkers
Antonio Antonucci
Satoko Saito
Feng Liu
Table of Contents
Course Introduction
Veritas Cluster Server curriculum path.................................................... Intro-2
Cluster design ......................................................................................... Intro-5
Courseware contents .............................................................................. Intro-7
Lesson 1: High Availability Concepts
High availability concepts ............................................................................. 1-3
Clustering concepts...................................................................................... 1-5
HA application services ................................................................................ 1-8
Clustering prerequisites.............................................................................. 1-12
High availability references ........................................................................ 1-14
Lesson 2: VCS Building Blocks
VCS terminology .......................................................................................... 2-3
Cluster communication............................................................................... 2-12
VCS architecture ........................................................................................ 2-17
Lesson 3: Preparing a Site for VCS
Hardware requirements and recommendations ........................................... 3-3
Software requirements and recommendations............................................. 3-5
Preparing installation information ............................................................... 3-10
Preparing to upgrade.................................................................................. 3-14
Course Introduction
The Veritas Cluster Server for UNIX curriculum is a series of courses that are designed to provide a full range of expertise with Veritas Cluster Server (VCS) high availability solutions, from design through disaster recovery.
Veritas Cluster Server for UNIX: Install and Configure
This course covers installation and configuration of common VCS
environments, focusing on two-node clusters running application and database
services.
Veritas Cluster Server for UNIX: Manage and Administer
This course focuses on multinode VCS clusters and advanced topics related to
managing more complex cluster configurations.
eLearning Library
The eLearning Library is available with bundled training options and includes
content on advanced high availability and disaster recovery features.
Course overview
The second part of the VCS Administration course includes two books:
Example Application Configurations
This book describes how to cluster applications, databases, and NFS file
sharing services.
Cluster Management
This book describes how to customize service groups to implement more
complex configurations. Also covered are high availability and disaster
recovery solutions in enterprise environments.
Cluster design
Sample cluster design input
A VCS design can be presented in many different formats with varying levels of
detail.
In some cases, you may have only the information about the application services
that need to be clustered and the desired operational behavior in the cluster. For
example, you may be told that the application service uses multiple network ports
and requires local failover capability among those ports before it fails over to
another system.
In other cases, you may have the information you need as a set of diagrams with
notes on various aspects of the desired cluster operations.
If the design information you receive does not detail the resources, develop a detailed design worksheet before starting the deployment.
Using a design worksheet to document all aspects of your high availability
environment helps ensure that you are well-prepared to start implementing your
cluster design.
In this course, you are provided with a set of design worksheets showing sample
values as a tool for implementing the cluster design in the lab exercises.
You can use a similar format to collect all the information you need before starting
deployment at your site.
Courseware contents
This course consists of slides for each lesson that feature concepts, processes, and
examples. Each lab is introduced at the end of the lesson, explaining the goals of
the hands-on exercises. Quiz slides are provided to reinforce your understanding of
the lesson objectives.
The participant guides include a copy of each slide, along with supplementary content providing additional details that support the slide.
Typographic conventions used in this guide:

Convention: Courier New, bold
Element: Command input, both syntax and examples

Convention: Courier New, plain
Element: Command output; also command names, directory names, file names, path names, and URLs when used within regular text paragraphs
Examples:
In the output:
protocol_minimum: 40
protocol_maximum: 60
protocol_current: 0
Locate the altnames directory.
Go to http://www.symantec.com.
Enter the value 300.
Log on as user1.

Convention: Courier New, italic, bold or plain
Element: Variables in command syntax and examples; variables in command input are italic, plain, and variables in command output are italic, bold

Convention: Initial capitalization
Examples:
Open the Task Status window.

Convention: Bold
Element: Interface elements
Examples:
Click Next.
Clear the Print File check box.

Convention: Arrow
Element: Menu navigation paths
Lesson 1: High Availability Concepts
Levels of availability
Costs of downtime
A Gartner study shows that large companies experienced a loss of between
$954,000 and $1,647,000 (USD) per month for nine hours of unplanned
downtime.
In addition to the monetary loss, downtime also results in loss of business
opportunities and reputation.
Planned downtime is almost as costly as unplanned. Planned downtime can be
significantly reduced by migrating a service to another server while maintenance is
performed.
Given the magnitude of the cost of downtime, the case for implementing a high
availability solution is clear.
Clustering concepts
The term cluster refers to multiple independent systems connected into a
management framework.
Types of clusters
HA application services
An application service is a collection of hardware and software components
required to provide a service, such as a Web site, that an end-user can access by
connecting to a particular network IP address or host name. Each application
service typically requires components of the following three types:
Application binaries (executables)
Network
Storage
If an application service needs to be switched to another system, all of the
components of the application service must migrate together to re-create the
service on another system.
These are the same components that the administrator must manually move from a
failed server to a working server to keep the service available to clients in a
nonclustered environment.
The process of stopping an application service on one system and starting it on another system in response to a fault is referred to as a failover.
Clustering prerequisites
Hardware and infrastructure redundancy
All failovers cause some type of client disruption. Depending on your
configuration, some applications take longer to fail over than others. For this
reason, good design dictates that the HA software first try to fail over within the
system, using agents that monitor local resources.
Design as much resiliency as possible into the individual servers and components
so that you do not have to rely on any hardware or software to cover a poorly
configured system or application. Likewise, try to use all resources to make
individual servers as reliable as possible.
Single point of failure analysis
Determine whether any single points of failure exist in the hardware, software, and
infrastructure components within the cluster environment.
Any single point of failure becomes the weakest link of the cluster. The application
is equally inaccessible if a client network connection fails, or if a server fails.
Also consider the location of redundant components. Having redundant hardware
equipment in the same location is not as effective as placing the redundant
component in a separate location.
In some cases, the cost of redundant components outweighs the risk that the
component will become the cause of an outage. For example, buying an additional
expensive storage array may not be practical. Decisions about balancing cost
versus availability need to be made according to your availability requirements.
External dependencies
Whenever possible, it is good practice to eliminate or reduce reliance by high
availability applications on external services. If it is not possible to avoid outside
dependencies, ensure that those services are also highly available.
For example, network name and information services, such as DNS (Domain
Name System) and NIS (Network Information Service), are designed with
redundant capabilities.
Lesson 2: VCS Building Blocks
VCS terminology
VCS cluster
Service groups
A service group is a virtual container that enables VCS to manage an application service as a unit. The service group contains all the hardware and software components required to run the service. The service group enables VCS to coordinate failover of the application service resources in the event of failure or at the administrator's request.
Resources
Resources are VCS objects that correspond to hardware or software components,
such as the application, the networking components, and the storage components.
VCS controls resources through these actions:
Bringing a resource online (starting)
Taking a resource offline (stopping)
Monitoring a resource (probing)
Resource categories
Persistent, never turned off
VCS can only monitor persistent resources; these resources cannot be brought online or taken offline. The most common example of a persistent resource is a network interface card (NIC), because it must be present but cannot be stopped.
On-only
VCS brings the resource online if required but does not stop the resource if the associated service group is taken offline. The ProcessOnOnly resource, for example, is used to start, but not stop, a process such as a daemon.
Nonpersistent, also known as on-off
Most resources fall into this category, meaning that VCS brings them online and takes them offline as required. Examples are Mount, IP, and Process.
Resource dependencies
Resources depend on other resources because of application or operating system
requirements. Dependencies are defined to configure VCS for these requirements.
Dependency rules
Resource attributes
Resource attributes define the specific characteristics of individual resources. As
shown in the slide, the resource attribute values for the sample resource of type
Mount correspond to the UNIX command line to mount a specific file system.
VCS uses the attribute values to run the appropriate command or system call to
perform an operation on the resource.
Each resource has a set of required attributes that must be defined in order to
enable VCS to manage the resource.
For example, the Mount resource has four required attributes that must be defined
for each resource of type Mount:
The directory of the mount point (MountPoint)
The device for the mount point (BlockDevice)
The type of file system (FSType)
The options for the fsck command (FsckOpt)
The first three attributes are the values used to build the UNIX mount command
shown in the slide. The FsckOpt attribute is used if the mount command fails. In
this case, VCS runs fsck with the specified options (-y, which means answer yes
to all fsck questions) and attempts to mount the file system again.
Some resources also have additional optional attributes you can define to control
how VCS manages a resource. In the Mount resource example, MountOpt is an
optional attribute you can use to define options to the UNIX mount command.
For example, if this is a read-only file system, you can specify -ro as the
MountOpt value.
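A minimal main.cf sketch of such a resource follows; the resource name webmnt and the disk group, volume, and mount point values are placeholders rather than values from the course design, and the % in the FsckOpt value escapes the leading dash as the main.cf syntax requires:

Mount webmnt (
    MountPoint = "/web"
    BlockDevice = "/dev/vx/dsk/webdg/webvol"
    FSType = vxfs
    FsckOpt = "%-y"
    )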
You can view the relationship between resources and resource types by comparing
the mount command for a resource on the previous slide with the mount syntax
on this slide. The resource type defines the syntax for the mount command. The
resource attributes fill in the values to form an actual command line.
Agents control resources using a defined set of actions, also called entry points.
The four entry points common to most agents are:
Online: Resource startup
Offline: Resource shutdown
Monitor: Probing the resource to retrieve status
Clean: Killing the resource or cleaning up as necessary when a resource fails to
be taken offline gracefully
The difference between offline and clean is that offline is an orderly termination
and clean is a forced termination. In UNIX, this can be thought of as the difference
between exiting an application and sending the kill -9 command to the
process.
Each resource type needs a different way to be controlled. To accomplish this, each
agent has a set of predefined entry points that specify how to perform each of the
four actions. For example, the startup entry point of the Mount agent mounts a
block device on a directory, whereas the startup entry point of the IP agent uses the
ifconfig (Solaris, AIX, HP-UX) or ip addr add (Linux) command to set
the IP address on a unique IP alias on the network interface.
VCS provides both predefined agents and the ability to create custom agents.
Note: The Veritas Cluster Server User's Guide provides an appendix with a complete description of attributes for all cluster objects.
To obtain PDF versions of product documentation for VCS and agents, see the
SORT Web site.
Cluster communication
VCS requires a cluster communication channel between systems in a cluster to
serve as the cluster interconnect. This communication channel is also sometimes
referred to as the private network because it is often implemented using a
dedicated Ethernet network.
Symantec recommends that you use a minimum of two dedicated communication channels with separate infrastructures (for example, multiple NICs and separate network hubs) to implement a highly available cluster interconnect.
Low-Latency Transport
LLT runs directly on top of the Data Link Provider Interface (DLPI) layer over
Ethernet and has several major functions:
Sending and receiving heartbeats over network links
Monitoring and transporting network traffic over multiple network links to
every active system
Balancing the cluster communication load over multiple links
Maintaining the state of communication
Providing a transport mechanism for cluster communications
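LLT is configured through the /etc/llttab file. A representative Linux example follows (a hedged sketch; the node name, cluster ID, and interface names are placeholders):

set-node s1
set-cluster 10
link eth1 eth1 - ether - -
link eth2 eth2 - ether - -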
I/O fencing
The fencing driver implements I/O fencing, which prevents multiple systems from
accessing the same Volume Manager-controlled shared storage devices in the
event that the cluster interconnect is severed. In the example of a two-node cluster
displayed in the diagram, if the cluster interconnect fails, each system stops
receiving heartbeats from the other system.
GAB on each system determines that the other system has failed and passes the
cluster membership change to the fencing module.
The fencing modules on both systems contend for control of the disks according to
an internal algorithm. The losing system is forced to panic and reboot. The
winning system is now the only member of the cluster, and it fences off the shared
data disks so that only systems that are still part of the cluster membership (only
one system in this example) can access the shared storage.
The winning system takes corrective action as specified within the cluster
configuration, such as bringing service groups online that were previously running
on the losing system.
This modularity between HAD and the agents allows for efficiency of roles:
HAD does not need to know how to start up Oracle or any other applications
that can come under VCS control.
Similarly, the agents do not need to make cluster-wide decisions.
This modularity allows a new application to come under VCS control simply by adding a new agent; no changes to the VCS engine are required.
On each active cluster system, HAD updates all the other cluster systems with
changes to the configuration or status.
In order to ensure that the had daemon is highly available, a companion daemon,
hashadow, monitors had, and if had fails, hashadow attempts to restart had.
Likewise, had restarts hashadow if hashadow stops.
VCS architecture
Maintaining the cluster configuration
HAD maintains configuration and state information for all cluster resources in
memory on each cluster system. Cluster state refers to tracking the status of all
resources and service groups in the cluster. When any change to the cluster
configuration occurs, such as the addition of a resource to a service group, HAD
on the initiating system sends a message to HAD on each member of the cluster by
way of GAB atomic broadcast, to ensure that each system has an identical view of
the cluster.
Atomic means that all systems receive updates, or all systems are rolled back to the
previous state, much like a database atomic commit.
When HAD is not currently running on any cluster system, there is no configuration in memory, so the in-memory cluster configuration must be created from the main.cf file on disk. When you start VCS on the first cluster system, HAD builds the configuration in memory on that system from the main.cf file.
Changes to a running configuration (in memory) are saved to disk in main.cf
when certain operations occur. These procedures are described in more detail later
in the course.
Labs and solutions for this lesson are located on the following pages.
Lab environment, page A-3.
Lab environment, page B-3.
Lesson 3: Preparing a Site for VCS
Hardware requirements and recommendations
Networking
For a highly available configuration, each system in the cluster should have a
minimum of two physically independent Ethernet connections for the public
network. Using the same interfaces on each system simplifies configuring and
managing the cluster.
Shared storage
VCS is designed primarily as a shared data high availability product; however, you
can configure a cluster that has no shared storage.
Software requirements
Software recommendations
Solaris
Solaris systems can be paused and resumed using the Stop-A and go key sequences. When a Solaris system in a VCS cluster is paused with Stop-A, the system stops producing VCS heartbeats, causing the other systems to consider it a failed node.
Ensure that the only action possible after an abort is a reset. To ensure that you
never issue a go function after an abort, create an alias for the go function that
displays a message. See the Veritas Cluster Server Installation Guide for the
detailed procedure.
Preparation assistance
Several tools are available from the Symantec Operations Readiness Tools (SORT)
Web site to help you prepare your environment to implement clustering.
Data collection and reporting tools
A data collector can be run from the Web site, or downloaded locally, to gather
system information, run preinstallation checks, and generate reports.
Documentation and compatibility lists
All product documentation, as well as software and hardware compatibility
lists are available from SORT.
Preparation checklists
Platform-specific checklists can be created to assist in preparing an
environment for clustering.
Patch management
SORT provides access to patches for all products in the Storage Foundation HA family.
Risk assessment
Checklists and reports can be used to analyze your environment, identify risks, and recommend remedies.
Error code lookup
SORT enables you to search for additional information about error messages.
You can also request help for undocumented error codes.
Inventory management service
Inventory management is a service that provides the ability to gather license
information from Storage Foundation HA deployments.
Alternatively, you can run installvcs from the location of your VCS product distribution to check your environment, and examine the resultant log file to assess readiness to install VCS.
cd sw_location
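A representative invocation follows; this is a hedged sketch in which sys1 and sys2 are placeholder host names, using the -precheck option to run the checks without installing:

./installvcs -precheck sys1 sys2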
Verify that you have the information necessary to install VCS. Be prepared to
select:
Product, corresponding to licenses obtained from Symantec
End-user licensing agreement
Package set, which determines the amount of disk space required
Names of the systems that will be installed with the selected product
License keys or keyless licensing
Product level, which applies to keyless licensing, and determines the level of
functionality of the product
Product options, including Veritas Replicator and Global Clustering Option
For more information about these selections, see the Veritas Cluster Server
Installation Guide.
You are prompted to configure the cluster after the software installation is
complete. Be prepared to supply:
A name for the cluster, beginning with a letter of the alphabet (a-z, A-Z)
A unique ID number for the cluster in the range 0 to 64k
All clusters sharing a private network infrastructure (including connection to
the same public network if used for low-priority links) must have a unique ID.
Device names of the network interfaces used for the cluster interconnect
You can also prevent duplicate cluster IDs by opting to have CPI automatically
generate the cluster ID.
You may want to use a design worksheet to collect the information required to
install VCS as you prepare the site for VCS deployment. You can then use this
worksheet later when you are installing VCS.
Preparing to upgrade
Checking versions of installed products
CPI provides a simple and fast way to check and display the version of SFHA
software installed on a server. The output displays the version information down to
the patch level, and provides detailed lists of installed and missing packages
showing which ones are required and which ones are optional.
If the system being checked has access to the SORT Web site, information about
the latest updates and newer releases available for the installed products is also
displayed.
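As a hedged example, the CPI installer can report the installed version from the distribution media; sys1 is a placeholder host name:

./installer -version sys1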
If your environment includes any nonstandard elements that are not covered by the
checklist, such as custom applications or unsupported versions of third-party
multi-pathing products, contact Symantec Support and try to test the upgrade
process in a non-production environment first.
See the Veritas Cluster Server Installation Guide for more information about
planning for upgrades.
Labs and solutions for this lesson are located on the following pages.
Lab 3: Validating site preparation, page A-33.
Lab 3: Validating site preparation, page B-51.
Lesson 4: Installing VCS
At the end of every product installation, the installer creates three text files:
A log file containing any system commands executed and their output
A response file to be used in conjunction with the -responsefile option of
the installer
A summary file containing the output of the Veritas product installer scripts
These files are located in /opt/VRTS/install/logs. The names and
locations of each file are displayed at the end of each product
installationinstallertimestamp.log, .summary, and .response. It
is recommended that these logs be kept for auditing and debugging purposes.
For a list of software packages that are installed, see the release notes for your
VCS version and platform.
Options to installvcs
The installvcs utility supports several options that enable you to tailor the
installation process. For example, you can:
Perform an unattended installation.
Install software packages without configuring a cluster.
Configure secure cluster communications.
Upgrade an existing VCS cluster.
For a complete description of installvcs options, see the Veritas Cluster Server Installation Guide.
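As a hedged illustration of the option style (verify the exact option names against the installation guide for your release):

./installvcs -responsefile /tmp/vcs_response    (unattended installation)
./installvcs -install                           (install packages without configuring a cluster)
./installvcs -security                          (configure secure cluster communications)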
Web installer
VCS includes a Web-based interface to the CPI installer. The key components of
the Web installer architecture are shown in the diagram in the slide.
The Web browser can be run on any platform that supports the browser requirements and can connect securely to the Web server.
The Web server runs the xprtlwid daemon, which is started using the webinstaller command on the distribution media. The Web installer uses the CPI installer scripts and the software packages; therefore, the system acting as the Web server must have access to the software distribution media.
The Web server must be able to connect securely (RSH or SSH) to the
installation target systems.
The installation targets are the systems on which the software is installed and
configured.
The Web installer supports most features of the installer utility. See the Veritas
Cluster Server Installation Guide for a description of supported options. The guide
also includes the browser types and versions supported by the Web installer.
Data protection
If you are using VCS with shared storage devices that support SCSI-3 Persistent
Reservations, configure fencing after VCS is initially installed.
SCSI-3-based fencing provides the highest level of protection for data that is
located on shared storage and accessed by multiple cluster nodes.
You can configure fencing at any time using the installvcs -fencing
utility, as described in the I/O Fencing lesson. However, if you set up fencing
after you have service groups running, you must stop and restart VCS for fencing
to take effect.
For details about secure cluster communication, see the Veritas Cluster Server
Installation Guide.
For more information about IPv6, see the Web-based training module or the
Veritas Cluster Server Installation Guide.
For details about using operating system tools, see the Veritas Cluster Server
Installation Guide for the applicable platforms.
Product documentation is not included with the software packages. You can
download all documentation from the SORT Web site.
Note: This command line shows status only if a module is using LLT, such as
GAB. If GAB is not running, the output shows a comm wait state.
The configured and active options show only nodes where LLT is
configured or active.
The lltconfig command just displays whether LLT is running, with no detail.
LLT is discussed in more detail later in the course. For now, you can see that LLT
is running using these commands.
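For example (a hedged sketch; the exact output wording varies by release):

lltconfig
LLT is running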
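GAB port memberships can be displayed with gabconfig -a; the following is a hedged reconstruction of typical output (generation numbers vary):

gabconfig -a
GAB Port Memberships
===============================================================
Port a gen   a36e0003 membership 01
Port h gen   fd570002 membership 01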
This indicates that HAD and GAB are communicating on two nodes.
The -sum option shows the status as a snapshot in time. If you run hastatus
with no options, the status is displayed continuously, showing any changes in the
state of systems, service groups, and resources as they occur. You can stop the
display by typing Ctrl-C.
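A hedged sketch of hastatus -sum output for a two-node cluster (system and group names are placeholders):

hastatus -sum
-- SYSTEM STATE
-- System        State          Frozen
A  s1            RUNNING        0
A  s2            RUNNING        0
-- GROUP STATE
-- Group    System    Probed    AutoDisabled    State
B  websg    s1        Y         N               ONLINE
B  websg    s2        Y         N               OFFLINE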
Download the latest update for your version of VCS according to the instructions
provided on the Web site. The installation instructions for VCS updates are
included with the update pack.
Before you install an update, make sure all prerequisites are met. At the end of the
update installation, you may be prompted to run scripts to update agents or other
portions of the VCS configuration. Continue through any additional procedures to
ensure that the latest updates are applied.
For more information about upgrade methods, refer to the installation guide for the
applicable platform and VCS version to which you are upgrading.
Managed hosts and the management server communicate securely using the
HTTPS protocol, through HTTP servers and clients implemented within the
XPRTL component of SFHA.
Upon completion, the Web console is launched to enable you to log on to the
management server.
When connecting from a system on the network, use the fully-qualified host name
of the management server, or the IP address. For example:
https://vomserver.example.com:14161
On the log on page, you must select the user domain to enable the authentication
service to recognize user accounts. For example, the unixpwd domain
authenticates the login using the operating system network domain account.
For details about user account configuration, see the Veritas Operations Manager
Administrator's Guide.
If a cluster or system within a cluster is shut down, the system shows as failed in
the VOM console. When the system or cluster is restarted, you do not need to add
the systems to VOM again. Simply refresh the VOM display after the systems are
running and the systems are again recognized by VOM.
Note: The Java GUI is being deprecated in favor of the VOM Web-based
administration tool and may not be supported for future versions of VCS.
Also, some features of VCS available in later versions are not supported by
the Java GUI.
While VOM is the preferred management tool for data center environments with
many clusters, or clusters using more advanced features available in the latest VCS
releases, the Java GUI can be a useful tool for managing small clusters.
To obtain the software, navigate to symantec.com, select Products > Cluster
Server and click the link labeled Veritas Cluster Server Java Console, Veritas
Cluster Server Simulator, Veritas Enterprise Administrator Console.
Labs and solutions for this lesson are located on the following pages.
Lab 4: Installing Storage Foundation HA 6.0, page A-41.
Lab 4: Installing Storage Foundation HA 6.0, page B-69.
Lesson 5: VCS Operations
The Java GUI is available for download from the Symantec Web site and is
supported on Windows systems only. The Java GUI is deprecated in favor of
VOM, but is useful for management of clusters in smaller environments. The Java
GUI is the only interface for using the Simulator.
The Simulator is useful for learning about VCS and modeling behavior. You can
use the Simulator to create and test a cluster configuration, and then move that
configuration into a real-world environment. However, you cannot use the
Simulator to manage a running cluster configuration. The Simulator is supported
on Windows only.
Displaying logs
The engine log is located in /var/VRTSvcs/log/engine_A.log. You can
view this file with standard UNIX text file utilities such as tail, more, or view.
VCS provides the hamsg utility that enables you to filter and sort the data in log
files.
In addition, you can display the engine log in Cluster Manager to see a variety of
views of detailed status information about activity in the cluster.
You can also view the command log to see how the activities you perform using
the Java GUI are translated into VCS commands. You can use the command log as
a resource for creating batch files to use when performing repetitive configuration
or administration tasks.
Note: The command log is not saved to disk; you can view commands only for the current session of the GUI.
The following examples show how to display resource attributes and status.
Display values of attributes to ensure they are set properly.
hares -display webip
#Resource    Attribute    System    Value
. . .
webip        AutoStart    global    1
webip        Critical     global    1
Determine which resources are non-critical.
hares -list Critical=0
webapache    s1
webapache    s2
Determine the virtual IP address for the websg service group.
hares -value webip Address
10.10.27.93
Determine the state of a resource on each cluster system.
hares -state webip
#Resource    Attribute    System    Value
webip        State        s1        OFFLINE
webip        State        s2        ONLINE
The following examples show some common uses of the hagrp command for displaying service group information and status.
Display values of all attributes to ensure they are set properly.
hagrp -display websg
#Group    Attribute       System    Value
. . .
websg     AutoFailOver    global    1
websg     AutoRestart     global    1
. . .
Determine which service groups are frozen, and are therefore not able to be stopped, started, or failed over.
hagrp -list Frozen=1
websg    s1
websg    s2
Determine whether a service group is set to automatically start.
hagrp -value websg AutoStart
1
List the state of a service group on each system.
hagrp -state websg
#Group    Attribute    System    Value
websg     State        s1        |Online|
websg     State        s2        |Offline|
The state of persistent resources is not considered when determining the online or
offline state of a service group because persistent resources cannot be taken
offline. However, a service group is faulted if a persistent resource faults.
Bringing a service group online using the CLI
To bring a service group online, use either form of the hagrp command:
hagrp -online group -sys system
hagrp -online group -any
The -any option brings the service group online based on the group's failover policy. Failover policies are described in detail later in the course.
To take a service group offline, use either form of the hagrp command:
hagrp -offline group -sys system
Provide the service group name and the name of a system where the service
group is online.
hagrp -offline group -any
Provide the service group name. The -any switch takes a failover service
group offline on the system where it is online. All instances of a parallel
service group are taken offline when the -any switch is used.
To switch a service group to another system, type:
hagrp -switch group -to system
Provide the service group name and the name of the system where the service group is to be brought online.
The slide shows how you can use the GUI and CLI together to develop an
understanding of how VCS responds to events in the cluster environment, and the
effects on application services under VCS control.
98
Under VCS, the manipulation of resources that are part of service groups and the
service groups themselves need to be managed using VCS utilities, such as the
GUI or CLI, with full awareness of resource and service group dependencies.
Alternately, you can freeze the service group to prevent VCS from taking action
when changes in resource status are detected.
Warning: In clusters that do not implement fencing, VCS cannot prevent someone
with proper permissions from manually starting another instance of the application
on another system outside of VCS control. VCS will eventually detect this and
take corrective action, but it may be too late to prevent data corruption.
Resource operations
Bringing resources online
In normal day-to-day operations, you perform most management operations at the
service group level.
However, you may need to perform maintenance tasks that require one or more
resources to be offline while others are online. Also, if you make errors during
resource configuration, you can cause a resource to fail to be brought online.
Bringing resources online using the CLI
To bring a resource online, type:
hares -online resource -sys system
Provide the resource name and the name of a system that is configured to run the
service group.
Note: The service group shown in the slide is partially online after the webdg
resource is brought online. This is depicted by the textured coloring of the
service group circle.
Taking a resource offline and immediately bringing it online may be necessary if,
for example, the resource must reread a configuration file due to a change. Or you
may need to take a database resource offline in order to perform an update that
modifies the database files.
You can use the Cluster Manager Java Console to perform all the same tasks on a Simulator configuration as on an actual cluster configuration.
configurations to enable you to test various failure scenarios, including faulting
resources and powering off systems.
Labs and solutions for this lesson are located on the following pages.
Lab 5: Performing common VCS operations, page A-53.
Lab 5: Performing common VCS operations, page B-107.
Lesson 6
The default VCS startup process is demonstrated using a cluster with two systems
connected by the cluster interconnect. To illustrate the process, assume that no
systems have an active cluster configuration.
1 The hastart command is run on s1 and starts the had and hashadow
processes.
2 HAD checks for a valid configuration file (hacf -verify config_dir).
3 HAD checks for an active cluster configuration on the cluster interconnect.
4 Because there is no active cluster configuration, HAD on s1 reads the local
main.cf file and loads the cluster configuration into local memory.
The s1 system is now in the VCS local build state, meaning that VCS is
building a cluster configuration in memory on the local system.
5 The hastart command is then run on s2 and starts had and hashadow on s2.
The s2 system is now in the VCS current discover wait state, meaning VCS is in a wait state while it is discovering the current state of the cluster.
6 HAD on s2 checks for a valid configuration file on disk.
7 HAD on s2 checks for an active cluster configuration by sending a broadcast message out on the cluster interconnect, even if the main.cf file on s2 is valid.
8 HAD on s1 receives the request from s2 and responds.
9 HAD on s1 sends a copy of the cluster configuration over the cluster interconnect to s2.
The s1 system is now in the VCS running state, meaning VCS determines that there is a running configuration in memory on system s1.
10 The s2 system is now in the VCS remote build state, meaning VCS is building the cluster configuration in memory on the s2 system from the cluster configuration that is in a running state on s1. HAD on s2 performs a remote build to place the cluster configuration in memory.
11 When the remote build process completes, HAD on s2 copies the cluster configuration into the local main.cf file.
If s2 has valid local configuration files (main.cf and types.cf), these are saved to new files with a name including a date and time stamp, before the active configuration is written to the main.cf file on disk.
The startup process is repeated on each system until all members have identical
copies of the cluster configuration in memory and matching main.cf files on
local disks. Synchronization is maintained by data transfer through LLT and GAB.
Stopping VCS
There are several methods of stopping the VCS engine (had and hashadow
daemons) on a cluster system.
The options you specify to hastop determine where VCS is stopped, and how
resources under VCS control are affected.
VCS shutdown examples
The four examples show the effect of using different options with the hastop
command:
The -all option stops had on all systems and takes the service groups offline.
The -all -force options stop had on both systems and leave the services
running. Although they are no longer protected highly available services and
cannot fail over, the services continue to be available to users.
Use caution with this option. VCS does not warn you if the configuration is
open and you stop using the -force option.
The -local option causes the service group to be taken offline on s1 and
stops the VCS engine (had) on s1.
The -local -evacuate options cause the service group on s1 to be
migrated to s2 and then stop had on s1.
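The corresponding command lines are:

hastop -all
hastop -all -force
hastop -local
hastop -local -evacuate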
Configure one of the values shown in the table in the slide for the EngineShutdown
attribute depending on the desired functionality for the hastop command.
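For example, a hedged sketch of setting this cluster attribute (DisableClusStop is one of the documented values; it rejects hastop -all while still allowing hastop -local):

haconf -makerw
haclus -modify EngineShutdown DisableClusStop
haconf -dump -makero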
VCS provides several tools and methods for configuring service groups and
resources, generally categorized as:
Online configuration
You can modify the cluster configuration while VCS is running using one of
the graphical user interfaces or the command-line interface. These online
methods change the cluster configuration in memory. When finished, you write
the in-memory configuration to the main.cf file on disk to preserve the
configuration.
Offline configuration
In some circumstances, you can simplify cluster implementation and
configuration using an offline method, including:
Editing configuration files manually
Using the Simulator to create, modify, model, and test configurations
This method requires you to stop and restart VCS in order to build the new
configuration in memory.
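A hedged sketch of the offline editing sequence, using the standard VCS configuration locations:

hastop -all -force                          (stop VCS, leave applications running)
vi /etc/VRTSvcs/conf/config/main.cf         (edit the configuration file)
hacf -verify /etc/VRTSvcs/conf/config       (check the syntax)
hastart                                     (rebuild the in-memory configuration)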
Online configuration
How VCS changes the online cluster configuration
When you use Cluster Manager to modify the configuration, the GUI
communicates with had on the specified cluster system to which Cluster Manager
is connected.
Note: Cluster Manager configuration requests are shown conceptually as ha
commands in the diagram, but they are implemented as system calls.
The had daemon communicates the configuration change to had on all other
nodes in the cluster, and each had daemon changes the in-memory configuration.
When the command to save the configuration is received from Cluster Manager, had communicates this command to all cluster systems, and each system's had daemon writes the in-memory configuration to the main.cf file on its local disk.
The VCS command-line interface is an alternate online configuration tool. When
you run ha commands, had responds in the same fashion.
Note: When two administrators are changing the cluster configuration
simultaneously, each administrator sees all changes as they are being made.
If you save the cluster configuration after each change, you can view the main.cf
file to see how the in-memory modifications are reflected in the main.cf file.
Ensure that you understand the VCS startup sequence described in the Starting
and Stopping VCS section before you attempt this type of recovery.
Offline configuration
Characteristics
In some circumstances, you can simplify cluster implementation or configuration
tasks by directly modifying the VCS configuration files. This method requires you
to stop and restart VCS in order to build the new configuration in memory.
One consideration when choosing to perform offline configuration is that you must be logged on to a cluster system as root.
This section describes situations where offline configuration is useful. The next
section shows how to stop and restart VCS to propagate the new configuration
throughout the cluster. The Offline Configuration of Service Groups lesson
provides detailed offline configuration procedures and examples.
When using the Cluster Manager to perform administration, you are prompted for
a VCS account name and password. Depending on the privilege level of that VCS
user account, VCS displays the Cluster Manager GUI with an appropriate set of
options. If you do not have a valid VCS account, you cannot run Cluster Manager.
When using the command-line interface for VCS, you are also prompted to enter a
VCS user account and password and VCS determines whether that user account
has proper privileges to run the command. One exception is the UNIX root user.
By default, only the UNIX root account is able to use VCS ha commands to
administer VCS from the command line.
VCS access in secure mode
When running in secure mode, VCS uses operating system-based authentication,
which enables VCS to provide a single sign-on mechanism. All VCS users are
system and domain users and are configured using fully qualified user names, for
example, administrator@xyz.com.
When running in secure mode, you can add system or domain users to VCS and
assign them VCS privileges. However, you cannot assign or change passwords
using a VCS interface.
Note: The effect of halogin only applies for that shell session.
The slide shows how to use the hauser command to create users and set
privileges. You can also add privileges with the -addpriv and -deletepriv
options to hauser.
In non-secure mode, VCS passwords are stored in the main.cf file in encrypted
format. If you use a GUI or CLI to set up a VCS user account, passwords are
encrypted automatically. If you edit the main.cf file, you must encrypt the
password using the vcsencrypt command.
Note: In non-secure mode, if you change a UNIX account, this change is not
reflected in the VCS configuration automatically. You must manually modify
accounts in both places if you want them to be synchronized.
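A hedged sketch of the interaction (the prompt wording and the encrypted output string are illustrative, not exact):

vcsencrypt -vcs
Enter New Password:
Enter Again:
gpqIpmRmPlpQp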
Modifying user accounts
Use the hauser command to make changes to a VCS user account:
Change the password for an account.
hauser -update user_name
Delete a user account.
hauser -delete user_name
Labs and solutions for this lesson are located on the following pages.
Lab 6: Starting and stopping VCS, page A-69.
Lab 6: Starting and stopping VCS, page B-141.
Lesson 7
Identifying components
Documenting attributes
In order to configure the operating system resources you have identified as
requirements for an application, you need the detailed configuration information
used when initially configuring and testing services.
You can use a design diagram and worksheet while performing one-time
configuration tasks and testing to:
Show the relationships between the resources, which determine the order in
which you configure, start, and stop resources.
Document the values needed to configure VCS resources after testing is
complete.
Note: If your systems are not configured identically, you must note those
differences in the design worksheet. The Online Configuration lesson
shows how you can configure a resource with different attribute values for
different systems.
The examples displayed in the slides in this lesson show values for various
operating system platforms, indicated by the icons. In the case of the appsg service
group shown in the slide, the lan2 value of the Device attribute for the NIC
resource is specific to HP-UX. Solaris, Linux, and AIX have other operating
system-specific values, as shown in the respective Bundled Agents Reference
Guides.
Note: Although examples used throughout this course are based on Veritas Volume Manager, VCS also supports other volume managers. VxVM is shown for simplicity; objects and commands are essentially the same on all platforms. The agents for other volume managers are described in the Veritas Cluster Server Bundled Agents Reference Guide.
Preparing shared storage, such as creating disk groups, volumes, and file systems,
is performed once, from one system. Then you must create mount point directories
on each system.
The options to mkfs differ depending on platform type, as displayed in the
following examples.
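Hedged examples for creating a VxFS file system on a VxVM volume (disk group appdg and volume appvol are placeholder names; verify the exact flags for your platform):

Solaris and HP-UX:  mkfs -F vxfs /dev/vx/rdsk/appdg/appvol
Linux:              mkfs -t vxfs /dev/vx/rdsk/appdg/appvol
AIX:                mkfs -V vxfs /dev/vx/rdsk/appdg/appvol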
The testing procedure emulates how VCS manages application services and must
include:
Startup: Online
Shutdown: Offline
Verification: Monitor
The actual commands used may differ from those used in this lesson. However,
conceptually, the same type of action is performed by VCS. Example operations
are described for each component throughout this section.
Verify that shared storage resources are configured properly and accessible. The
examples shown in the slide are based on using Volume Manager.
1 Import the disk group.
2 Start the volume.
3 Mount the file system.
Mount the file system manually for the purposes of testing the application
service. Do not configure the operating system to automatically mount any file
system that will be controlled by VCS.
For example, on Linux systems, ensure that the application file system is not
added to /etc/fstab. VCS must control where the file system is mounted.
Examples of mount commands are provided for each platform.
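Hedged examples (the disk group, volume, and mount point names are placeholders; verify the flags for your platform):

Solaris and HP-UX:  mount -F vxfs /dev/vx/dsk/appdg/appvol /app
Linux:              mount -t vxfs /dev/vx/dsk/appdg/appvol /app
AIX:                mount -V vxfs /dev/vx/dsk/appdg/appvol /app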
Virtual IP addresses
The example in the slide demonstrates how users access services through a virtual
IP address that is specific to an application. In this scenario, VCS is managing a
Web server that is accessible to network clients over a public network.
1 A network client requests access to http://eweb.com.
2 The DNS server translates the host name to the virtual IP address of the Web
server.
3 The virtual IP address is managed and monitored by a VCS IP resource in the
Web service group.
The virtual IP address is associated with the next virtual network interface for
e1000g0, which is e1000g0:1 in this example of Solaris network interfaces.
4 The system which has the service group online accepts the incoming request
on the virtual IP address.
Note: The administrative IP address is associated with a physical network
interface on a specific system and is configured by the operating system
during system startup. These are also referred to as base or test IP
addresses.
The diagram in the slide shows what happens if the system running the Web
service group (s1) fails.
1 The IP address is no longer available on the network. Network clients may
receive errors that web pages are not accessible.
2 VCS on the running system (s2) detects the failure and starts the service group.
3 The IP resource is brought online, which configures the same virtual IP address on the next available virtual network interface alias, e1000g0:1 in this example.
This virtual IP address floats, or migrates, with the service. It is not tied to a
system.
4 The network client Web request is now accepted by the s2 system.
Note: The admin IP address on s2 is also configured during system startup. This
address is unique and associated with only this system, unlike the virtual IP
address.
Note: These virtual IP addresses are only configured temporarily for testing
purposes. You must not configure the operating system to manage the
virtual IP addresses.
The following examples show the platform-specific commands used to configure a
virtual IP address for testing purposes.
AIX
Create an alias for the virtual interface and bring up the IP on the next available
logical interface.
ifconfig en1 inet 10.10.21.198 netmask 255.0.0.0 alias
Solaris
Plumb the virtual interface and bring up the IP address on the next available logical interface.
ifconfig e1000g0 addif 10.10.21.198 up
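Hedged equivalents on the other platforms (the interface names lan0 and eth0 are representative, not values from the course design):

HP-UX:  ifconfig lan0:1 inet 10.10.21.198 netmask 255.255.255.0 up
Linux:  ip addr add 10.10.21.198/24 dev eth0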
Note: In each case, you can edit /etc/hosts to assign a virtual host name
(application name) to the virtual IP address.
10.10.21.198 eweb.com
When all dependent resources are available, you can start the application software.
Ensure that the application is not configured to start automatically during system
boot. VCS must be able to start and stop the application using the same methods
you use to control the application manually.
Verifying resources
You can perform some simple steps, such as those shown in the slide, to verify that
each component needed for the application to function is operating at a basic level.
Note: To test the network resources, access one or more well-known addresses
outside of the cluster, such as local routers, or primary and secondary DNS
servers.
This helps you identify any potential configuration problems before you test the
service as a whole, as described in the Testing the Integrated Components
section.
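A hedged sketch of such checks (the addresses and names are placeholders):

ping 10.10.21.1            (reach a well-known router outside the cluster)
df -k /app                 (confirm the file system is mounted)
ps -ef | grep appdaemon    (confirm the application process is running)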
For example, if you have an application with a backend database, you can:
1 Start the database (and listener process).
2 Start the application.
3 Connect to the application from the public network using the client software to
verify name resolution to the virtual IP address.
4 Perform user tasks, as applicable; perform queries, make updates, and run
reports.
Another example that illustrates how you can test your service uses Network File
System (NFS). If you are preparing to configure a service group to manage an
exported file system, verify that you can mount the exported file system from a
client on the network. This is described in more detail later in the course.
Linux
Bring down the logical interface to remove the virtual IP address.
ifdown eth0:1
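A hedged Solaris equivalent removes the address from the logical interface:

ifconfig e1000g0 removeif 10.10.21.198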
Perform the same type of testing used to validate the resources on the initial system, including real-world scenarios, such as client access from the network.
The slide shows the resource dependency definition for the application used as an
example in this lesson.
Labs and solutions for this lesson are located on the following pages.
Lab 7: Preparing application services, page A-77.
Lab 7: Preparing application services, page B-157.
Lesson 8: Online Configuration
You can use the procedures shown in the diagram as a standard methodology for
creating service groups and resources. Although there are many ways you could
vary this configuration procedure, following a recommended practice simplifies
and streamlines the initial configuration and facilitates troubleshooting if you
encounter configuration problems.
When deciding upon a naming convention, consider delimiters, such as dash (-)
and underscore (_), with care. Differences in keyboards may prevent use of some
characters, especially in the case where clusters span geographic locations.
Note: You can click the Show Command button to see the commands that are
run when you click OK.
Adding a service group using the CLI
You can also use the VCS command-line interface to modify a running cluster
configuration. The next example shows how to use hagrp commands to add the
appsg service group and modify its attributes.
haconf -makerw
hagrp -add appsg
hagrp -modify appsg SystemList s1 0 s2 1
hagrp -modify appsg AutoStartList s1 s2
haconf -dump -makero
See the command-line reference card provided with this course for a list of
commonly used ha commands.
Adding resources
Online resource configuration procedure
Optional attributes for NIC vary by platform. Refer to the Veritas Cluster Server
Bundled Agents Reference Guide for a complete definition. These optional
attributes are common to all platforms.
NetworkType: Type of network, Ethernet (ether)
PingOptimize: Number of monitor cycles to detect if the configured interface
is inactive
A value of 1 optimizes broadcast pings and requires two monitor cycles. A
value of 0 performs a broadcast ping during each monitor cycle and detects the
inactive interface within the cycle. The default is 1.
NetworkHosts: The list of hosts on the network that are used to determine if
the network connection is alive
It is recommended that you specify the IP address of the host rather than the
host name to prevent the monitor cycle from timing out due to DNS problems.
Example device attribute values:
AIX: en0; HP-UX: lan2; Linux: eth0; Solaris: e1000g0
Persistent resources
If you add a persistent resource as the first resource of a new service group, as
shown in the lab exercise for this lesson, notice that the service group status is
offline, even though the resource status is online.
Persistent resources are not taken into consideration when VCS reports service
group status, because they are always online. When a nonpersistent resource is
added to the group, such as IP, the service group status reflects the status of that
nonpersistent resource.
Adding an IP resource
The slide shows the required attribute values for an IP resource (on Solaris) in the
appsg service group. The corresponding entry is made in the main.cf file when
the configuration is saved.
Notice that the IP resource on Solaris has two required attributes: Device and
Address, which specify the network interface and virtual IP address, respectively.
The required attributes vary depending on the platform.
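Such an entry resembles the following hedged sketch (the resource name appip, the address, and the netmask are placeholders):

IP appip (
    Device = e1000g0
    Address = "10.10.21.198"
    NetMask = "255.255.255.0"
    )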
Optional Attributes
NetMask: Netmask associated with the application IP address
The value may be specified in decimal (base 10) or hexadecimal (base 16).
The default is the netmask corresponding to the IP address class.
This is a required attribute on AIX.
Options: Options to be used with the ifconfig command
ArpDelay: Number of seconds to sleep between configuring an interface and
sending out a broadcast to inform routers about this IP address
The default is 1 second.
IfconfigTwice: If set to 1, this attribute causes an IP address to be configured
twice, using an ifconfig up-down-up sequence. This behavior increases the
probability of gratuitous ARPs (caused by ifconfig up) reaching clients.
The default is 0.
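As a command-line sketch, the appip resource could be added as follows; the device, address, and netmask values are examples only:
hares -add appip IP appsg
hares -modify appip Device e1000g0
hares -modify appip Address 10.10.21.198
hares -modify appip NetMask 255.255.255.0
hares -modify appip Enabled 1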
Note: As of version 4.1, VCS sets the vxdg autoimport option to no, which
disables autoimporting of disk groups.
If you have a large number of volumes on a single disk group, the DiskGroup
resource can time out when trying to start or stop all the volumes simultaneously.
In this case, you can set the StartVolume and StopVolume attributes of the
DiskGroup to 0, and create Volume resources to start the volumes individually.
Also, if you are using volumes as raw devices with no file systems, and, therefore,
no Mount resources, consider using Volume resources for the additional level of
monitoring.
The Volume resource has no optional attributes.
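As a sketch, assuming a DiskGroup resource named appdg and a volume named appvol01, the volumes could be managed individually as follows:
hares -modify appdg StartVolume 0
hares -modify appdg StopVolume 0
hares -add appvol Volume appsg
hares -modify appvol Volume appvol01
hares -modify appvol DiskGroup appdg
hares -modify appvol Enabled 1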
The Mount resource has the required attributes displayed in the main.cf file
excerpt in the slide.
Note: The example operating system commands for unmounting a locked file
system are specific to Solaris. Other operating systems may use different
methods for unmounting file systems.
The optional Arguments attribute specifies any command-line options to use when
starting the process.
If you are unable to bring a resource online, use the procedure in the diagram to
find and fix the problem. You can view the logs through Cluster Manager or in the
/var/VRTSvcs/logs directory if you need to determine the cause of errors.
VCS log entries are written to engine_A.log and agent entries are written to
resource_A.log files.
Note: Some resources must be disabled and reenabled. Only resources whose
agents have open and close entry points, such as MultiNICB, require you to
disable and enable again after fixing the problem. By contrast, a Mount
resource does not need to be disabled if, for example, you incorrectly
specify the MountPoint attribute.
However, it is generally good practice to disable and enable regardless because it is
difficult to remember when it is required and when it is not. In addition, a resource
is immediately monitored upon enabling, which would indicate potential problems
with attribute specification.
More detail on performing tasks necessary for solving resource configuration
problems is provided in the following sections.
When you enable a resource, VCS calls the agent to immediately monitor the resource and then continues to direct the agent to monitor the resource periodically.
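For example, to correct a mistyped Address attribute on the appip resource, you could disable the resource, fix the attribute, and reenable it; the address shown is illustrative:
hares -modify appip Enabled 0
hares -modify appip Address 10.10.21.198
hares -modify appip Enabled 1
hares -display appip -attribute State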
It is important to clear faults for critical resources after fixing underlying problems
so that the system where the fault originally occurred can be a failover target for
the service group. In a two-node cluster, a faulted critical resource would prevent
the service group from failing back if another fault occurred. You can clear a
faulted resource on a particular system, or on all systems when the service group
can run.
Note: Persistent resource faults should be probed to force the agent to monitor the
resource immediately. Otherwise, the resource is not online until the next
OfflineMonitorInterval, up to five minutes.
Clearing and Probing Resources Using the CLI
To clear a faulted resource, type:
hares -clear resource [-sys system]
If the system name is not specified then the resource is cleared on all systems.
To probe a resource, type:
hares -probe resource -sys system
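For example, to clear a faulted appip resource on the s1 system and then force an immediate probe:
hares -clear appip -sys s1
hares -probe appip -sys s1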
Linking resources
When you link a parent resource to a child resource, the dependency becomes a
component of the service group configuration. When you save the cluster
configuration, each dependency is listed at the end of the service group definition,
after the resource specifications, in the format shown in the slide.
In addition, VCS creates a dependency tree in the main.cf file at the end of the
service group definition to provide a more visual view of resource dependencies.
This is not part of the cluster configuration, as denoted by the // comment
markers.
// resource dependency tree
//
//	group appsg
//	{
//	IP appip
//	    {
//	    NIC appnic
//	    }
//	}
Note: You cannot use the // characters as general comment delimiters. VCS strips
out all lines with // upon startup and re-creates these lines based on the
requires statements in the main.cf file.
Resource dependencies
VCS enables you to link resources to specify dependencies. For example, an IP
address resource is dependent on the NIC providing the physical link to the
network.
Ensure that you understand the dependency rules shown in the slide before you
start linking resources.
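For example, to link the appip resource (parent) to the appnic resource (child) and then display the dependency:
haconf -makerw
hares -link appip appnic
haconf -dump -makero
hares -dep appip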
You can also run fire drills using the havfd command, as shown in the slide.
Note: When you set an attribute to a default value, the attribute is removed from
main.cf. For example, after you set Critical to 1 for a resource, the
Critical = 0 line is removed from the resource configuration because
it is now set to the default value for the resource type.
To see the values of all attributes for a resource, use the hares command. For
example:
hares -display appdg
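You can also display or set a single attribute value. For example, to check the Critical attribute of the appdg resource and set it back to the default:
hares -value appdg Critical
hares -modify appdg Critical 1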
Labs and solutions for this lesson are located on the following pages.
Lab 8: Online configuration of a service group, page A-83.
Lab 8: Online configuration of a service group, page B-167.
Lesson 9
Offline Configuration
You can copy the configuration files from the original cluster, make the necessary
changes, and then restart VCS as described later in this lesson. This method may
be more efficient than creating each service group and resource using a graphical user interface or the VCS command-line interface.
In the example shown in the slide, the portion of the main.cf file that defines the
extwebsg service group is copied and edited as necessary to define a new intwebsg
service group.
After the cluster configuration is copied to the real cluster and VCS is restarted,
you must perform complete testing of all objects, as shown later in this lesson.
New cluster
The diagram illustrates a process for modifying the cluster configuration when you
are configuring your first service group and do not already have services running
in the cluster. Select one system to be your primary node for configuration. Work
from this system for all steps up to the final point of restarting VCS.
1 Save and close the configuration.
Always save and close the configuration before making any modifications.
This ensures the configuration in the main.cf file on disk is the most recent
in-memory configuration.
2 Change to the configuration directory.
The examples used in this procedure assume you are working in the /etc/VRTSvcs/conf/config directory.
3 Stop VCS.
Stop VCS on all cluster systems. This ensures that there is no possibility of
another administrator changing the cluster configuration while you are
modifying the main.cf file.
4 Edit the configuration files.
You must choose a system on which to modify the main.cf file. You can
choose any system. However, you must then start VCS first on that system.
5 Verify the configuration file syntax.
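A command-level sketch of this procedure, run on the primary configuration node, might look like the following; it assumes no service groups are yet online:
haconf -dump -makero
cd /etc/VRTSvcs/conf/config
hastop -all
vi main.cf
hacf -verify .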
Existing cluster
The diagram illustrates a process for modifying the cluster configuration when you
want to minimize the time that VCS is not running to protect existing services.
This procedure includes several built-in protections from common configuration
errors and maximizes high availability.
First system
Designate one system as the primary change management node. This makes
troubleshooting easier if you encounter problems with the configuration.
1 Save and close the configuration.
Save and close the cluster configuration before you start making changes. This
ensures that the working copy has the latest in-memory configuration.
2 Back up the main.cf file.
Make a copy of the main.cf file with a different name. This ensures that you
have a backup of the configuration that was in memory when you saved the
configuration to disk.
Note: If any *types.cf files are being modified, also back up these files.
3 Make a staging directory.
Make a subdirectory of /etc/VRTSvcs/conf/config in which you can
edit a copy of the main.cf file. This helps ensure that your edits are not
overwritten if another administrator changes the configuration simultaneously.
4 Copy the configuration files.
Note: The dot (.) argument indicates that the current working directory is used as
the path to the configuration files. You can run hacf -verify from any
directory by specifying the path to the configuration directory:
hacf -verify /etc/VRTSvcs/conf/config
10 Stop VCS.
Stop VCS on all cluster systems after making configuration changes. To leave
applications running, use the -force option, as shown in the diagram.
11 Copy the new configuration file.
Copy the modified main.cf file and all *types.cf files from the staging
directory back into the configuration directory.
12 Start VCS.
Start VCS first on the system with the modified main.cf file.
Verify that VCS is in a local build or running state on the primary system.
13 Start other systems.
After VCS is in a running state on the first system, start VCS on all other
systems. You must wait until the first system has built a cluster configuration
in memory and is in a running state to ensure the other systems perform a
remote build from the first system's configuration in memory.
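A command-level sketch of the existing-cluster procedure follows, assuming a staging directory named stage and s1 as the change management node:
haconf -dump -makero
cd /etc/VRTSvcs/conf/config
cp main.cf main.cf.save
mkdir stage
cp main.cf types.cf stage
cd stage
vi main.cf
hacf -verify .
hastop -all -force
cp main.cf /etc/VRTSvcs/conf/config
hastart
hastatus -sum
After s1 reaches the running state, run hastart on each remaining system.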
Ensure that VCS builds the new configuration in memory on the system where the
changes were made to the main.cf file. All other systems must wait for the build
to successfully complete and the system to transition to the running state before
VCS is started elsewhere.
1 Run hastart on s1 to start the had and hashadow processes.
2 HAD checks for a valid main.cf file.
3 HAD checks for an active cluster configuration on the cluster interconnect.
4 Because there is no active cluster configuration, HAD on s1 reads the local
main.cf file and loads the cluster configuration into local memory on s1.
5 Verify that VCS is in a local build or running state on s1 using hastatus -sum.
6 When VCS is in a running state on s1, run hastart on s2 to start the had and
hashadow processes.
7 HAD on s2 checks for a valid main.cf file.
8 HAD on s2 checks for an active cluster configuration on the cluster
interconnect.
9 The s1 system sends a copy of the cluster configuration over the cluster
interconnect to s2.
10 The s2 system performs a remote build to load the new cluster configuration in
memory.
11 HAD on s2 backs up the existing main.cf and types.cf files and saves the
current in-memory configuration to disk.
Resource dependencies
Ensure that you create the resource dependency definitions at the end of the
service group definition. Add the links using the syntax shown in the slide.
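For example, the dependency between the IP and NIC resources in the appsg group shown in the previous lesson is defined by a single line at the end of the service group definition:
appip requires appnic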
A portion of the completed main.cf file with the new service group definition
for intwebsg is displayed in the slide. This service group was created by copying
the extwebsg service group definition and changing the attribute names and values.
Use the offline configuration procedure to restart VCS using the recovered
main.cf file.
Note: You must ensure that VCS is in the local build or running state on the
system with the recovered main.cf file before starting VCS on other
systems.
7 When HAD is in a running state on s1, this state change is broadcast on the
cluster interconnect by GAB.
8 Next, run hastart on s2 to start HAD.
9 HAD on s2 checks for a valid main.cf file. This system has an old version of
the main.cf.
10 HAD on s2 then checks for another node in a local build or running state.
11 Since s1 is in a local build or running state, HAD on s2 performs a remote
build from the configuration on s1.
12 HAD on s2 copies the cluster configuration into the local main.cf and
types.cf files after moving the original files to backup copies with
timestamps.
If you need to make additional modifications, you can use one of the online tools
or modify the configuration files using the offline procedure.
Labs and solutions for this lesson are located on the following pages.
Lab 9: Offline configuration, page A-95.
Lab 9: Offline configuration, page B-199.
Lesson 10
Configuring Notification
Notification overview
When VCS detects certain events, you can configure the notifier to:
Generate an SNMP (V2) trap to specified SNMP consoles.
Send an e-mail message to designated recipients.
Message queue
VCS ensures that no event messages are lost while the VCS engine is running,
even if the notifier daemon stops or is not started. The had daemons
throughout the cluster communicate to maintain a replicated message queue.
If the service group with notifier configured as a resource fails on one of the nodes,
notifier fails over to another node in the cluster. Because the message queue is
guaranteed to be consistent and replicated across nodes, notifier can resume
message delivery from where it left off after it fails over to the new node.
Messages are stored in the queue until one of these conditions is met:
The notifier daemon sends an acknowledgement to had that at least one
recipient has received the message.
The queue is full. The queue is circular: the last (oldest) message is deleted in
order to write the current (newest) message.
Messages that have been in the queue for one hour are deleted if notifier is
unable to deliver them to a recipient.
Note: Before the notifier daemon connects to had, messages are stored
permanently in the queue until one of the last two conditions is met.
You can view the entries in the message queue using the haclus -notes
command. You can also delete all queued messages on all cluster nodes using
haclus -delnotes, but the notifier must be stopped first.
The administrator can configure notifier to specify which recipients are sent
messages based on the severity level.
The table in the slide shows how the notifier levels shown in e-mail messages
compare to the log file codes for corresponding events. Notice that notifier
SevereError events correlate to CRITICAL entries in the engine log.
Configuring notification
Configuration methods
Although you can start and stop the notifier daemon manually outside of VCS,
you should make the notifier component highly available by placing the daemon
under VCS control.
You can configure VCS to manage the notifier manually using the command-line
interface or Veritas Operations Manager, or set up notification during initial cluster
configuration.
Notification configuration
These high-level tasks are required to manually configure highly available
notification within the ClusterService group.
1 Add a NotifierMngr type of resource to the ClusterService group.
Link the resource to the csgnic resource that is present in the group.
2 If SMTP notification is required:
a Modify the SmtpServer and SmtpRecipients attributes of the NotifierMngr
type of resource.
b Optionally, modify the ResourceOwner attribute of individual resources.
c Optionally, specify a GroupOwner e-mail address for each service group.
3 If SNMP notification is required:
a Modify the SnmpConsoles attribute of the NotifierMngr type of resource.
b Verify that the SNMPTrapPort attribute value matches the port configured
for the SNMP console. The default is port 162.
c Configure the SNMP console to receive VCS traps (described later in the
lesson).
4 Modify any other optional attributes of the NotifierMngr type of resource.
See the manual pages for notifier and hanotify for a complete description
of notification configuration options.
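A command-line sketch of the SMTP portion of this procedure follows; the resource name (ntfr), SMTP server, and recipient address are examples only:
haconf -makerw
hares -add ntfr NotifierMngr ClusterService
hares -modify ntfr SmtpServer smtp.example.com
hares -modify ntfr SmtpRecipients admin@example.com Warning
hares -link ntfr csgnic
hares -modify ntfr Enabled 1
haconf -dump -makero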
The example in the slide shows the configuration of a notifier resource for e-mail
notification.
See the Veritas Cluster Server Bundled Agents Reference Guide for detailed
information about the NotifierMngr agent.
Note: Before modifying resource attributes, ensure that you take the resource
offline and disable it. The notifier daemon must be stopped and
restarted with new parameters in order for changes to take effect.
These attributes are specified by a list of e-mail addresses along with the severity
level. Registered users receive only those events whose severity is equal to or
greater than the severity requested. For example, if janedoe is configured in the
ClusterRecipients attribute with a severity level of Warning, she would get events
of severity Warning, Error, and SevereError but would not get events with
severity Information. A cluster event, such as a cluster fault, which is Error
level, would be sent to janedoe.
SNMP traps sent by VCS are then displayed in the HP OpenView NNM SNMP
console.
Overview of triggers
Using triggers
VCS provides an additional method for notifying users of important events. When
VCS detects certain events, you can configure a trigger to notify an administrator
or perform other actions. You can use event triggers in place of, or in conjunction
with, notification.
Triggers are executable programs, batch files, shell or Perl scripts associated with
the predefined event types supported by VCS that are shown in the slide.
Sample triggers
A set of sample trigger scripts is provided in /opt/VRTSvcs/bin/sample_triggers.
These scripts can be copied to /opt/VRTSvcs/bin/triggers and modified to your
specifications.
The sample scripts include comments that explain how the trigger is invoked and
provide guidance about modifying the samples to your specifications.
Location of triggers
Trigger executable programs, batch files, shell or Perl scripts reside in
/opt/VRTSvcs/bin/triggers by default.
You can change the location of triggers by specifying the TriggerPath attribute at
the service group or resource level. This attribute enables you to set up different
trigger programs for resources or service groups. In previous versions of VCS, the
same triggers applied to all resources or service groups in the cluster.
The example portion of the main.cf file shows the PREONLINE trigger enabled
for websg on both s1 and s2, and the trigger path customized to map to
/opt/VRTSvcs/bin/websg.
Example configuration
The slide shows the basic procedure for creating a trigger using a sample script
provided with VCS.
In this case, the resfault script is copied from the sample_triggers
directory and then modified to use the Linux /bin/mail program to send e-mail
to the modified recipients list.
The only changes required to make use of the sample resfault trigger in this
example are the following two lines:
@recipients=("student\@mgt.example.com");
. . .
"/bin/mail -s resfault $recipient < $msgfile";
After a trigger is modified, you must ensure the file is executable by root, and then
copy the script or program to each system in the cluster that can run the trigger.
Finally, modify the TriggersEnabled attribute to specify the key for each system
that can run the trigger.
To use multiple files for a single trigger, you must specify a custom path using the
TriggerPath attribute.
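For example, to enable the PREONLINE trigger for the websg group on both systems and confirm the custom trigger path, assuming the configuration is writable:
hagrp -modify websg TriggersEnabled PREONLINE -sys s1
hagrp -modify websg TriggersEnabled PREONLINE -sys s2
hagrp -value websg TriggerPath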
Labs and solutions for this lesson are located on the following pages.
Lab 10: Configuring notification, page A-103.
Lab 10: Configuring notification, page B-215.
Lesson 11
Handling Resource Faults
The default failover behavior for a service group can be modified using one or
more optional service group attributes. Failover determination and behavior are
described throughout this lesson.
VCS responds in a specific and predictable manner to faults. When VCS detects a
resource failure, it performs the following actions:
1 Instructs the agent to execute the clean entry point for the failed resource to
ensure that the resource is completely offline
The resource transitions to a FAULTED state.
2 Takes all resources in the path of the fault offline starting from the faulted
resource up to the top of the dependency tree
3 If an online critical resource is part of the path that was faulted or taken offline,
faults the service group and takes the group offline to prepare for failover
If no online critical resources are affected, no more action occurs.
4 Attempts to start the service group on another system in the SystemList
attribute according to the FailOverPolicy defined for that service group and the
relationships between multiple service groups
Failover policies and the impact of service group interactions during failover
are discussed in detail in the Veritas Cluster Server for UNIX: Manage and
Administer course.
Note: The state of the group on the new system prior to failover must be
offline (not faulted).
5 If no other systems are available, the service group remains offline.
VCS also executes certain triggers and carries out notification while it performs
each task in response to resource faults. The role of notification and event triggers
in resource faults is explained in detail later in this lesson.
Frozen or TFrozen
These service group attributes are used to indicate that the service group is frozen
due to an administrative command. When a service group is frozen, all agent
online and offline actions are disabled.
If the service group is temporarily frozen using the hagrp -freeze
group command, the TFrozen attribute is set to 1.
If the service group is persistently frozen using the hagrp -freeze group
-persistent command, the Frozen attribute is set to 1.
When the service group is unfrozen using the hagrp -unfreeze group
[-persistent] command, the corresponding attribute is set back to the
default value of 0.
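For example, to temporarily freeze the appsg group, check the attribute, and unfreeze it:
hagrp -freeze appsg
hagrp -value appsg TFrozen
hagrp -unfreeze appsg
A persistent freeze requires a writable configuration (haconf -makerw) because the Frozen attribute is stored in the main.cf file.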
AutoFailOver
This attribute determines whether automatic failover takes place when a resource
or system faults. The default value of 1 indicates that the service group should be
failed over to other available systems if at all possible. However, if the attribute is
set to 0, no automatic failover is attempted for the service group, and the service
group is left in an OFFLINE | FAULTED state.
For example, in case A in the slide, the clean entry point is executed for resource 4
to ensure that it is offline, and resources 7 and 6 are taken offline because they
depend on 4. Because 4 is a critical resource, the rest of the resources are taken
offline from top to bottom, and the group is then failed over to another system.
For more information on attributes that affect failover, refer to the Veritas Cluster
Server Bundled Agents Reference Guide.
Adjusting monitoring
You can change some resource type attributes to facilitate failover testing. For
example, you can change the monitor interval to see the results of faults more
quickly. You can also adjust these attributes to affect how quickly an application
fails over when a fault occurs.
MonitorInterval
This is the duration (in seconds) between two consecutive monitor calls for an
online or transitioning resource.
The default is 60 seconds for most resource types.
OfflineMonitorInterval
This is the duration (in seconds) between two consecutive monitor calls for an
offline resource. If set to 0, offline resources are not monitored.
The default is 300 seconds for most resource types.
Refer to the Veritas Cluster Server Bundled Agents Reference Guide for the
applicable monitor interval defaults for specific resource types.
For best results, measure the length of time required to bring a resource online,
take it offline, and monitor it before modifying the defaults. Simply issue an online
or offline command to measure the time required for each action. To measure how
long it takes to monitor a resource, fault the resource, and then issue a probe, or
bring the resource online outside of VCS control and issue a probe.
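For example, to shorten the monitor interval for all Mount resources during failover testing and confirm the change; note that a type-level change affects every resource of that type:
haconf -makerw
hatype -modify Mount MonitorInterval 30
haconf -dump -makero
hatype -value Mount MonitorInterval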
If you have determined that a resource can be restarted without impacting the
integrity of the application, you can potentially avoid service group failover by
configuring the RestartLimit, ConfInterval, and ToleranceLimit resource type
attributes.
For example, you can set the ToleranceLimit to a value greater than 0 to allow the
monitor entry point to run several times before a resource is determined to be
faulted. This is useful when the system is very busy and a service, such as a
database, is slow to respond.
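As a sketch, assuming the application is managed by a Process resource, the type-level attributes could be set as follows; adjust the values to suit your environment:
haconf -makerw
hatype -modify Process RestartLimit 1
hatype -modify Process ConfInterval 180
hatype -modify Process ToleranceLimit 1
haconf -dump -makero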
Restart example
This example illustrates how the RestartLimit and ConfInterval attributes can be
configured for modifying the behavior of VCS when a resource is faulted.
Setting RestartLimit = 1 and ConfInterval = 180 has this effect when a resource
faults:
1 The resource stops after running for 10 minutes.
2 The next monitor returns offline.
3 The ConfInterval counter is set to 0.
4 The agent checks the value of RestartLimit.
5 The resource is restarted because RestartLimit is set to 1, which allows one
restart within the ConfInterval period.
6 The next monitor returns online.
7 The ConfInterval counter is now 60; one monitor cycle has completed.
8 The resource stops again.
9 The next monitor returns offline.
10 The ConfInterval counter is now 120; two monitor cycles have completed.
11 The resource is not restarted because the RestartLimit counter is now 1 and the
ConfInterval counter is 120 (seconds). Because the resource has not been
online for the ConfInterval time of 180 seconds, it is not restarted.
12 VCS faults the resource.
If the resource had remained online for 180 seconds, the internal RestartLimit
counter would have been reset to 0.
Note: You can also run hagrp -clear group [-sys system] to clear
all FAULTED resources in a service group. However, you have to ensure
that all of the FAULTED resources are completely offline and the faults are
fixed on all the corresponding systems before running this command.
The FAULTED status of a resource is cleared when the monitor returns an online
status for that resource. Note that offline resources are monitored according to the
value of OfflineMonitorInterval, which is 300 seconds (five minutes) by default.
To avoid waiting for the periodic monitoring, you can initiate the monitoring of the
resource manually by probing the resource.
As a response to a resource fault, VCS carries out tasks to take resources or service
groups offline and to bring them back online elsewhere in the cluster. While
carrying out these tasks, VCS generates certain messages with a variety of severity
levels and the VCS engine passes these messages to the notifier daemon.
Whether these messages are used for SNMP traps or SMTP notification depends
on how the notification component of VCS is configured, as described in the
Configuring Notification lesson.
The following events are examples that result in a notification message being
generated:
A resource becomes offline unexpectedly; that is, a resource is faulted.
VCS cannot determine the state of a resource.
A failover service group is online on more than one system.
The service group is brought online or taken offline successfully.
The service group has faulted on all nodes where the group could be brought
online, and there are no nodes to which the group can fail over.
Labs and solutions for this lesson are located on the following pages.
Lab 11: Configuring resource fault behavior, page A-113.
Lab 11: Configuring resource fault behavior, page B-239.
Lesson 12
Intelligent Monitoring Framework
IMF overview
Drawbacks of traditional monitoring
The Intelligent Monitoring Framework was created to meet customer demands for
supporting increasing numbers of highly available services. Some environments
are supporting large numbers of resources (hundreds of mount points, for example)
running on already loaded systems. With traditional monitoring, VCS agents poll
each resource every 60 seconds, by default, which can add substantially to the
system load in large-scale environments. The periodic nature of traditional
monitoring, coupled with the requirement to run the monitor process for each
resource, results in the state of the resource being unknown between monitor
cycles, and requires additional system resources.
IMF reduces the VCS CPU footprint and system load, especially in large-scale
clusters with many resources. IMF also provides faster failure detection, and hence
faster failover, of resources, improving availability.
The AMF module passes the notification to the agent for handling, as described
later in the lesson.
In addition, you can create custom agents that use IMF monitoring by linking the
AMF plug-ins with the script agent and creating an XML file to enable registration
with the AMF module. For more information about using IMF monitoring for
custom agents, see the VCS Agent Developer's Guide.
IMF configuration
IMF modes
The Mode key of the IMF attribute determines whether IMF or traditional
monitoring is configured for a resource. Accepted values are:
0: Does not perform intelligent resource monitoring
1: Performs intelligent resource monitoring for offline resources and poll-based
monitoring for online resources
2: Performs intelligent resource monitoring for online resources and poll-based
monitoring for offline resources
3: Performs intelligent resource monitoring for both online and offline resources
The RegisterRetryLimit key defines the number of times the agent tries to register
a resource with the AMF module. If the resource cannot be registered within the
specified number of attempts, intelligent monitoring is disabled and the resource is
monitored using traditional poll-based monitoring.
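For example, to enable intelligent monitoring of both online and offline Mount resources at the type level; the -update option changes a single key of the IMF association attribute:
haconf -makerw
hatype -modify Mount IMF -update Mode 3
haconf -dump -makero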
Poll-based monitoring is performed by checking the process table for the process
IDs listed in the PidFile.
The Oracle resource and second level monitoring are described in detail in the
"Configuring Databases" lesson.
Note: This is only required if IMF has been disabled for a resource type.
If you run haimfconfig -enable -amf, the AMF kernel module is loaded,
but agents are not configured.
You can enable or disable IMF for a specific set of agents using the -agent
option. See the haimfconfig man page and the Veritas Cluster Server
Administrator's Guide for details about changing the IMF configuration.
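Based on the -agent option described above, enabling IMF for only the Mount and Process agents might look like the following; verify the exact syntax against the haimfconfig man page for your version:
haimfconfig -enable -agent Mount Process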
See the Handling Resource Faults lesson for details about how attributes affect
how VCS responds to resource faults.
For IMF-monitored resources, a fault is detected and the agent probes the resource
immediately to determine the resource state.
When a process dies or hangs, the operating system generates an alert. The agent is
registered to receive such alerts from the operating system, through the AMF
kernel module. The agent then probes the resource to determine the state and
notifies HAD if the resource is faulted. HAD can then take action within seconds
of a resource fault, rather than minutes, as with poll-based monitoring.
The methods for clearing faults on IMF-monitored resources are the same as for
resources managed by agents using traditional poll-based monitoring, as shown
in the slide.
Labs and solutions for this lesson are located on the following pages.
Lab 12: IMF and AMF, page A-131.
Lab 12: IMF and AMF, page B-285.
Lesson 13
Cluster Communications
Atomic means that all systems receive updates, or all systems are rolled back to the
previous state, much like a database atomic commit. If a failure occurs while
transmitting status changes, GAB's atomicity ensures that, upon recovery, all
systems have the same information regarding the status of any monitored resource
in the cluster.
VCS on-node communications
VCS uses agents to manage resources within the cluster. Agents perform resource-specific tasks on behalf of had, such as online, offline, and monitoring actions.
These actions can be initiated by an administrator issuing directives using the VCS
graphical or command-line interfaces, or by other events that require had to take
some action. Agents also report resource status back to had. Agents do not
communicate with one another, but only with had.
The had processes on each cluster system communicate cluster status information
over the cluster interconnect.
Note: The port a, port b, and port h generation numbers change each time the
membership changes.
Each entry in the llthosts file maps an LLT node number to a system name, in the format: node_number name
LLT uses the set-cluster directive to assign a unique number to each cluster.
A cluster ID is set during installation and can be validated as a unique ID among
all clusters sharing a network for the cluster interconnect.
Note: You can use the same cluster interconnect network infrastructure for
multiple clusters. The llttab file must specify the appropriate cluster ID
to ensure that there are no conflicting node IDs.
If you bypass the installer mechanisms for ensuring the cluster ID is unique and
LLT detects multiple systems with the same node ID and cluster ID on a private
network, the LLT interface is disabled on the node that is starting up. This prevents
a possible split-brain condition, where a service group might be brought online on
the two systems with the same node ID.
Note: If the sysname file contains a different name from the llttab/llthosts/main.cf
files, a phantom system is added to the cluster upon cluster startup.
The sysname file can be specified for the set-node directive in the llttab
file. In this case, the llttab file can be identical on every node, which may
simplify reconfiguring the cluster interconnect in some situations.
See the sysname manual page for a complete description of the file.
This example starts GAB and specifies that four systems are required to be running
GAB before the cluster seeds. The -n option should always be set to the total
number of systems in the cluster.
A sample gabtab file is included in /opt/VRTSgab.
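For reference, the gabtab entry for a four-node cluster such as the one described above typically contains a single command; the path can vary by platform:
/sbin/gabconfig -c -n4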
Note: Other gabconfig options are discussed later in this lesson. See the
gabconfig manual page for a complete description of the file.
By default, a system is not seeded when it boots. This prevents VCS from starting,
which prevents applications (service groups) from starting. If the system cannot
communicate with the cluster, it cannot be seeded.
Seeding is a function of GAB and is performed automatically or manually,
depending on how GAB is configured. GAB seeds a system automatically in one
of two ways:
When an unseeded system communicates with a seeded system
When all systems in the cluster are unseeded and able to communicate with
each other
The number of systems that must be seeded before VCS is started on any system is
also determined by the GAB configuration.
When the cluster is seeded, each node is listed in the port a membership displayed
by gabconfig -a. In the following example, all four systems (nodes 0, 1, 2,
and 3) are seeded, as shown by port a membership:
# gabconfig -a
GAB Port Memberships
=======================================================
Port a gen a356e003
membership 0123
AIX:
/etc/rc.d/rc2.d/S92gab (calls /etc/gabtab)
/etc/rc.d/rc2.d/S99vcs (runs /opt/VRTSvcs/bin/hastart)
HP-UX:
/sbin/rc2.d/S680llt
/sbin/rc2.d/S920gab (calls /etc/gabtab)
/sbin/rc2.d/S990vcs (runs /opt/VRTSvcs/bin/hastart)
Linux:
/etc/rc[2345].d/llt
/etc/rcX.d/gab (calls /etc/gabtab)
/etc/rcX.d/vcs (runs /opt/VRTSvcs/bin/hastart)
Solaris 10:
/lib/svc/method/llt
/lib/svc/method/gab (calls /etc/gabtab)
/lib/svc/method/vcs (runs /opt/VRTSvcs/bin/hastart)
In this case, port a membership is complete, but port h is not. VCS cannot detect
whether a service is running on a system where HAD is not running. Rather than
allowing a potential concurrency violation to occur, VCS prevents the service
group from starting anywhere until all resources are probed on all systems.
After all resources are probed on all systems, a service group can come online by
bringing offline resources online. If the resources are already online, as in the case
where HAD has been stopped with the hastop -all -force option, the
resources are marked as online.
Prior to any failures, systems s1, s2, and s3 are part of the regular membership of
cluster number 1. When the s3 system fails, it is no longer part of the cluster
membership. Service group C fails over and starts up on either s1 or s2, according
to the SystemList and FailOverPolicy values.
When a system faults, application services that were running on that system are
disrupted until the services are started up on another system in the cluster. The time
required to address a system fault is a combination of the time required to:
Detect the system failure.
A system is determined to be faulted according to these default timeout
periods:
LLT timeout: If LLT on a running system does not receive a heartbeat from
a system for 16 seconds, LLT notifies GAB of a heartbeat failure.
GAB stable timeout: GAB determines that a membership change is
occurring, and after five seconds, GAB delivers the membership change to
HAD.
Select a failover target.
The time required for the VCS policy module to determine the target system is
negligible, less than one second in all cases, in comparison to the other factors.
Bring the service group online on another system in the cluster.
As described in an earlier lesson, the time required for the application service
to start up is a key factor in determining the total failover time.
Manual seeding
You can override the seed values in the gabtab file and manually force GAB to
seed a system using the gabconfig command. This is useful when one of the
systems in the cluster is out of service and you want to start VCS on the remaining
systems.
To seed the cluster if GAB is already running, use gabconfig with the -x
option to override the -n value set in the gabtab file. For example, type:
gabconfig -x
If GAB is not already started, you can start and force GAB to seed using -c and
-x options to gabconfig:
gabconfig -c -x
CAUTION
Only manually seed the cluster when you are sure that no other
systems have GAB seeded. In clusters that do not use I/O fencing,
you can potentially create a split brain condition by using
gabconfig improperly.
After you have started GAB on one system, start GAB on other systems using
gabconfig with only the -c option. You do not need to force GAB to start with
the -x option on other systems. When GAB starts on the other systems, it
determines that GAB is already seeded and starts up.
The only change is that other systems are prevented from starting service groups
when a system faults. VCS continues to operate as a single cluster when at least one
network channel exists between the systems.
In the example shown in the diagram where one LLT link fails:
A jeopardy membership is formed that includes just system s3.
System s3 is also a member of the regular cluster membership with systems s1
and s2.
Service groups A, B, and C continue to run and all other cluster functions
remain unaffected.
Failover due to a resource fault or an operator request to switch a service group
is unaffected.
If system s3 now faults or its last LLT link is lost, service group C is not started
on systems s1 or s2.
If an application starts on multiple systems and can gain control of what are
normally exclusive resources, such as disks in a shared storage device, a split-brain
condition results and data can be corrupted.
When the low-priority link is the only remaining LLT link, LLT switches all
cluster status traffic over the link. Upon repair of any configured link, LLT
switches cluster status traffic back to the high-priority link.
Notes:
Nodes must be on the same public network segment in order to configure low-priority links. LLT is a non-routable protocol.
You can have up to eight LLT links total, which can be a combination of low- and high-priority links. For example, if you have three high-priority links, you
have the same progression to jeopardy membership. The difference is that all
three links are used for regular heartbeats and cluster status information.
For example, if you added a system to a running cluster, you can change the value
of -n in the gabtab file without having to restart GAB. However, if you added
the -j option to change the recovery behavior, you must either restart GAB or
run the command in the gabtab file manually for the change to take effect.
Similarly, if you add a host entry to llthosts, you do not need to restart LLT.
However, if you change llttab, or you change a host name in llthosts, you
must stop and restart LLT, and, therefore, GAB.
Following this procedure ensures that any type of change takes effect. You can also
use the scripts in the rc*.d directories to stop and start services.
Labs and solutions for this lesson are located on the following pages.
Lab 13: Cluster communications, page A-137.
Lab 13: Cluster communications, page B-299.
Lesson 14
Data Protection Using SCSI 3-Based Fencing
For example, in the diagram, if the system on the right fails, it stops sending
heartbeats over the private interconnect. The left node then takes corrective action.
Failure of the cluster interconnect presents identical symptoms. In this case, both
nodes determine that their peer has departed and attempt to take corrective action.
This can result in data corruption if both nodes are able to take control of storage in
an uncoordinated manner.
Other scenarios can cause this situation. If a system is so busy that it appears to be
hung, to another system in the cluster it would seem to have failed. The second
system would then take the corrective action of starting the services of the hung
system. This can also happen on systems where the hardware supports a break and
resume function. If the system is dropped to command-prompt level with a break
and subsequently resumed, the system can appear to have failed. The cluster is
reformed and then the system recovers and begins writing to shared storage again.
I/O fencing
The key to protecting data in a shared storage cluster environment is to guarantee
that there is always a single consistent view of cluster membership. In other words,
when one or more systems stop sending heartbeats, the HA software must
determine which systems can continue to participate in the cluster membership and
how to handle the other systems.
VCS uses a mechanism called I/O fencing to guarantee data protection. I/O
fencing uses SCSI-3 persistent reservations (PR) to fence out data drives to prevent
the data loss consequences of split-brain condition. Fencing ensures that data
protection is the highest priority concern, stopping running systems when
necessary to ensure that systems cannot start services when a split-brain condition
is encountered, as described in detail in this lesson.
SCSI-3 PR supports multiple nodes accessing a device while at the same time
blocking access to other nodes. Persistent reservations are persistent across SCSI
bus resets and also support multiple paths from a host to a disk.
Coordinator disks
The coordinator disks act as a global lock mechanism used by the fencing driver to
determine which nodes are currently registered in the cluster. This registration is
represented by a unique key associated with each node that is written to the
coordinator disks. In order for a node to access a data disk, that node must have a
key registered on coordinator disks.
When system or interconnect failures occur, the coordinator disks enable the
fencing driver to ensure that only one cluster survives, as described in the I/O
Fencing Operations section.
Data disks
Data disks are standard disk devices used for shared data storage. These can be
physical disks or RAID logical units (LUNs). These disks must support SCSI-3
PR. Data disks are incorporated into standard VM disk groups. In operation,
Volume Manager is responsible for fencing data disks on a disk group basis.
Disks added to a disk group are automatically fenced, as are new paths to a device
as they are discovered.
For example, in a cluster with an ID of 8, node 0 uses key VF000800, node 1 uses
key VF000801, node 2 is VF000802, and so on. For simplicity, these are shown
as 0 and 1 in subsequent diagrams.
Note: The registration key is not actually written to disk, but is stored in the drive
electronics or RAID controller.
Registration keys for data disks are also based on the LLT node number. Each key
is eight characters (bytes), specified as follows:
The first byte (left-most character) is the LLT node number added to the
hexadecimal number A.
For example, the first byte for LLT node 0 is formed by adding hexadecimal A
to 0, which yields A.
The first byte of LLT node 1 is hexadecimal A plus 1, which yields B.
The next three bytes are the ASCII characters VCS, indicating the keys are
written by the VCS fencing driver.
The final four bytes are null.
As shown in the table in the slide, node 0 uses key AVCS, node 1 uses key BVCS,
node 2 would be CVCS, and so on. For simplicity, these are shown as A and B in
the diagram.
After registering with the data disks, a Write Exclusive Registrants Only
reservation is set on the data disk. This reservation means that only the registered
system can write to the data disk.
All systems are aware of the keys of all other systems, forming a membership of
registered systems. This fencing membership, maintained by way of GAB port b,
is the basis for determining which nodes have access to the data disks. When
the fencing membership is complete, the fencing driver signals HAD and HAD can
then start building the cluster configuration.
At this point, HAD is initialized on each system and one system starts building the
cluster configuration. When HAD is running and all systems have the cluster
configuration in memory, VCS brings service groups online according to their
specified startup policies. When a disk group resource associated with a service
group is brought online, the Volume Manager disk group agent (DiskGroup)
imports the disk group and writes a SCSI-3 registration key to the data disks. This
registration is performed in a similar way to coordinator disk registration.
In the example shown in the diagram, node 0 is registered to write to the data disks
in the disk group belonging to the dbsg service group. Node 1 is registered to write
to the data disks in the disk group belonging to the appsg service group.
After registering with the data disk, Volume Manager sets a Write Exclusive
Registrants Only reservation on the data disk.
System failure
Interconnect failure
The diagram shows how VCS handles fencing if the cluster interconnect is severed
and a network partition is created. In this case, multiple nodes are racing for
control of the coordinator disks.
1 LLT on node 0 informs GAB that it has not received a heartbeat from node 1
within the timeout period. Likewise, LLT on node 1 informs GAB that it has
not received a heartbeat from node 0.
2 When the fencing drivers on both nodes receive a cluster membership change
from GAB, they begin racing to gain control of the coordinator disks.
The node that reaches the first coordinator disk (based on disk serial number)
ejects the failed node's key. In this example, node 0 wins the race for the first
coordinator disk and ejects the VF000801 (shown as 1 in the diagram) key.
After the B key is ejected by node 0, node 1 cannot eject the key for node 0
because the SCSI-PR protocol says that only a member can eject a member.
This condition means that only one system can win.
3 Node 0 also wins the race for the second coordinator disk.
Node 0 is favored to win the race for the second coordinator disk according to
the algorithm used by the fencing driver. Because node 1 lost the race for the
first coordinator disk, node 1 has to sleep for one second (default) before it
tries to eject the other node's key. This favors the winner of the first
coordinator disk to win the remaining coordinator disks. Therefore, node 1
does not gain control of the second or third coordinator disks.
4 After node 0 wins control of the majority of coordinator disks (all three in this
example), node 1 loses the race and calls a kernel panic to shut down
immediately and reboot.
5 Now port b (fencing membership) shows only node 0 because node 1's keys
have been ejected. Therefore, fencing has a consistent membership and passes
the cluster reconfiguration information to HAD.
6 GAB port h reflects the new cluster membership containing only node 0, and
HAD now performs the defined failover operations for the service groups that
were running on the departed system.
When a service group is brought online on a surviving system, fencing takes
place as part of the disk group importing process.
As demonstrated in the example failure scenarios, I/O fencing behaves the same
regardless of the type of failure:
The fencing drivers on each system race for control of the coordinator disks
and the winner determines cluster membership.
Reservations are placed on the data disks by Volume Manager when disk
groups are imported.
Majority clusters
The I/O fencing algorithm is designed to give priority to larger clusters in any
arbitration scenario. For example, if a single node is separated from a 16-node
cluster due to an interconnect fault, the 15-node cluster should continue to run. The
fencing driver uses the concept of a majority cluster. The algorithm determines if
the number of nodes remaining in the cluster is greater than or equal to the number
of departed nodes. If so, the larger cluster is considered a majority cluster. The
fencing driver gives the majority cluster advantage for winning the race for the
coordinator disks.
Fencing can be configured to override this default behavior and designate certain
nodes as being the preferred racing winners. See the Veritas Cluster Server
Administrator's Guide for information on configuring preferred fencing.
Each path to a drive represents a different I/O path. I/O fencing in VCS places the
same key on each path. For example, if node 0 has four paths to the first disk
group, all four paths have key AVCS registered. Later, if node 0 must be ejected,
VxVM preempts and aborts key AVCS, effectively ejecting all paths.
Because VxVM controls access to the storage, adding or deleting disks is not a
problem. VxVM fences any new drive added to a disk group and removes keys
when drives are removed. VxVM also determines if new paths are added and
fences these, as well.
HAD starts service groups.
Alternately, you can use the offline configuration method and manually make the
change in the main.cf file, but this also means you must manually configure all
other components.
The coordinator disks can be any three disks that support persistent reservations.
Symantec typically recommends using small LUNs (at least 150 MB) for
coordinator use. Using LUNs of at least 150 MB ensures that:
Certain array technologies interpret the LUN as a data device, not an internal
(gatekeeper) device.
Sufficient space is available for SCSI-3 support testing so that the private
region does not fill the disks.
You cannot use coordinator disks for any other purpose in the VCS configuration.
Do not store data on these disks or include the disks in disk groups used for data.
The data would not be protected and would interfere with the fencing process.
Using the coordinator=on option to vxdg for the coordinator disk group
ensures that the coordinator disk group has exactly three disks. This flag is set by
default when fencing is configured using the installer.
DMP support
VCS supports dynamic multipathing for both data and coordinator disks. The
/etc/vxfenmode file is used to set the mode for coordinator disks and these
sample files are provided for configuration:
/etc/vxfen.d/vxfenmode_scsi3_dmp
/etc/vxfen.d/vxfenmode_scsi3_raw
/etc/vxfen.d/vxfenmode_scsi3_sanvm
/etc/vxfen.d/vxfenmode_scsi3_disabled
/etc/vxfen.d/vxfenmode_scsi3_cps (for customized mode)
The following example shows the vxfenmode file contents for a DMP
configuration:
vxfen_mode=scsi3
scsi3_disk_policy=dmp
Fencing can be configured using the CPI installer with the -fencing option.
This enables you to configure fencing without manually modifying configuration
files.
You can also use the -R and -r options to view registrations.
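As a quick check, the vxfenadm -d command displays the fencing mode and the current port b membership; the exact options available vary by version:
vxfenadm -d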
Replacing coordinator disks
You can replace a coordinator disk using the vxfenswap command. See the Veritas
Cluster Server Administrator's Guide for detailed information.
If you inadvertently use reboot to shut down, you may see a message about a
pre-existing split brain condition when you try to restart the cluster. In this case,
you can use the vxfenclearpre utility described in the Veritas Cluster Server
Administrator's Guide.
Labs and solutions for this lesson are located on the following pages.
Lab 14: Configuring SCSI3 disk-based I/O fencing, page A-147.
Lab 14: Configuring SCSI3 disk-based I/O fencing, page B-327.
Lesson 15
Coordination points
The original implementation of I/O fencing for data protection supported only
SCSI 3 disk-based fencing using persistent reservations.
In VCS 5.1, server-based fencing was introduced to provide an additional
mechanism for fencing membership arbitration. The term coordination point refers
generically to any disk- or server-based object used to register coordinator keys.
Other terminology introduced for server-based fencing includes coordination point
(CP) client clusters, CP client nodes, and CP servers.
Use cases
One key use case for CPS-based fencing is for supporting campus, or stretch,
clusters. A campus cluster is a single cluster with nodes placed in separate
geographical locations to protect against environmental disruptions. This provides
a cost-effective disaster recovery solution when a server-based coordination point is
placed in a location separate from the storage arrays.
Another key use case is enterprise-scale configurations with large numbers of
clusters. Implementing CP servers can greatly reduce the number of SCSI 3 PR-compliant LUNs required for fencing.
The diagram in the slide shows a configuration where the CP server is a two-node
VCS cluster. In this configuration:
Fencing is configured using SCSI 3 coordinator disks to protect shared storage.
The CPS database is located on shared storage.
The CPSSG service group is managing the shared storage resources, as well as
the networking resources used by CP client clusters to connect over the public
network.
The table shows a conceptual view of the contents of the CPS database. Clusters
are identified by name and a cluster universal unique identifier (UUID) number.
Cluster nodes are associated with clusters by way of the UUID and also identified
by name. When registered, cluster nodes are assigned the value of 1. Unregistered
nodes have a value of 0 in the Registered field.
CP client clusters have the same requirements as VCS or SFHA with all packages
installed. There is no additional license needed on a CP client cluster to use a CPS
coordination point.
CPS operations
Arbitration with coordination points
Fencing behavior in a customized- or CPS-only configuration is logically the same
as SCSI 3-based fencing.
Upon startup, CP client cluster nodes register with the CP server to become a
member of an active cluster. The fencing driver on client nodes joins the fencing
membership on GAB port b. When all nodes are included in the port b
membership, the fencing driver notifies HAD to form port h membership.
When a fencing event occurs, client cluster nodes race to preempt the keys of other
nodes. In the case of a CPS coordination point, the node preempts the losing node
registration by way of the cpsadm utility. When two coordination points are
registered to the winning node, the fencing driver on the losing node panics the
node. The losing nodes are sometimes referred to as victim nodes.
Script-based fencing
A customized- or CPS-based fencing configuration is said to be using script-based
fencing. A disk-only based fencing configuration uses only the vxfen driver, the
same as legacy fencing in earlier VCS versions.
With script-based fencing, the vxfen driver still manages GAB memberships.
When a membership change occurs, vxfen notifies the new vxfend fencing
daemon.
The CP server registrations do not have keys. When CP client nodes register with a
CP server, the CPS database keeps track of which nodes are registered. However,
for purposes of illustration, the 0 and 1 shown on CP 3 indicate that nodes 0 and 1
are registered.
This occurs after the coordination point registration has completed and VCS is
started. Service groups with AutoStartList configured are automatically brought
online during VCS startup. When DiskGroup resources are brought online, disk
groups are imported and Volume Manager writes the reservation keys to all disks
in the disk group.
The first byte of the key is the ASCII character obtained by adding the LLT node
ID to the letter A: the node with LLT node ID of 0 writes AVCS keys to the data
disks, the node with LLT node ID of 1 writes BVCS keys, and so on.
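As a sketch of this mapping, the following awk one-liner derives the data-disk
key for a given LLT node ID; 65 is the ASCII code for the letter A.
   # awk -v id=1 'BEGIN { printf "%cVCS\n", 65 + id }'
   BVCS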
System failure
Data disk fencing takes place when a service group is brought online on a
surviving system as part of the disk group importing process. When the DiskGroup
resources come online, the agent online entry point instructs Volume Manager to
import the disk group with options to remove the node 1 registration and
reservation, and place a SCSI-3 registration and reservation for node 0.
Interconnect failure
As shown in the slide, the same type of procedure is performed in the case of a
network partition, where all links of the cluster interconnect fail simultaneously.
The difference in this case is that both nodes are racing for the coordination points.
In this example, node 0 again wins the race and ejects the registration keys for
node 1 from the coordination points, two of which are disks and one is a CP server.
When node 1 loses the race, the fencing driver panics the system, causing appsg to
fail over to node 0. The appsg disk group is imported on node 0 and Volume
Manager writes AVCS registrations on the disks and places a WERO (Write
Exclusive Registrants Only) reservation.
High availability management of the CPS components is configured when you set
up CPS.
You cannot specify a fully-qualified host name for the CPS name, even if you have
DNS configured with an FQHN.
The vxcpserv process is dependent on shared storage for the CPS database (in a
multinode CPS cluster) as well as the public network connection for access to CP
client cluster nodes.
Example /etc/vxfenmode
The vxfenmode file contains directives to specify the fencing mode and
mechanism:
scsi3: all coordination points are SCSI 3 disks
customized (with the cps mechanism): CP servers only, or a combination of CP
servers and disks as coordination points
The slide shows the contents of a sample vxfenmode file for a CP server cluster
with a disk-based fencing policy using dynamic multipathing.
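A minimal sketch of such a file, reduced to the two key directives, might look
like this:
   vxfen_mode=scsi3
   scsi3_disk_policy=dmp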
Note: This vxfenmode file is similar to the example shown in the Data Protection
Using SCSI 3-Based Fencing lesson, because both are disk-only fencing
configurations. Keep in mind this refers to fencing on the CPS cluster, not
the client cluster, which is shown later in this lesson.
Recall that in disk-only fencing configurations, the name of the disk group is not
present in the vxfenmode file. Instead, the disk group name is included in the
legacy vxfendg file.
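For example, on a disk-only configuration, the legacy file contains a single line
naming the coordinator disk group; the group name shown is the conventional
example name.
   # cat /etc/vxfendg
   vxfencoorddg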
If the number of IP resources online is lower than the value specified in the
Quorum attribute, the resource faults. In a multinode cluster, this causes the
CPSSG service group to fail over and bring the virtual IP addresses online on
another node.
As with disk-based fencing, you can use the -fencing option to installvcs
or installsfha to configure customized- or CPS-based fencing on client
clusters. You must have the CP server name or virtual IP address to enable the
utility to access the CP server to add the client cluster nodes and user accounts to
the CPS database.
Note: The CP server name is not the UNIX host name of the system on which the
CP server cluster is running. It is a virtual name that is configured when
CPS is configured.
The CP server name, fully-qualified host name (FQHN), and virtual IP should be
added to DNS so all client clusters can access the CP server by name. This enables
you to use a name in the client fencing configuration. The advantage is that if the
IP address for the CP server must be changed in the future, the client
configurations are not required to change. These values are located in the
vxcps.conf file on the CP server cluster.
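For example, you might launch the fencing configuration as follows; the installer
path is typical but can vary by release.
   # /opt/VRTS/install/installvcs -fencing
The utility prompts for the fencing mode and for the CP server and coordinator
disk details.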
The wizard then creates the vxfenmode file based on your selections and restarts
fencing and VCS. A date-stamped copy of the vxfenmode file is created as a
historical record of changes in the fencing configuration and can be used for
troubleshooting.
Finally, the vxfentab file is created listing the coordination points. In disk-only
fencing configurations, the file has the same contents as a legacy VCS vxfentab
file. In a customized fencing configuration, both the disk and the CP server
information are present.
The other two coordination points are SCSI 3 PR-compliant disks in a shared
storage device.
The slide shows samples of the other two configuration files on the CP client
cluster nodes:
The vxfentab file shows a customized mode fencing configuration with two
coordinator disks and a CP server with two virtual IP addresses.
Security is configured, and the CP server is not a single-node cluster.
The clusuuid file contains the universally unique cluster ID.
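For comparison, a client-side /etc/vxfenmode file for a customized configuration
with one CP server and a coordinator disk group might look like the following
sketch. The server name and disk group name are illustrative; 14250 is the
default CPS port.
   vxfen_mode=customized
   vxfen_mechanism=cps
   cps1=[cps1.example.com]:14250
   vxfendg=vxfencoorddg
   scsi3_disk_policy=dmp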
Volume Manager does not allow the coordinator flag to be set for a coordinator
disk group containing fewer than three coordinator disks. Therefore, in
customized fencing configurations with only one or two disks in the fencing disk
group, you cannot set the coordinator flag.
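For a disk group that meets the three-disk minimum, you can set the flag with
vxdg; the disk group name is illustrative.
   # vxdg -g vxfencoorddg set coordinator=on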
CPS administration
CPS user types and privileges
The CP server requires a user account with admin privileges for each client
cluster node. These user accounts are added to the CP server when customized
fencing is configured on client clusters.
The admin user account is necessary to register and unregister nodes during
fencing operations. You must also use an account with admin privileges to create
snapshots of the CPS database for backup purposes.
The operator and guest privileges are used when cpsadm is run on client cluster
nodes to test connectivity and display CP server objects. This also enables nonroot
users on the CP server to perform some CPS tasks.
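For example, to display the user accounts configured on a CP server, you can run
the following from a cluster node; the server name is illustrative.
   # cpsadm -s cps1.example.com -a list_users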
For example, if a CP server is not accessible and a client node leaves a cluster, the
registration is not automatically removed on that CP server. When that CP server
starts again, it has a stale registration and that client node cannot rejoin its cluster
until that stale registration is removed.
However, if you are manually configuring fencing, you can use
cpsadm to add clusters and nodes to the CPS configuration.
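A manual sequence might look like the following sketch. The flags follow the
documented add_clus and add_node actions; the cluster name, UUID placeholder,
host name, and node ID values are illustrative.
   # cpsadm -s cps1.example.com -a add_clus -c clus1 -u {uuid}
   # cpsadm -s cps1.example.com -a add_node -c clus1 -u {uuid} -h sym1 -n 0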
You can also change privileges for CPS user accounts. You can use cpsadm to
enable nonroot users on the CP server to perform admin-level operations.
You may want to set up a cron job to periodically create snapshots of the database.
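For example, a root crontab entry such as the following sketch takes a snapshot
nightly at 2 A.M. It assumes the db_snapshot action and the default cpsadm
location; the server name is illustrative.
   0 2 * * * /opt/VRTScps/bin/cpsadm -s cps1.example.com -a db_snapshot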
Recall that when changing from pure disk-based fencing to customized or
server-based fencing, you must first disable fencing if you want to reuse the same disk
group. If you use a new coordinator disk group, you can change from disk-based to
customized fencing without disabling fencing first.
By default, the CoordPoint resource type has the FaultTolerance attribute set to 1.
This means that the resource does not fault when the first key is discovered to be
missing. If you change FaultTolerance to 0, the CoordPoint resource faults when
the first missing key is discovered.
Recall that the client cluster is affected by missing keys only when fencing
operations occur. The resource fault is an indicator that a problem is affecting
registrations, and you can take preventive action to restore keys within the
running cluster.
The CoordPoint resource also faults if a coordination point is not accessible.
You can configure notification for this fault so that users are alerted to a
problem in the fencing environment.
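A minimal main.cf sketch of a CoordPoint resource with the stricter setting might
look like this; the group and resource names are illustrative.
   group vxfen_mon (
       SystemList = { sym1 = 0, sym2 = 1 }
       Parallel = 1
       )
       CoordPoint coordpoint (
           FaultTolerance = 0
           )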
Note: Caching is configurable and can be disabled if 5.1 behavior is desired.
Labs and solutions for this lesson are located on the following pages.
Lab 15: Configuring CPS-based I/O fencing, page A-167.
Lab 15: Configuring CPS-based I/O fencing, page B-387.