High Availability
for IBM i
3 26/09/2014
Note: Quick-EDD/HA is meant to be used only by technicians. To keep it user
friendly, all product functions can be reached from any entry point.
The product is built around a unique screen displaying the overview of the
replication properties and allowing the necessary access to the setup and
supervision.
Quick-EDD/HA is software designed to provide security and continuity for the IBM i
environment. The product lets you replicate, easily and safely, all events from a source
production system to a remote target system over a communication link.
Based on the continuous monitoring of the journals, Quick-EDD/HA replicates, on a remote
site, all the modifications made on the objects and data of the users’ libraries and the
configuration and security components; IFS or Spools are also managed by Quick-EDD/HA.
Event detection
The first step monitors all journals that match the environment parameters for any
kind of transaction (creation, modification, deletion) on any defined object
(e.g. data, PGM, user profiles, configuration, spools, IFS …).
Communication
The second step is replication in real time of any detected modification on the source system.
The speed of the exchange is determined by the type of object and event: I/O transaction,
object transfer or data synchronization.
Processing
On the target system, the event detected on the source system is processed immediately
in the same way. Depending on the type of event, Quick-EDD/HA has to manage one or
several data updates, perform an object restoration, or run a command in order to replicate
the modification (e.g. change of a password, system value…).
Acknowledgement
To ensure a complete process and a perfect completion of transactions, every transaction
will be acknowledged in real time. The acknowledgement validates the whole process and
allows you to retrieve information for history and statistics.
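The four steps above can be pictured as a simple loop. The following Python sketch is only an illustration of the sequence (detection, communication, processing, acknowledgement); the class and function names are hypothetical and do not correspond to any Quick-EDD/HA object or API.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    sequence: int   # position of the entry in the source journal (assumed identifier)
    kind: str       # illustrative categories: "io", "object", "command"
    payload: str


@dataclass
class Target:
    applied: list = field(default_factory=list)

    def process(self, event: Event) -> int:
        """Apply the event, then return its sequence number as the acknowledgement."""
        self.applied.append(event)
        return event.sequence


def replicate(journal: list, target: Target) -> list:
    """Run detection, communication, processing and acknowledgement in order."""
    history = []
    for event in journal:                # 1. event detection
        ack = target.process(event)      # 2. communication and 3. processing
        if ack != event.sequence:
            raise RuntimeError(f"entry {event.sequence} was not acknowledged")
        history.append(ack)              # 4. acknowledgement feeds the history
    return history
```

In this sketch an entry is recorded in the history only once the target has acknowledged it, which mirrors the role the acknowledgement plays in validating the whole process.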
1.1. General Diagram
[Figure: general diagram. On the source system, the event detection step reads the
DB journals and the audit journal for data, objects (OBJ), IFS, system and spool
events, under the control of the settings and the supervision. Events are sent over
TCP/IP to the target system, where the event integration step applies them (data I/O,
RSTxxx, command …) to the destination libraries, IFS and spools. An acknowledgement
is returned from target to source.]
1.3. Prerequisites
Quick-EDD/HA requires neither a specific IBM i feature nor a specific model, but it
needs TCP/IP communication features to connect the systems.
Quick-EDD/HA uses the communication line configured on your system; it does not
automatically create, and does not manage, this configuration.
Note: Please check with your reseller that Quick-EDD/HA is compatible with the
OS/400 release of both systems.
Note: If the source system runs V5R3 or V5R4 and the target system runs
V6Rx or V7Rx, the V5R3 or V5R4 system must have the appropriate
PTFs to accept V6Rx or V7Rx objects.
Each OS release has specific PTFs for that purpose.
If those PTFs are installed, the switch is not an issue.
2. DIFFERENT MODES OF QUICK-EDD/HA
Quick-EDD/HA can be used in 5 different ways:
- « HIGH AVAILABILITY » mode allows 100% of the functionality
- « DATA » mode limits the replication to a subset of object types
- « LOCAL » mode replicates inside the same system (same partition)
- « VERIFY MODE » only compares the two systems and reports back on differences
- « SRS » mode keeps replication active and continues sending the journal entries to
  the target system, but pauses the apply process of those entries, for example
  to perform the daily backup on the target system instead of the source system
The “DATA” mode must be activated at “General parameters” level, in order to correspond
with the license key:
DATA replication mode 1
Note: The license key will automatically check for selection of objects
Note: Replication in “LOCAL” mode can be done in both HA and DATA modes
Note: If you don’t use EDHSRSOPT, the SRS process is limited by default to storing
261k journal entries on the target system.
For “DATA” mode, the settings lines for IFS, SPOOLS and SYSVAL disappear.
For “DATA” or “LOCAL” mode, dashes can be displayed in RED if some settings don’t
match the mode.
« SRS » mode, when *NOIO is activated; that is to say when the apply process of journal
entries on the target system is paused:
2.7. Counters
---- Synchro ----
Wait => Number of objects waiting for synchronization
Ac => Number of objects currently being synchronized
ok => Number of synchronized objects
nok => Number of objects in synchronization error
---- Wait ----
hld => Journal entries that are held by Quick-EDD/HA.
When, for example, a synchronization is running, all the new journal entries of the object are
held, in order to be applied at the end of the synchronization (last step of the
synchronization).
rep => Journal entries sent to the target and not acknowledged yet
---- Nbr obj ----
nbr obj => Number of elements inside the sub menu
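The "hld" counter can be illustrated with a small queue. The Python sketch below is an assumption-laden model, not the product's implementation: while an object is being synchronized, new entries are held, and they are applied in arrival order as the last step of the synchronization. The class and method names are hypothetical.

```python
from collections import deque


class ObjectReplica:
    """Toy model of one replicated object on the target side."""

    def __init__(self):
        self.synchronizing = False
        self.held = deque()   # entries that would be counted under "hld"
        self.applied = []     # entries already applied to the object

    def start_sync(self):
        self.synchronizing = True

    def receive(self, entry):
        # During a synchronization, new entries are held instead of applied.
        if self.synchronizing:
            self.held.append(entry)
        else:
            self.applied.append(entry)

    def end_sync(self):
        # Last step of the synchronization: apply held entries in arrival order.
        while self.held:
            self.applied.append(self.held.popleft())
        self.synchronizing = False
```

Here `len(replica.held)` plays the role of the "hld" value for a single object.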
The audit journal can receive a large number of entries. Therefore, its management
has to be defined with care.
When the replication is started for the first time, auditing is activated on every object;
this may impact your backup if you use the SAVCHGOBJ command (every object
may be flagged as modified, and so be saved).
3.2. Journaling
Quick-EDD/HA will dynamically start journaling on files and other special objects (DTAQs,
IFS …) depending on the defined settings (see EDD_HA_Settings.pdf).
Because of the large number of objects, it may be advisable to split the different applications
and environments across different journals. This avoids reaching system limits and makes
the system easier to maintain and manage.
It is also possible to process files without journaling in specific cases:
- Files which are never updated and are sometimes kept for historical purposes, such as EDI
  files
- Archives
For those special cases, use the specific management functions that let you handle
files like non-database objects (see Non journaled files in EDD_HA_Settings.pdf).
3.4. Receivers management
The HA process is fully based on journaling. Quick-EDD/HA allows you to manage journals
and journal receivers, their detachment, and their deletion, depending upon the settings you
choose.
You must confirm that the journals and journal receivers are not used by another
system function or by another software application. If existing journals are
used by an application, you do not want Quick-EDD/HA to delete the receivers
before that application has completed its processing.
Depending on your needs and configuration (OS/400 release level), you will
have to set up receiver management carefully; see EDD_HA_Journaling.pdf.
Minimize active jobs in the system. Quick-EDD/HA dynamically controls the number of
jobs depending on activity.
o Only one master job follows all journals and performs the real-time replication
o Server jobs, dynamically managed, are in charge of synchronization. The minimum
and maximum numbers of server jobs are defined in the settings.
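As an illustration of a job count bounded by a minimum and a maximum, the following Python sketch clamps a workload-based estimate between the two settings. The sizing rule (one job per ten pending objects) and the function name are assumptions made for the example; the document does not describe how Quick-EDD/HA actually sizes its server jobs.

```python
def server_jobs_needed(pending_objects: int, minimum: int, maximum: int,
                       objects_per_job: int = 10) -> int:
    """Return a job count that follows the workload but stays within the settings."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    # Ceiling division: enough jobs to cover every pending object (assumed rule).
    wanted = -(-pending_objects // objects_per_job)
    return max(minimum, min(maximum, wanted))
```

With a minimum of 1 and a maximum of 5, an idle system keeps one server job, 25 pending objects ask for three, and a very large backlog is capped at five.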
[Figure: general organization of Quick-EDD/HA. The applications write to the DB
journals, an IFS journal, a DTAQs journal and other Vx.Rx journals. Together with the
audit journal QAUDJRN, these journals are read by the Quick-EDD/HA supervisor
according to the settings. The processing step then replicates data (I/Os) and
objects (SAV/RST).]
The previous diagram shows the general organization of Quick-EDD/HA:
The settings part depends on your applications and your journals (currently used or
to be created during the implementation of Quick-EDD/HA).
Groups are defined by the customer or the assigned software engineer, depending
on replication rules defined previously for the system.
They are used to make parameters easy to follow, understand and maintain.
The journal list is automatically built by Quick-EDD/HA directly from file descriptions,
and depending on the groups settings linked with automatic journaling activation.
The supervisor reads all the journals AT THE SAME TIME, including the audit journal
QAUDJRN.
This functionality allows the processing of all required system events, in the correct
sequence, on the target system.
The supervisor dispatches the different transactions, depending on the type of
function to process.
DATABASE replication – processing of input/output transactions on records
(CRT/UPD/DLT) and some commands on members (INZPFM, CLRPFM …)
Commands SAV/RST for new objects or important modifications that need to
recreate the object(s) on the target system
Commands that will reproduce a modification (change password, object modification
…) on the target system
Manage system values
Send/delete spool files
Send information from jobs running on the source system
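Reading several journals at once while keeping events in order, then routing each event by type, can be sketched as a merge followed by a dispatch. In the Python sketch below, ordering by a `timestamp` field is an assumption (the document only states that events are processed in the correct sequence), and the category names and handler labels are hypothetical.

```python
import heapq


def merge_journals(*journals):
    """Merge journals, each already ordered by timestamp, into one ordered stream."""
    return heapq.merge(*journals, key=lambda entry: entry["timestamp"])


# Illustrative routing table: one label per kind of processing listed above.
HANDLERS = {
    "record": "database I/O",
    "object": "SAV/RST",
    "command": "command replay",
    "sysval": "system value",
    "spool": "spool file",
    "job": "job information",
}


def dispatch(entry):
    """Return the label of the processing that would handle this entry."""
    return HANDLERS[entry["category"]]
```

For example, merging a database journal holding entries at times 1 and 4 with an audit journal holding entries at times 2 and 3 yields the entries in the order 1, 2, 3, 4, each routed to its own kind of processing.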
4.2. Glossary
The documentation of Quick-EDD/HA respects the terms usually used in IBM i
environments.
The list below details the vocabulary specific to Quick-EDD/HA.
Environment
The environment, in Quick-EDD/HA context, corresponds to a setup table, gathering all the
settings to replicate to an IBM i server or partition.
Environment objects have been created to make it possible to differentiate between
different applications, and/or different distribution areas (application environments, multiple
remote sites …). This distinction makes the management easier and more logical, and
expands the maintenance capability.
Environment configuration is stored in an IBM i object, of type *USRIDX, which will be
created in the library PMEDHUSR. The object name is xx_SND on the source system (xx
represents the environment name), and xx_RCV on the target system.
Note: depending on the amount of configuration data, the *USRIDX may be
supplemented by *USRSPC objects, named xx_SNDnnn on the source and xx_RCVnnn on
the target; nnn ranges from 001 to 255. In that case, all the xx_S* (or xx_R*) objects
together make up the environment.
The *USRIDX objects xx_SND and xx_RCV are exactly identical on the source and target
systems. So are the *USRSPC objects xx_SNDnnn and xx_RCVnnn.
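The naming rule for environment objects can be made concrete with a short Python sketch. It only builds the names described above (xx_SND, xx_RCV and the nnn suffixes from 001 to 255); the function name and its parameters are hypothetical, and the environment code E1 used in the usage note is just an example.

```python
def environment_objects(env: str, side: str, extra_spaces: int = 0) -> list:
    """List the object names that make up an environment on one side.

    side is "SND" (source) or "RCV" (target); extra_spaces is the number of
    additional *USRSPC objects, between 0 and 255.
    """
    if side not in ("SND", "RCV"):
        raise ValueError("side must be 'SND' or 'RCV'")
    if not 0 <= extra_spaces <= 255:
        raise ValueError("nnn ranges from 001 to 255")
    base = f"{env}_{side}"
    # The first name is the *USRIDX; the numbered names are the *USRSPC objects.
    return [base] + [f"{base}{n:03d}" for n in range(1, extra_spaces + 1)]
```

Calling `environment_objects("E1", "SND", 2)` gives `E1_SND`, `E1_SND001` and `E1_SND002`.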
All other xx* objects of PMEDHUSR are linked to the activity:
- Pointers to the current positions
- Storage of journal entries to manage different functions:
  Entries in progress while a commit cycle is open
  New journal entries stored during a synchronization
  SRS: Smart Remote Staging
Each time the environment is started, all RCV objects are recreated during the "Send
parameters" step. So, accidentally deleting the objects on the target is not an issue:
any missing object is recreated when the replication is started, including the object
xx_MSG, which contains the replication messages.
Warning: if there are stored journal entries on the target - objects xx_RCO* - and if
you delete those objects, it won’t be possible to restart the environment by
processing the residual journal entries; see EDD_HA_Supervision for more details
Site
For Quick-EDD/HA a site is a target system.
Usually, a site corresponds to a network node name or a TCP/IP address. Quick-EDD/HA
uses the site to establish communication between source and target systems.
The group is also very important to organize your replication. Logically organized groups
make your installation easier to maintain and modify in the future.
4.3. Access to the menus
The standard command to enter Quick-EDD/HA is "PMEDH". This command, without
parameters, displays the main menu. Note: The menu may differ from this document
depending upon the release level of the software.
QSLFRA2 Q U I C K - E D H 13/03/10
GODEC PMsoft Inside 15:44:43
_____________________________________________________________________________
High Availability www.pmsoft.net V100205
www.quick-software-line.com
_____________________________________________________________________________
Select one of the following options:
7. Commands
8. Messages
9. Object
Option, or command
===>_________________________________________________________________________
F3=Exit F4=Prompt F9=Recall F12=Cancel F13=PMsoft system
Other functions
7. Commands   List of Quick-EDD/HA commands
8. Messages   Messages for the replication
9. Object     Management of the objects linked to replication
4.5. Global menu « EDH »
A global menu « EDH » allows you to manage all options from the two menus « PMEDH »
and «EDHTools » (tools of Quick-EDD/HA).
This is now the main menu of the product.
Note: The menu may differ from this document depending upon the release level of the
software.
Check that you have PMEDH and PMEDHTOOLS in your library list; then run command
EDH. The following menu is displayed:
Parameters Activity
1. Work with environments 6. Manage replications
2. Manage specific objects
3. Supervision parameters 7. Start supervision process
4. Monitoring options 8. Start msg monitoring
5. Monitoring filtering
Switch Others
11. Work with switch scenarios 21. Display EDH messages
12. Run switch scenario 22. Work with objects (PMEDHUSR)
23. Tools menu - EDHTools
Option, or command
===>______________________________________________________________________
__________________________________________________________________________
F3=Exit F4=Prompt F9=Repeat F12=Cancel
This menu presents the available options in the order in which they are normally used:
definition of settings (options 1 - 5), execution of environments (options 6 - 8),
and implementation of the additional tools (options 21 - 23).
ACTIVITY
6 – Manage replications
7 – Start supervision process
8 – Start monitoring
Note: Libraries PMEDH and PMEDHTOOLS must be in the library list of your job
before running EDH command.
Release numbers
Two release numbers are displayed on the main menu:
In the title, the version corresponds to the tools library (PMEDHTOOLS
library).
On the right, « Quick-EDD/HA: Vnnnnnn » is the release of the product
itself (PMEDH library).
Function keys:
F3=Exit Exit from job
F4=Prompt Prompt of command
F9=Recall Retrieve previous command
F12=Cancel Return to previous screen
The main menu checks the license key. If the key has expired or will expire
within the next 2 days, a red message is displayed at the top of the screen:
QSL/PMsoft software 13 not available since the 27/04/10 07:47:22
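The license rule above (expired, or expiring within 2 days) amounts to a single date comparison. The Python sketch below shows that comparison only; the function name is hypothetical, and how the product actually stores and reads the key is not covered by this document.

```python
from datetime import datetime, timedelta


def license_warning_needed(expiry: datetime, now: datetime,
                           warning_days: int = 2) -> bool:
    """True when the key has expired or will expire within warning_days."""
    # An already expired key gives a negative difference, which also qualifies.
    return expiry - now <= timedelta(days=warning_days)
```

A key expiring on 27/04/10 at 07:47:22 would trigger the warning on 26/04/10, but not a month earlier.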
5. PMEDH* - LIST OF COMMANDS IN
ALPHABETICAL ORDER
Some Quick-EDD/HA functions can be used externally, in users' programs, through
commands in the software library. Those commands are detailed in the appropriate
chapters of the documentation.
Solution:
ON THE TARGET SYSTEM
Add messages CPA5305 and CPA3138 to the system reply list
ADDRPYLE SEQNBR(1000) MSGID(CPA5305) RPY('9999')
ADDRPYLE SEQNBR(1001) MSGID(CPA3138) RPY('I')
(Change sequence number if needed)
Note: The system reply list is unique; however, only jobs whose job description
(JOBD) specifies INQMSGRPY(*SYSRPYL) are affected by the
modification.
7. AUTOMATIC MANAGEMENT WITH
JOURNAL ENTRIES
Some Quick-EDD/HA functions can be managed with journal entries.
Hold an object             *HLD
Release an object          *RLS
Synchronize an object      *SYN
Job follow-up              *JOB
Stop the replication       *END
Local command              *CMD
Remote command             *CMR
General control (I.O.A.)   *CTG / *VFY // *VFN (Previously ignored objects)
SRS                        *IOE
SRS                        *IOS
SRS                        *JRE
These functions can be automated by sending a journal entry to the audit journal as
follows:
SNDJRNE JRN(QAUDJRN) TYPE('EH')
ENTDTA('*ACT**&OBJNAM&LOBJNAM&OBJTYPFORCE(*YES)')
Type Entry « EH »
ACTION Position 1 – Length 4 value (*HLD/*RLS/*SYN…)
Env. Code Position 5 – Length 2 (**=All)
OBJECT Position 7 – Length 10
LIB. Position 17 – Length 10
TYPE Position 27 – Length 8
FORCE Position 35 – Length 11 value ‘FORCE(*YES)’
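Because the entry data is positional, each field must be padded to its full length. The following Python sketch builds the ENTDTA string from the layout above and wraps it in the SNDJRNE command shown earlier. It is only a helper for illustration: the function names are hypothetical, and the object and library names in the usage note (CUSTOMER, APPLIB) are invented.

```python
def build_entdta(action: str, env: str = "**", obj: str = "", lib: str = "",
                 objtype: str = "", force: bool = False) -> str:
    """Build the positional entry data: action(4) env(2) object(10) lib(10) type(8)."""
    if len(action) != 4 or not action.startswith("*"):
        raise ValueError("action must be a 4-character code such as *HLD")
    if len(env) != 2:
        raise ValueError("environment code must be 2 characters (** = all)")
    fields = ((obj, 10), (lib, 10), (objtype, 8))
    for value, width in fields:
        if len(value) > width:
            raise ValueError(f"{value!r} is longer than {width} characters")
    # Pad each field with blanks so the next one starts at its documented position.
    data = action + env + "".join(value.ljust(width) for value, width in fields)
    if force:
        data += "FORCE(*YES)"   # positions 35 to 45
    return data


def sndjrne_command(entdta: str) -> str:
    """Wrap the entry data in the SNDJRNE command used by this chapter."""
    return f"SNDJRNE JRN(QAUDJRN) TYPE('EH') ENTDTA('{entdta}')"
```

For example, `build_entdta("*HLD", "E1", "CUSTOMER", "APPLIB", "*FILE", force=True)` returns a 45-character string in which the library name starts at position 17 and FORCE(*YES) at position 35.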
Performing your backup on the target system is now really simple thanks to SRS.
You just have to write the CLP that will launch the backup; Quick-EDD/HA will handle the
rest of the process.
Sample 1
Your standard CLP that does backups on the source system is written as follows:
PGM
….
SAVLIB LIB(XXXXX) DEV ...
SAVDLO ...
SAVCFG ...
….
ENDPGM
You have already tested this program. It must be present on the target system.
All you have to do now is submit it using the remote command function of
Quick-EDD/HA.
To do so, you only have to send a journal entry to the audit journal using SNDJRNE.
This journal entry, which uses the special code *CMR (remote command) and contains the
CALL command, is sent to the target system, where the program is executed by an
Xnn server job of the environment.
Using such a scenario you can very easily make a backup from your target system
with only a standard CLP.
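If there are several environments, for example E1, E2 and E3, the target
CLP is called through environment E1 by the command
SNDJRNE JRN(QAUDJRN) TYPE('EH') ENTDTA('*CMRE1CALL YYYYYY')
and must itself switch the other environments (E2 and E3) to *NOIO before the
backup and back to *IO afterwards: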
PGM
SBMJOB (PMEDHMOD ENV(E2) ENT(*RCV) MOD(*NOIO)…)
SBMJOB (PMEDHMOD ENV(E3) ENT(*RCV) MOD(*NOIO)…)
….
SAVLIB LIB(XXXXX) DEV ...
SAVDLO ...
SBMJOB (PMEDHMOD ENV(E2) ENT(*RCV) MOD(*IO)…)
SBMJOB (PMEDHMOD ENV(E3) ENT(*RCV) MOD(*IO)…)
ENDPGM
Sample 2
In some companies, backups must be validated, and if a backup fails
IT IS MANDATORY to run the backup again AT THE SAME POSITION.
This is not possible with sample 1: when the CLP finishes, even if it failed,
the environment is automatically turned back to *IO.
If a customer wants to validate backups as described, the CLP will have to be more
sophisticated to manage separately the SRS and the backup itself:
The backup can be submitted either by the source or the target system
The main CLP program will use the PMEDHMOD command to turn the environment to
*NOIO
The backup will be executed – May be a CALL or independent SBMJOB
Option 1 – The backup CLP is well monitored and if it finishes normally the
PMEDHMOD command is used to turn back the environment to *IO
Option 2 – If the CLP fails for any reason, the environment stays in *NOIO and
the operator can take the decision to run a new backup or to manually turn the
environment back to *IO.
In this case, the most important point is that PMEDHMOD is not managed automatically
by the *CMR function but runs as an independent function. Reactivating the
*IO process requires a program or a manual operation by the operator.
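The control flow of sample 2 can be summarized in a few lines. The Python sketch below only models the decision logic: `set_mode` and `run_backup` are hypothetical callables standing in for the PMEDHMOD command and the backup CLP, and nothing here calls Quick-EDD/HA.

```python
def run_validated_backup(set_mode, run_backup) -> str:
    """Pause the apply process, run the backup, and resume only on success.

    set_mode(mode) stands in for PMEDHMOD with *NOIO or *IO;
    run_backup() returns True when the backup is validated.
    Returns the mode the environment is left in.
    """
    set_mode("*NOIO")
    try:
        validated = run_backup()
    except Exception:
        # Any failure leaves the environment paused for the operator to decide.
        validated = False
    if validated:
        set_mode("*IO")        # Option 1: normal end, resume the apply process
        return "*IO"
    return "*NOIO"             # Option 2: stay paused at the same position
```

When the function returns `*NOIO`, the operator can either run a new backup at the same position or turn the environment back to *IO manually.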
If the batch jobs failed and you don’t want to apply their result on the target system, you
only have to stop the environment and delete the objects PMEDHUSR/XX_RCO*. Then,
you can skip the journal entries related to the batch jobs, and restart replication.
Note: A third type of journal entry, *JRE, lets you stop the reading of the journals
on the source system. In that case, the PMEDHMOD command, used with option
*JRN, lets you restart reading the journals.
Attention: journal entries *JRE, *IOE and *IOS are processed by the SND job.
Therefore, if you send an *IOE entry while there is latency, the *IOS entry might be sent
too soon: the environment will remain in *NOIO until someone notices it.