GT.M - Multi Site Replication Support on UNIX

Technical Bulletin: GT.M - Multi Site Replication Support on UNIX

June 14, 2006

Revision History
Revision 1.0, 14 June 2006



                        GT.M Group

                        Fidelity National Information Services, Inc.

                        2 West Liberty Boulevard, Suite 300

                        Malvern, PA  19355, 

                        United States of America

                    



                        GT.M Support: +1 (610) 578-4226 

                        Switchboard: +1 (610) 296-8877 

                        Fax: +1 (484) 595-5101

                        http://www.fis-gtm.com

                        gtmsupport@fnf.com

                    

Table of Contents

Overview
Summary
User Interface
Replication Instance File
MUPIP REPLIC –INSTANCE_CREATE –NAME
-INSTSEC[ONDARY] qualifier
-ROOT[PRIMARY] and –PROPAGATE[PRIMARY] Qualifiers
Obsolete –UPDATE Qualifier
MUPIP REPLIC –SOURCE –START
MUPIP REPLIC –RECEIVER –START
MUPIP REPLIC –SOURCE –ACTIVATE
MUPIP REPLIC –SOURCE –DEACTIVATE
Starting up an instance
Shutting down an instance
Primary and Secondary transitions
Refreshing secondary from backup of primary
Lost transaction processing
MUPIP REPLIC –SOURCE –NEEDREST[ART]
MUPIP REPLIC –SOURCE –LOSTTN[COMPLETE]
MUPIP REPLIC –EDIT[INSTANCE]
MUPIP REPLIC -SOURCE –JNLPOOL
DSE DUMP -FILE[HEADER]
DSE CHANGE -FILE[HEADER]
MUPIP JOURNAL -ROLLBACK
MUPIP BACKUP
Rolling Upgrade
From a GT.M Dual-site version to a GT.M Multi-site version
Error Messages
Typographical Conventions

GT.M V5.1-000 provides the capability to deploy an application in a logical multi-site configuration with multiple secondary sites to a single primary site.

Prior versions of GT.M featured the capability to deploy an application in a logical dual-site configuration with only one secondary site.

A configuration with a primary and a secondary in proximity for operational efficiency, however, does not protect against a disruption that affects both systems. Adding a separate and distant "disaster recovery" (DR) third system provides both the operational convenience of proximal systems for routine operations and a distant system for continuity of business in the face of catastrophic events. Prior GT.M versions did not make it possible to set up multiple secondary and/or tertiary systems. GT.M V5.1-000 enables such multi-site configurations. To allow migration to the new version with continuity of business, GT.M supports a dual-site configuration in which one site runs a GT.M version that has multi-site replication and the other site runs a GT.M version that does not.

[Note]

The terms site and instance are used interchangeably throughout this technical bulletin to refer to a primary or secondary system that participates in replication. Please also note that GT.M imposes no restrictions on the number of instances that can reside on a given machine.

[Note]

A configuration that uses GT.M replication between one primary and one secondary is henceforth termed a dual-site configuration while a configuration that uses GT.M replication between one primary and more than one secondary and/or tertiaries is henceforth termed a multi-site configuration. Likewise, replication between one primary and one secondary is termed dual-site replication. Replication between one primary and more than one secondary and/or tertiaries is termed multi-site replication.

GT.M V5.1-000 supports replication from one primary site to multiple secondary sites. As in prior versions of GT.M, at any given instant, only one primary instance performs updates. In GT.M V5.1-000, this primary can concurrently replicate to as many as sixteen (16) additional instances. Any of the secondary sites can become the new primary, in the event of an unplanned or planned outage of the primary.

After recovering from the outage, the original primary becomes a secondary, potentially generating a lost transaction file to be sent to the new primary for (re)processing.

The multi-site configuration capability permits a secondary instance to pass the transactions to a tertiary instance. The flow of transaction data is from the primary to the secondary to the tertiary. Herein, the secondary acts as a primary for the tertiary. Thus, if each of the 16 secondary instances were to feed 16 tertiary instances, there could be 273 instances (1 primary + 16 secondaries + 256 tertiaries). If the tertiary instances fed quaternary instances, there could be 4,369 instances (1 primary + 16 secondaries + 256 tertiaries + 4096 quaternaries). And so on.
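
The counts above are a geometric sum. As a minimal sketch, assuming every instance at every level feeds the maximum of 16 downstream instances (the function name and its parameters are illustrative and not part of GT.M):

```python
def max_instances(levels: int, fanout: int = 16) -> int:
    """Total instances in a fully populated replication tree.

    levels counts the tiers below the root primary:
    1 = secondaries only, 2 = secondaries and tertiaries, and so on.
    """
    # One root primary (fanout**0) plus fanout**k instances at tier k.
    return sum(fanout ** k for k in range(levels + 1))


for levels in (1, 2, 3):
    print(levels, max_instances(levels))
```

With two tiers this gives the 273 instances quoted above, and with three tiers the 4,369.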

Any arbitrary reconfiguration of the instances would be feasible, as any instance in the tree of instances below the primary could potentially become a new primary, if the current primary comes down.

[Note]

A tree structure is required for replication and cycles are not supported.

[Caution]

Just because GT.M permits arbitrary reconfiguration of the instances, it does not mean that an application or a specific deployment of the application should permit such an arbitrary reconfiguration. Each application deployment should have a specific configuration and a strategy for dealing with unplanned and planned outages.

To differentiate the real primary from the secondary that is also acting as a primary (for the tertiary), GT.M V5.1-000 introduces the notion of a root primary versus a propagating primary.

The instance on which business logic is executed, and on which the resulting database updates are computed and committed, is termed the root primary. The secondary that acts as a primary (to a tertiary) is termed a propagating primary. This can be further extended to allow the tertiary to replicate to a quaternary and so on. In such a case, the tertiary also acts as a propagating primary. There can be any number of propagating primaries but only ONE root primary in GT.M V5.1-000.

GT.M process updates to replicated regions are disabled on all instances except the root primary. (Mupip reorg, and database repair with DSE in the unlikely event of structural database damage, are not considered logical updates and are permitted on the secondary.) To identify whether the current instance is a root primary or a propagating primary, the source server startup command now has an optional qualifier, -rootprimary or -propagateprimary.

Central to multi-site replication support is the notion of an instance name. Each replication instance is uniquely identified by a name that can be from 1 to 15 alphanumeric characters. This name is stored in the replication instance file of that instance. The instance name uniquely identifies every site in the multi-site configuration, as each site has only one replication instance file.

[Caution]

The instance name should be unique and two instances should not have the same name.

[Note]

The instance name is not changeable once the instance has served the role of either a primary (root or propagating) or a secondary since the replication instance files of all the corresponding secondaries and/or primaries connected to the instance maintain a record of the instance name.

Since there can be multiple source servers running (one per active secondary) on a primary instance, source server commands need to specify secondary instance names to identify specific source servers.

As indicated previously, a secondary running on GT.M V5.1-000 is supported with a primary running a prior version of GT.M and vice versa, which will occur in a rolling upgrade. Multiple secondary instances are allowed only after both the primary and secondary have been upgraded to GT.M V5.1-000. Likewise, tertiaries are allowed from a secondary only after both the primary and secondary have been upgraded to GT.M V5.1-000.

[Note]

Additional features of multi-site replication are available only when both the sites in an existing dual-site configuration are upgraded to GT.M V5.1-000.

User Interface

The changes to the user interface in V5.1-000 apply in both multi-site and dual-site configurations (e.g., if an instance name is required by a command, it is required even when operating in a dual-site configuration).

Replication Instance File

In order to use replication in dual-site or multi-site mode on UNIX, new replication instance files need to be created. A REPLINSTFMT error occurs if a replication instance file from a previous version of GT.M is used. A FILENOTFND error occurs if the instance file does not exist and replication is attempted.

The name given at the time of creating the instance file uniquely identifies the instance and is stored in the replication instance file.

The instance file serves as a repository of the history of the journal sequence numbers that are generated locally or received from other instances. The history is maintained as a set of records. Every record identifies a range of journal sequence numbers and the name of the root primary instance that generated those journal records. The first history record starts with the current journal sequence number of the instance. When a root primary and secondary communicate, the primary instance name is recorded in the replication instance file history of both the primary and the secondary as the instance that generated the transmitted journal sequence numbers. When a tertiary connects to a secondary, it is still the root primary instance name (not that of the secondary which serves as a propagating primary) that gets recorded in the tertiary instance file since that is the instance that actually generated the records. History records are always added to the tail of the instance file, the only exception being records removed from the tail of the instance file when updates are rolled back from the database as part of a mupip journal –rollback.

This history is crucial to determining the journal sequence numbers through which both instances are synchronized when a primary and secondary (or secondary and tertiary) attempt to connect. This journal sequence number is determined by going back in the history of both the instance files and finding the earliest shared journal sequence number that was generated by the same root primary instance. The receiver server on the secondary continues with normal replication only if the shared journal sequence number determined above matches the current journal sequence number of the secondary instance. Otherwise, a mupip journal -rollback -fetchresync must be performed on the secondary to roll it back to a common synchronization point from which the primary can transmit updates to allow it to catch up. To avoid a possible out-of-sync situation, it is advisable to perform a mupip journal -rollback -fetchresync prior to starting any source servers on the secondary instance. Doing so is safe even when it is not strictly necessary, so it can be scripted unconditionally.

Processes such as source servers, receiver server, and mupip rollback access this history. A REPLINSTNOHIST error message is generated if they attempt to look up a history record corresponding to a sequence number (for example, as part of a -rollback operation) that is less than the earliest sequence number recorded in the instance file.

A replication instance file maintains the current state of the instance, so it is necessary to take a backup of this file that is consistent with the snapshot of the database files in the backup. Mupip backup allows for the backup of the instance file at the same time that it backs up the database files.

The replication instance file on the primary stores the information pertaining to each secondary for which a source server is started and the journal sequence number that was last transmitted to that secondary. In prior versions of GT.M, this journal sequence number was maintained as Resync Seqno in the database file headers of all replicated regions and a dse dump -fileheader would display this information. With GT.M V5.1-000, this information is only available in the replication instance file, and the mupip replic -source -showbacklog command uses this information to display the backlogs for secondary instances.

The instance file has 16 slots to store information pertaining to a maximum of 16 secondary instances. Initially, all the slots are unused. A source server replicating to a secondary for the first time utilizes an unused slot to store the information related to that secondary and any future source server process replicating to the same secondary will update this information.

If no unused slot is available when a source server starts for a secondary for the first time, the slot for the least recently started secondary is reused (provided the source server for that secondary is not alive), and the information previously stored in that slot is overwritten. Any subsequent mupip replic -source command for the preempted secondary instance generates a REPLINSTSECNONE message.

[Note]

Preemption of slots is not an issue if the primary does not connect to more than 16 different secondaries throughout its lifetime.

If the replication instance file is damaged or deleted, a new instance file must be created, and all secondaries downstream must be recreated from backups.

MUPIP REPLIC –INSTANCE_CREATE –NAME

An instance name must be specified when the replication instance file is being created using the –name qualifier in the command line.

The name identifies the replication instance to other instances, and it is immutable.

[Note]

In this rare instance, a GT.M command from a prior version is not upward compatible in GT.M V5.1-000.

If -name is not specified, mupip uses the environment variable gtm_repl_instname for the name. If that variable is not defined, the command issues a REPLINSTNMUNDEF error message. Using the environment variable allows pre-existing user replication scripts to run without any changes. Explicitly specifying the qualifier in the scripts improves clarity.

The name can be from 1 to 15 characters. Specifying a name longer than 15 characters or an empty string will issue a REPLINSTNMLEN error message.
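
The lookup order and length check described above can be sketched as follows. This is only an illustration in Python, not GT.M code: the function, the exception class and the constant are hypothetical names, and treating an empty environment variable value the same as an empty -name value is an assumption.

```python
import os

MAX_NAME_LEN = 15  # limit stated in this bulletin


class ReplicationError(Exception):
    """Carries the GT.M message mnemonic named in the text."""


def resolve_instance_name(name_qualifier=None, environ=None):
    """Return the instance name using the order: -name, then gtm_repl_instname."""
    env = os.environ if environ is None else environ
    if name_qualifier is not None:
        name = name_qualifier                 # explicit -name qualifier wins
    elif "gtm_repl_instname" in env:
        name = env["gtm_repl_instname"]       # fall back to the environment
    else:
        raise ReplicationError("REPLINSTNMUNDEF")
    # Empty or over-long names are rejected (assumed to apply to both sources).
    if not 1 <= len(name) <= MAX_NAME_LEN:
        raise ReplicationError("REPLINSTNMLEN")
    return name
```

A script that sets gtm_repl_instname once can therefore keep calling the instance creation command without the qualifier, which is the compatibility path the text describes.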

If an instance file already exists, it is renamed with a timestamp suffix, and a new one is created. This behavior is similar to the manner in which GT.M renames existing journal files while creating new journal files.

Creating an instance file requires standalone access. A REPLINSTSTNDALN message is issued if the instance file is being used (i.e. the journal pool for that instance exists), while attempting to (re)create that instance file.

-INSTSEC[ONDARY] qualifier

On a primary instance, a source server process runs for each secondary instance. The –instsecondary qualifier identifies the secondary instance to which a source server replicates the data. In GT.M V5.1-000, the following commands have been modified to include the –instsecondary qualifier. It is mandatory to specify the –instsecondary qualifier while issuing these commands.

	mupip replic -source -start
	mupip replic -source -deactivate
	mupip replic -source -activate
	mupip replic -source -stopsourcefilter
	mupip replic -source -changelog
	mupip replic -source -statslog
	mupip replic -source -needrestart
	

If the –instsecondary qualifier is not specified, mupip uses the environment variable gtm_repl_instsecondary for the name of the secondary instance. If that variable is not defined, a REPLINSTSECUNDF error is issued.

[Note]

In this rare instance, a GT.M command from a prior version is not upward compatible in GT.M V5.1-000.

Specifying a name longer than 15 characters or an empty string issues a REPLINSTSECLEN error message.

The following source server commands now allow the –instsecondary qualifier as an optional qualifier:

	mupip replic -source -checkhealth
	mupip replic -source -showbacklog
	mupip replic -source -shutdown
	

If -instsecondary is not specified with the above commands and the environment variable gtm_repl_instsecondary is not defined, the commands operate on ALL current active and passive source servers.

The secondary instance name needs to be specified explicitly (through the qualifier) or implicitly (through the environment variable) even for starting a passive source server. It is this secondary instance name that the passive source server will later connect to when it is activated.

If the secondary instance is unnamed, for example if the secondary instance is running an older version of GT.M, a dummy name suffices.

The above source server commands, except the -start command, look for an existing slot corresponding to the specified secondary instance in the replication instance file. If none is found, a REPLINSTSECNONE error is issued. In case of -start, GT.M looks for a previously used slot for that secondary instance. Failing that, it looks for an empty slot, and if none is available, it takes over the slot from the least recently used secondary instance. If there are already 16 source servers, there are no open slots, and a SRCSRVTOOMANY error is issued. If the name does not match the name sent by the receiver server on connection, a REPLINSTSECMTCH error is issued.
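
The slot search performed by the start command can be sketched as below. This is a simplified model and not the GT.M implementation: the Slot fields, the logical clock used for "least recently used", and the choice of the lowest-numbered empty slot are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_SLOTS = 16  # slots in the replication instance file


@dataclass
class Slot:
    secondary: Optional[str] = None  # None means the slot has never been used
    last_started: int = 0            # hypothetical logical clock of last start
    server_alive: bool = False       # is a source server running for it?


def pick_slot(slots: List[Slot], secondary: str) -> int:
    """Return the index of the slot a start for `secondary` would use."""
    # 1. Reuse the slot previously assigned to this secondary.
    for i, slot in enumerate(slots):
        if slot.secondary == secondary:
            return i
    # 2. Otherwise take an unused slot (here: the lowest-numbered one).
    for i, slot in enumerate(slots):
        if slot.secondary is None:
            return i
    # 3. Otherwise preempt the least recently started secondary whose
    #    source server is not alive; if all are alive, there is no room.
    idle = [i for i, slot in enumerate(slots) if not slot.server_alive]
    if not idle:
        raise RuntimeError("SRCSRVTOOMANY")
    return min(idle, key=lambda i: slots[i].last_started)
```

Once a slot has been preempted in step 3, the other source server commands no longer find a slot for the displaced secondary, which corresponds to the REPLINSTSECNONE case.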

Defining the environment variable appropriately allows pre-existing user replication scripts to run without any changes. Explicitly specifying the qualifier in the scripts improves clarity.

-ROOT[PRIMARY] and –PROPAGATE[PRIMARY] Qualifiers

The following source server commands now support –rootprimary and –propagateprimary qualifiers:

	mupip replic -source -start -secondary=...
	mupip replic -source -activate
	mupip replic -source -start -passive
	mupip replic -source -deactivate
 	

These are optional qualifiers. If not specified, a default value of –propagateprimary is used for the passive source server startup command as well as the deactivate command and a default value of –rootprimary is used for the other two commands. These default values enable users who are planning to limit themselves to one primary and multiple secondaries (without any tertiaries) to perform minimal changes, if any, to their replication scripts. However, it is recommended that these qualifiers be specified explicitly in the scripts, for clarity.

The -rootprimary and -propagateprimary qualifiers are mutually exclusive.

Specifying –rootprimary either explicitly or implicitly enables GT.M updates on the specified replication instance.

Specifying –propagateprimary explicitly or implicitly disables GT.M updates on the replication instance. An attempt by a GT.M process to update a replicated region issues a SCNDDBNOUPD error.
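
The defaults and their effect on updates can be summarized in a small lookup. The following is a hedged sketch in Python; the command labels, function names and table are illustrative only and do not correspond to any GT.M interface.

```python
ROOT, PROPAGATE = "rootprimary", "propagateprimary"

# Defaults stated in the text when neither qualifier is given.
DEFAULT_ROLE = {
    "start": ROOT,                 # active source server startup
    "activate": ROOT,
    "start -passive": PROPAGATE,   # passive source server startup
    "deactivate": PROPAGATE,
}


def effective_role(command, rootprimary=False, propagateprimary=False):
    """Return the role a source server command ends up using."""
    if rootprimary and propagateprimary:
        raise ValueError("qualifiers are mutually exclusive")
    if rootprimary:
        return ROOT
    if propagateprimary:
        return PROPAGATE
    return DEFAULT_ROLE[command]


def updates_enabled(role):
    """GT.M updates are enabled only on a root primary."""
    return role == ROOT
```

For example, a passive startup with no qualifier resolves to propagateprimary, so updates stay disabled until the server is activated as a root primary.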

Obsolete –UPDATE Qualifier

GT.M V5.1-000 does not support the –update qualifier in the mupip replic –source command. In prior GT.M versions, mupip replic –source –deactivate disabled updates while –activate enabled updates. In GT.M V5.1-000, –rootprimary and –propagateprimary control whether updates are enabled or disabled.

[Note]

GT.M updates to replicated regions are permitted only on the root primary instance and disabled on ALL other instances (including secondary, tertiary, etc.).

[Note]

In this rare instance, a GT.M command from a prior version is not upward compatible in GT.M V5.1-000.

MUPIP REPLIC –SOURCE –START

If –rootprimary is specified either explicitly or implicitly and the journal pool already exists and has updates disabled (i.e. journal pool was created by a source server startup command that explicitly or implicitly specified –propagateprimary), a PRIMARYNOTROOT error is issued. To transition a propagating primary instance to a root primary instance without bringing down the journal pool, issue a mupip replic -source -activate -rootprimary for an already running passive source server (start one with -propagateprimary if none is running).

If -propagateprimary is either explicitly or implicitly specified, and the journal pool already exists and has updates enabled (i.e. journal pool was created by a source server startup command that explicitly or implicitly specified -rootprimary), a PRIMARYISROOT error is issued. It is not possible to transition a root primary instance to a propagating primary instance without bringing down the journal pool.

If the journal pool already exists when source server startup is issued, prior versions of GT.M would issue a "Source Server already exists" error message. In GT.M V5.1-000, multiple source servers can be started from the same replication instance as long as each of them specifies a different name for the -instsecondary qualifier. If a source server is started with a secondary instance name that matches the primary instance name, a REPLINSTNMSAME error message is issued. If a source server is started with a secondary instance name that has a corresponding source server already up and running, a SRCSRVEXISTS error message is issued.

On a root primary or propagating primary instance, multiple source servers cannot be started as long as there is one source server actively connected to a dual-site secondary (i.e. a secondary running on a version of GT.M that does not support multi-site functionality). An attempt to do so results in a REPLUPGRADESEC error. Likewise, if an active source server on an instance finds more than one source server (active or passive) running on that instance at the time it connects to a dual-site secondary, it issues a REPLUPGRADESEC error.

On a propagating primary instance, a source server that connects to a dual-site tertiary issues a REPLUPGRADESEC error. In addition, starting an active source server or activating a passive source server in a propagating primary issues a REPLUPGRADEPRI error if the receiver server is actively connected to a dual-site primary.

In an instance, a maximum of 16 active and/or passive source servers are allowed at any point in time. If 16 source servers are already running and another source server startup is attempted, it issues a SRCSRVTOOMANY error message.
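
The startup checks in this section can be collected into one illustrative function. This is a sketch under stated assumptions: the order in which the checks run is not given in this bulletin and is chosen arbitrarily here, the REPLUPGRADESEC and REPLUPGRADEPRI cases are not modeled, and all names are hypothetical.

```python
MAX_SOURCE_SERVERS = 16


def check_source_start(own_name, secondary, running, pool_updates_enabled, want_root):
    """Return the error mnemonic a source server start would issue, or None.

    running: set of secondary instance names that already have a source server.
    pool_updates_enabled: None if no journal pool exists yet, else True/False.
    want_root: True for -rootprimary, False for -propagateprimary
               (whether given explicitly or by default).
    """
    if pool_updates_enabled is not None:
        # The requested role must agree with an existing journal pool.
        if want_root and not pool_updates_enabled:
            return "PRIMARYNOTROOT"
        if not want_root and pool_updates_enabled:
            return "PRIMARYISROOT"
    if secondary == own_name:
        return "REPLINSTNMSAME"
    if secondary in running:
        return "SRCSRVEXISTS"
    if len(running) >= MAX_SOURCE_SERVERS:
        return "SRCSRVTOOMANY"
    return None
```

For instance, starting a second source server on root primary A for secondary C while one already runs for B passes every check, whereas repeating the start for B reports SRCSRVEXISTS.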

To transition an instance from being a root primary to a secondary, it must first be brought back up as a secondary of the current (new) root primary instance. Bringing it up as a secondary causes a lost transaction file to be created. This lost transaction file needs to be applied on the current root primary instance followed by running a mupip replic -source -losttncomplete command on the current root primary instance (as discussed elsewhere, during this time, either the former root primary should be running as a secondary to the current root primary, or the mupip replic -source -losttncomplete command should be separately run on the former root primary). Once the mupip replic -source -losttncomplete command has been run, the former root primary instance can be brought up as the secondary of any propagating primary. An attempt to bring a former root primary up as a secondary of a propagating primary instance prior to the mupip replic -source -losttncomplete command causes the receiver server and the source server (running respectively on the secondary and primary instances) to issue a PRIMARYNOTROOT error, when they first communicate with each other.

[Note]

The former root primary instance can be brought up as the secondary of any propagating primary, once a mupip replic -source -losttncomplete command has been successfully run, either on the current (new) primary when the former root primary instance is connected to the current root primary, or when the command has been run separately on both the current primary and the former primary.

MUPIP REPLIC –RECEIVER –START

If the journal pool has updates enabled (i.e. journal pool was created by a source server startup command that explicitly specified or implicitly assumed –rootprimary), starting the receiver server on that instance issues a PRIMARYISROOT error message.

If the receiver server finds any active source server running on the secondary instance at the time it connects to a primary running a prior version of GT.M, which does not support multi-site replication, it issues a REPLUPGRADEPRI error.

If an instance transitions from being a root primary to a secondary, it is necessary that it be brought up as a secondary of the current root primary instance until a mupip replic -source -losttncomplete command has been run implicitly or explicitly on that instance. Not doing so causes the receiver server and the source server (running respectively on the secondary and primary instances) to issue a PRIMARYNOTROOT error, when they first communicate with each other.

Effective GT.M V5.1-000, the receiver server is an additional process attaching to the journal pool, which means that the last source server running on any instance should be shut down only after shutting down the running receiver server. Otherwise, a message is logged in the source server log indicating that the journal pool shared memory was not removed as processes were still attached to it.

MUPIP REPLIC –SOURCE –ACTIVATE

If no source server is running for the secondary instance name specified in the activation command (the new -instsecondary qualifier), a SRCSRVNOTEXIST error is issued.

Assuming the passive source server is already running, there are four cases:

  1. If –rootprimary is specified explicitly or implicitly and the journal pool has updates enabled, this command sets a flag in the journal pool to document the impending activation and returns successfully. This flag is detected within a few seconds by the concurrently running passive source server, which then transitions to being an active source server.

  2. If –rootprimary is specified explicitly or implicitly and the journal pool has updates disabled, this command checks if there are any processes attached to the journal pool other than just the single passive source server. If yes, then it issues a JNLPOOLACTIVATE error message and exits. If not, this command enables updates in the journal pool, updates the replication instance file to indicate that the instance is henceforth a root primary, sets a flag in the journal pool to document the impending source server activation and returns successfully. This flag is detected within a few seconds by the concurrently running passive source server, which then transitions to being an active source server.

  3. If –propagateprimary is specified explicitly or implicitly and the journal pool has updates enabled, this command issues a PRIMARYISROOT error message.

  4. If –propagateprimary is specified explicitly or implicitly and the journal pool has updates disabled, this command sets a flag in the journal pool to document the impending activation and returns successfully. This flag is detected within a few seconds by the concurrently running passive source server, which then transitions to being an active source server.
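
The four activation cases can be expressed as a small decision function. This is a minimal sketch, assuming the passive source server is already running; the function name, the "ACTIVATED" label and the process count parameter are illustrative and not GT.M identifiers.

```python
def activate(want_root, pool_updates_enabled, other_attached_processes):
    """Model the activate command: return (outcome, updates_enabled_after).

    other_attached_processes: processes attached to the journal pool
    besides the single passive source server.
    """
    if want_root and pool_updates_enabled:
        return "ACTIVATED", True                 # case 1
    if want_root and not pool_updates_enabled:   # case 2
        if other_attached_processes > 0:
            return "JNLPOOLACTIVATE", False      # pool left unchanged
        return "ACTIVATED", True                 # instance becomes root primary
    if pool_updates_enabled:
        return "PRIMARYISROOT", True             # case 3
    return "ACTIVATED", False                    # case 4
```

Only case 2 changes the state of the journal pool; the other successful cases merely flag the passive source server to become active.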

MUPIP REPLIC –SOURCE –DEACTIVATE

If no source server is running for the secondary instance name specified in the deactivation command (the new -instsecondary qualifier), a SRCSRVNOTEXIST error is issued.

Assuming the active source server is already running, there are four cases.

  1. If –rootprimary is specified explicitly or implicitly and the journal pool has updates enabled, this command sets a flag in the journal pool to document the impending deactivation and returns successfully. This flag is detected within a few seconds by the concurrently running active source server, which then transitions to being a passive source server.

  2. If –rootprimary is specified explicitly or implicitly and the journal pool has updates disabled, this command issues a PRIMARYNOTROOT error message.

  3. If –propagateprimary is specified explicitly or implicitly and the journal pool has updates enabled, this command issues a PRIMARYISROOT error message.

  4. If –propagateprimary is specified explicitly or implicitly and the journal pool has updates disabled, this command sets a flag in the journal pool to document the impending deactivation and returns successfully. This flag is detected within a few seconds by the concurrently running active source server, which then transitions to being a passive source server.
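
Because deactivation never changes whether updates are enabled, its four cases reduce to a lookup table. The sketch below assumes the active source server is already running; the table name and the "DEACTIVATED" label are illustrative only.

```python
# Keyed by (rootprimary requested?, journal pool has updates enabled?).
DEACTIVATE_OUTCOME = {
    (True, True): "DEACTIVATED",       # case 1
    (True, False): "PRIMARYNOTROOT",   # case 2
    (False, True): "PRIMARYISROOT",    # case 3
    (False, False): "DEACTIVATED",     # case 4
}


def deactivate(want_root, pool_updates_enabled):
    """Return the outcome of a deactivate request for a running source server."""
    return DEACTIVATE_OUTCOME[(want_root, pool_updates_enabled)]
```

The command succeeds only when the requested role agrees with the current state of the journal pool.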

Starting up an instance

In the GT.M user documentation, unless the distinction is important in the context, a root primary is referred to simply as primary and a propagating primary is referred to simply as secondary. Note that any secondary can also be a propagating primary.

In order to bring up an instance as a primary, user startup scripts run the same steps they previously did to bring up the instance as primary. In addition, they can now bring up multiple source servers. As before, to create the journal pool, at least one source server must be brought up before any GT.M processes are activated. However, additional source servers can be brought up any time.

In order to bring up an instance as a secondary, user startup scripts run the same steps they previously did to bring up the instance as secondary. In addition, they can now bring up one or more active or passive source servers as well if they want to replicate to a tertiary. Note that the tertiary, if needed, can be brought up at any convenient time before or after the secondary is brought up. The replication stream between the secondary and tertiary is established when both are up and running, and connected to each other.

When a primary or secondary is upgraded to GT.M V5.1-000 and is brought up for the first time, the receiver server startup command on the secondary should additionally specify the –updateresync qualifier. Not specifying this qualifier causes the receiver server to fail at startup with a REPLINSTNOHIST message. The secondary can be shut down and brought up without the -updateresync qualifier only after at least one database update has been sent through the replication pipe and processed by the update process.

Shutting down an instance

In order to bring down an instance that is running as primary, ALL source servers (active or passive) running on the instance must be shut down. This means that one needs to first shut down all GT.M, mupip reorg, mupip load etc. processes that could be attached to the journal pool and then shut down all active and/or passive source servers (using individual mupip replic -source -shutdown -instsecondary commands or one mupip replic -source -shutdown command without an -instsecondary qualifier).

In order to bring down an instance that is running as secondary, there is no change unless that secondary is also acting as a primary (i.e. it is a propagating primary). In such a case, after shutting down the secondary related servers (receiver server, update process and its helper processes), one also needs to shut down the primary related servers (all active and/or passive source servers). On the secondary, there should not be any other GT.M or MUPIP processes attached to the journal pool, as updates are disabled (they are enabled only on the primary). If there are processes like mupip reorg that operate on the database but do not attach to the journal pool, these processes also need to be brought down. Finally, a mupip rundown -reg * needs to be performed to ensure that the database, journal pool and receive pool shared memory are all run down.

At shutdown, a source server (not necessarily the one originally brought up first) must be the last GT.M process to be shut down; it cleans up the journal pool when it exits.

Primary and Secondary transitions

Any instance can undergo one of the following four state transitions:

  1. primary coming up as primary

  2. secondary coming up as primary

  3. primary coming up as secondary

  4. secondary coming up as secondary

There is no change to the sequence of commands to shut down a primary or secondary and bring it back as a primary or a secondary other than those already outlined in Starting up an instance or Shutting down an instance.

Some points to note are:

  1. A primary must be shut down before it can be brought up as a secondary; that is, it is not possible to run mupip replic -source -deactivate on the primary to transition it on the fly to a secondary.

  2. A secondary, on the other hand, can be brought up as a primary without being shut down. To do that, one needs to issue a mupip replic -receiv -shut (to shut down the receiver server, update process and helper processes) followed by a mupip replic -source -activate -secondary= command to make a passive source server active.

    [Note]

    If an active source server is propagating updates, it can be deactivated to make it passive and then activated with the -rootprimary qualifier.

    This also makes it the root primary (unless the -propagateprimary qualifier is used, in which case it remains a secondary). Note that this works only if a single source server is running on the secondary and it is a passive server. Otherwise, all active source servers, and all passive source servers except one, must be brought down before that remaining passive source server is activated.

  3. Any instance coming up as secondary must do a mupip journal -rollback -fetchresync with the current primary, regardless of whether it was previously a secondary or primary. Not doing this could cause errors (secondary ahead of primary etc.) during receiver server startup.

Refreshing secondary from backup of primary

With GT.M V5.1-000, there are minor changes in the procedures used to recover the secondary when the databases/journals get out-of-sync with the primary.

Previously, the procedure to recover the secondary was to take an online backup of the primary databases, ship it to the secondary, create fresh journals on the secondary, and start the receiver server on the secondary with the -updateresync qualifier. With GT.M V5.1-000, there is an additional step: recreate the replication instance file (at the time fresh journals are created) with the same name as before. There are also more choices of backup source: one can take an online backup from either the root primary or another secondary.

Optionally, to reduce network bandwidth and system load on the primary from which the new instance has to catch up, the mupip journal -recover -forward command could previously be used to apply updates from journal files created on the primary after the backup, before bringing the new instance up as a secondary. This procedure has now changed.

The new procedure to bring up a secondary instance from a backup is described below:

  1. Restore the backup (from a primary or another secondary instance). Note that in the event of a multi-region database, the backup must be consistent across all regions.

  2. Optionally, apply the journal files (from the same instance on which the database backup was taken) with a mupip journal -recover -forward command. There can be multiple journal files for each database region. For each region, perform a mupip journal -show=header -forward on the latest journal file, and note the largest End Region Sequence Number displayed. Use the dse change -file -reg_seqno command on the corresponding region to set the region sequence number to be equal to the noted number. Perform the same for all replicated regions in the instance.

  3. Create new journal files for the restored database regions.

  4. Create a new instance file with the same name as the previous instance file on the secondary instance.

  5. Start a receiver server with the -updateresync qualifier.

Lost transaction processing

A mupip journal -rollback -fetchresync command is now always required whenever a secondary is brought up. Consider the case where A is replicating to B and C, B is replicating to D, and C is replicating to E. After a failover in which B becomes the new primary, C needs to use the -fetchresync qualifier before coming up as a secondary to B. D continues operating normally. In the event C's -fetchresync requires it to roll back some transactions, E (and any other instance that is part of a tree with C as its root) must also be brought down and must perform a mupip journal -rollback -fetchresync before it can be brought back up. When the original primary, A, becomes a new secondary (to B), it must first execute a mupip journal -rollback -fetchresync. These mupip journal -rollback -fetchresync commands potentially generate lost transaction files on the respective secondaries, but the only one that needs to be applied to B is the lost transaction file from A, the original primary. The format of the lost transaction file has therefore been altered to make it easy to distinguish whether the file needs to be sent to the new primary for reprocessing.
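
The cascade in this scenario can be sketched as a walk over the replication tree. This is only an illustration and not a GT.M utility: the dictionary layout and the function name instances_to_resync are assumptions introduced here; only the instance names A through E and the rule itself come from the text above.

```python
# Hypothetical sketch: which instances must perform
# "mupip journal -rollback -fetchresync" again if a given secondary
# had to roll back transactions after a failover.
# The topology mirrors the example in the text: A -> B, C; B -> D; C -> E.

def instances_to_resync(tree, rolled_back):
    """Return rolled_back followed by every instance below it in the tree.

    tree maps an instance name to the list of instances it replicates to.
    """
    pending = [rolled_back]
    result = []
    while pending:
        inst = pending.pop(0)
        result.append(inst)
        # Every instance that receives updates from inst is affected too.
        pending.extend(tree.get(inst, []))
    return result


topology = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}

# If C's fetchresync rolls back transactions, E must be resynchronized too.
print(instances_to_resync(topology, "C"))
```

D does not appear in the result for C because it hangs off B, which matches the statement that D continues operating normally.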

A new field in the first line of the lost transaction file identifies a lost transaction file that needs to be applied. Previously, the first line featured the format of the file as the first and only field and it was a string of the form GDSJEXnn where nn was the existing format of the lost transaction file.

The current version features multiple fields in the first line. These are as follows:

  • The first field displays GDSJEX03 signifying the file format version.

  • The second field indicates whether it is rollback or recover that created the lost transaction file.

    • If the second field is rollback, a third field indicates whether the instance was a primary or secondary immediately prior to the lost transaction file being generated. If it was a primary, the lost transaction file is termed a primary lost transaction file and it needs to be applied on the new primary. A fourth field holds the name of the replication instance on which the lost transactions were generated. This instance name should be used as the -instsecondary qualifier in the mupip replic -source -needrestart command when the lost transactions are applied at the new primary.
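
The field layout above lends itself to a small check of whether a given lost transaction file has to be applied on the new primary. The following is a minimal sketch under stated assumptions: the bulletin does not specify the field delimiter, so whitespace separation is assumed; keyword comparison is made case-insensitive as a precaution; the function name parse_losttn_header and the instance name instA are hypothetical.

```python
# Hypothetical sketch: interpret the first line of a lost transaction file.
# ASSUMPTION: fields are separated by whitespace; the bulletin does not
# state the delimiter. Keywords are compared case-insensitively.

def parse_losttn_header(first_line):
    fields = first_line.split()
    if not fields or not fields[0].startswith("GDSJEX"):
        raise ValueError("not a lost transaction file header")
    info = {
        "format": fields[0],                                   # e.g. GDSJEX03
        "created_by": fields[1].lower() if len(fields) > 1 else None,
        "role": None,
        "instance": None,
    }
    # The third and fourth fields are only described for rollback files.
    if info["created_by"] == "rollback":
        if len(fields) > 2:
            info["role"] = fields[2].lower()
        if len(fields) > 3:
            info["instance"] = fields[3]
    # Only a primary lost transaction file needs to go to the new primary.
    info["apply_on_new_primary"] = info["role"] == "primary"
    return info


header = parse_losttn_header("GDSJEX03 ROLLBACK PRIMARY instA")
print(header["apply_on_new_primary"], header["instance"])
```

The instance name returned here is the value that would be passed as the -instsecondary qualifier of the -needrestart command.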

Except following a failover when the backlog is zero, whenever a former primary comes up as a new secondary, there is a lost transaction file. This lost transaction file should be applied at the new primary as soon as practicable, because there could be additional lost transaction files in the event of other failovers (for example, triggered by a failure of the new primary) before the lost transaction file is processed. These additional lost transaction files can complicate the logic needed for lost transaction processing.

Lost transactions can be applied on the new primary either manually or in a semi-automated fashion using the M-intrinsic function $ZQGBLMOD(). If $ZQGBLMOD() is used, two additional steps (mupip replicate -source -needrestart and mupip replicate -source -losttncomplete) are to be performed as part of lost transaction processing, regardless of whether replication is configured in dual-site or multi-site mode. Failure to run these steps can cause $ZQGBLMOD() to return false negatives that in turn can result in application data consistency issues.

MUPIP REPLIC –SOURCE –NEEDREST[ART]

The mupip replic -source -needrestart command should be invoked once for each lost transaction file that needs to be applied. It should be invoked on the new primary before applying lost transactions from that file. The -instsecondary qualifier needs to be specified in this command to provide the name of the secondary instance where this lost transaction file was generated. If it is not specified, the environment variable gtm_repl_instsecondary is implicitly assumed to hold the name of the secondary instance. If that is not defined either, a REPLINSTSECUNDF error is issued.

 mupip replic -source -needrestart -instsecondary=...
 

The purpose of this command is to check whether the primary has ever communicated with the specified secondary instance (that is, whether the receiver server or a fetchresync rollback on the secondary instance communicated with the source server) since the primary was brought up. If so, this command displays the message SECONDARY INSTANCE xxxx DOES NOT NEED TO BE RESTARTED, indicating that the secondary instance did communicate with the primary and hence does not need to be restarted. If not, this command displays the message SECONDARY INSTANCE xxxx NEEDS TO BE RESTARTED FIRST. In this case, the specified replication instance needs to be brought up as a secondary before the lost transactions from this file are applied. Failure to do so before applying the corresponding lost transactions causes $ZQGBLMOD() to return false negatives, which can result in application data inconsistencies as mentioned earlier.
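
A script that drives lost transaction processing could gate the apply step on which of the two messages appears. The sketch below assumes the message text appears exactly as quoted above and that the command output has already been captured as a string; how it is captured is outside the scope of this example. The function name and the instance name instB are hypothetical.

```python
import re

# Hypothetical sketch: classify captured -needrestart output.
# ASSUMPTION: the message wording matches the text quoted in this bulletin.
_NO_RESTART = re.compile(r"SECONDARY INSTANCE (\S+) DOES NOT NEED TO BE RESTARTED")
_RESTART = re.compile(r"SECONDARY INSTANCE (\S+) NEEDS TO BE RESTARTED FIRST")


def safe_to_apply_lost_transactions(output):
    """Return (instance_name, safe) where safe is True if no restart is needed."""
    match = _NO_RESTART.search(output)
    if match:
        return match.group(1), True
    match = _RESTART.search(output)
    if match:
        # The instance must first be brought up as a secondary.
        return match.group(1), False
    raise ValueError("unrecognized -needrestart output")


print(safe_to_apply_lost_transactions(
    "SECONDARY INSTANCE instB NEEDS TO BE RESTARTED FIRST"))
```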

In the event the lost transaction file was generated from the same instance to which it is to be applied, a mupip replicate -source -needrestart command is not required and, if attempted, results in a REPLINSTNMSAME error.

It is necessary for the secondary instance (specified in the -needrestart command) to be brought up as an immediate secondary of the current primary. If it is brought up as a tertiary to the current primary (through a different intermediate secondary instance), the -needrestart command unconditionally considers the tertiary instance as not having communicated with the primary even though the tertiary might be up and running.

[Note]

The source server on the primary and/or the receiver server or fetchresync rollback on that particular secondary need not be up and running at the time this command is run. It is sufficient that they were up at some point after the primary instance was brought up.

This command protects against a case where the instance that was primary when the lost transaction file was generated is different from the instance that is primary when the lost transactions are applied (note that even though they can never be different in a dual-site configuration, use of this command is still required).

$ZQGBLMOD() relies on two fields in the database file header of the primary instance being set appropriately. Prior versions of GT.M displayed one of these fields as Resync Trans (using the dse dump -fileheader command). This is now changed to Zqgblmod Trans in GT.M V5.1-000. A new field, Zqgblmod Seqno, has been introduced in GT.M V5.1-000 and it is displayed by the dse dump -fileheader command.

In a dual-site configuration, there are only two instances and both of them participate in the determination of the journal sequence number to which to roll back the secondary when a rollback -fetchresync is run. This makes it possible for both the instances to record the same sequence number information (Zqgblmod Seqno and the corresponding transaction number Zqgblmod Trans) in their respective file headers at the same time that lost transactions are generated, without requiring any additional commands.

In a multi-site configuration, there are more than two instances, and no instances other than the primary and secondary involved in the rollback -fetchresync participate in the sequence number determination. Hence, the other instances do not have their Zqgblmod Seqno (and hence Zqgblmod Trans) set when that particular lost transaction file is generated. If any of the non-participating instances is brought up as the new primary and that particular lost transaction file is applied on the primary, the return values of $ZQGBLMOD() will be unreliable since the reference point (Zqgblmod Trans) was not set appropriately. Hence, this command checks whether the secondary instance where the lost transaction file was previously generated has communicated with the current primary instance after it came up as the primary. If it has, the Zqgblmod Seqno and Zqgblmod Trans fields would have been appropriately set and hence $ZQGBLMOD() values will be correct.

The -needrestart qualifier is incompatible with all source server qualifiers except -instsecondary.

MUPIP REPLIC –SOURCE –LOSTTN[COMPLETE]

Once lost transaction files from ALL secondaries that went down as primaries and came up as secondaries have been applied on the new primary, the following command should be run on the new primary to record the fact that lost transaction processing using $ZQGBLMOD() is complete.

  mupip replic -source -losttncomplete
  

This command updates the Zqgblmod Seqno and Zqgblmod Trans fields (displayed by a dse dump -fileheader command) in the database file headers of all regions in the global directory to the value 0. Doing so causes a subsequent $ZQGBLMOD() to return the safe value of one unconditionally until the next lost transaction file is created.

Ideally, this command should also be run on each of the secondary instances.

However, the command need not be run explicitly, as the same effect is implicitly achieved if either of the following conditions is true:

  • The secondary is already connected to the primary at the time the command is run on the primary.

  • The secondary is not connected at the time the command is run. However, it will eventually connect to the primary, before the primary is brought down after running the command.

The command has to be run on an instance either explicitly or implicitly to prevent a future $ZQGBLMOD() on that instance from returning false positives when applying future lost transactions. This ensures accuracy of future $ZQGBLMOD().

Running the mupip replic -source -losttncomplete command explicitly or implicitly on a former primary instance is required before that instance can be started as a tertiary; otherwise Zqgblmod Seqno remains non-zero, and this could cause a PRIMARYNOTROOT error to be issued while starting a receiver server on that instance against a propagating primary source server.

The -losttncomplete qualifier is incompatible with all other mupip replicate -source qualifiers.

MUPIP REPLIC –EDIT[INSTANCE]

This command is used to display or edit the contents of the replication instance file. The instance file name should be specified as the last parameter in this command. Not specifying the instance file name causes a MUPCLIERR error to be issued. Although editing the instance file does not require standalone access to the instance file, it should be avoided as far as possible when concurrent processes are reading/writing from/to it.

This command issues a FILENOTFND error if the instance file is not found and a REPLINSTFMT error message if the replication instance file was created by a GT.M version prior to V5.1-000.

The -editinstance qualifier is incompatible with other mupip replic qualifiers i.e. -receiver, -source, -updateproc, -updhelper, or -instance_create. To display the contents of the instance file use the -show qualifier i.e. a mupip replic -edit -show <instance-file> command. This command displays each of the following sections.

  1. File Header

  2. Source Server Slots

  3. History Records

If an optional -detail qualifier is additionally specified, all fields within each section are displayed along with their offset from the beginning of the file and the size of each field. This is used when there is a need to edit the instance file. To edit any displayed field, use the -change qualifier. In addition, one can specify the offset to change using the -offset qualifier, the size of the change using the -size qualifier and the new value with the -value qualifier.

The entire command to effect any edits is mupip replic -edit -change -offset= -size= -value= <instance-file>.

The -size qualifier can be 1, 2, 4, or 8, indicating the number of bytes to be changed. Specifying any other value causes a SIZENOTVALID8 error to be issued.

The -offset qualifier takes a hexadecimal value that is a multiple of the size qualifier; otherwise, an appropriate error message is issued. Similarly, if the offset specified is greater than the size of the instance file, an appropriate error message is issued.

The -value qualifier specifies the new hexadecimal value at the specified offset. Not specifying the -value qualifier displays the current value at the specified offset and does not perform any changes. Specifying -value makes the changes and both the old and new values are displayed.

The -detail qualifier is incompatible with -change. Only one of -change or -show can be specified in a single command line. The -offset, -size and -value qualifiers are only compatible with -change.
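
The -offset and -size rules above can be summarized as a pre-flight check that a wrapper script might perform before invoking the command. This is a sketch only and does not reproduce MUPIP's own validation: the function name check_change_args is hypothetical, and apart from SIZENOTVALID8 the error texts are placeholders because the bulletin does not name the other messages.

```python
# Hypothetical sketch of the documented -offset / -size rules.
# Only SIZENOTVALID8 is named in the text; other messages are placeholders.

VALID_SIZES = (1, 2, 4, 8)


def check_change_args(offset_hex, size, file_size):
    """Return the offset as an integer if it passes the documented checks."""
    if size not in VALID_SIZES:
        raise ValueError("SIZENOTVALID8: size must be 1, 2, 4, or 8")
    offset = int(offset_hex, 16)  # -offset takes a hexadecimal value
    if offset % size != 0:
        raise ValueError("offset must be a multiple of size")
    if offset > file_size:
        raise ValueError("offset is greater than the size of the instance file")
    return offset


# Offset 0x10 with an 8-byte field in a (hypothetical) 4096-byte file.
print(check_change_args("10", 8, 4096))
```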

[Note]

The instance file should be edited only when so instructed by Fidelity GT.M Support.

MUPIP REPLIC -SOURCE –JNLPOOL

The mupip replic -source -jnlpool command allows one to display and edit the contents of the journal pool. The -show qualifier displays the contents of the journal pool. The optional qualifier -detail prints offsets and size of each field.

The mupip replic -source -jnlpool -show -detail command displays the contents of the journal pool along with the offsets.

The -change qualifier allows one to edit the journal pool contents. The additional qualifiers -offset, -size, -value specify the actual changes as in the mupip replic -edit[instance] command.

The -detail qualifier is incompatible with -change. Only one of -change or -show can be specified in a single command line. The -offset, -size and -value qualifiers are only compatible with -change.

[Note]

This command should be used only when so instructed by Fidelity GT.M Support.

DSE DUMP -FILE[HEADER]

In previous versions, dse dump -fileheader printed the field Resync Seqno, which represented the journal sequence number up to which both the primary and secondary were synchronized. With logical dual-site, there were only two instances at any point in time and hence, from the perspective of each instance, there was only one other instance. This made it possible to store the synchronization point with respect to the other instance in the database file headers of each instance. With multi-site, a given instance can act as a primary for more than one secondary instance and each secondary can be synchronized at a different sequence number with respect to the primary. This information has been moved to the replication instance file, where it is maintained separately for each secondary instance. In GT.M V5.1-000, dse dump -fileheader no longer prints the Resync Seqno field.

In prior versions of GT.M, Resync Trans was printed on the same line as Resync Seqno. This field is named Zqgblmod Trans in GT.M V5.1-000 and it serves as the reference transaction number for the M function $ZQGBLMOD(). In addition, the journal sequence number corresponding to Zqgblmod Trans is displayed in the place earlier occupied by Resync Seqno as Zqgblmod Seqno.

Thus, the last line of the dse dump -fileheader output will change from:

  Resync Seqno           0x0000000000000001  Resync Trans    0x0000000000000001
  

to

  Zqgblmod Seqno         0x0000000000000001  Zqgblmod Trans  0x0000000000000001
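
Monitoring scripts that read this line need to accept either pair of labels during a rolling upgrade. A minimal sketch follows, assuming the layout is exactly as in the two sample lines shown here (a label followed by a 0x-prefixed hexadecimal value); the function name parse_seqno_line is hypothetical.

```python
import re

# Hypothetical sketch: extract the two fields from the last line of
# dse dump -fileheader output, whichever labels the GT.M version prints.
_FIELD = re.compile(
    r"(Zqgblmod Seqno|Zqgblmod Trans|Resync Seqno|Resync Trans)\s+0x([0-9A-Fa-f]+)"
)


def parse_seqno_line(line):
    """Return a dict mapping each label found on the line to its integer value."""
    return {label: int(value, 16) for label, value in _FIELD.findall(line)}


new_style = "  Zqgblmod Seqno         0x0000000000000001  Zqgblmod Trans  0x0000000000000001"
print(parse_seqno_line(new_style))
```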
  

DSE CHANGE -FILE[HEADER]

Due to the changes discussed in DSE DUMP -FILE[HEADER], the dse change -fileheader command no longer supports the -resync_seqno and -resync_tn qualifiers. Instead, it supports -zqgblmod_seqno and -zqgblmod_tn to modify the Zqgblmod Seqno and Zqgblmod Trans fields of the database file header, respectively.

[Note]

In this rare instance, a GT.M command from a prior version is not upward compatible in GT.M V5.1-000.

MUPIP JOURNAL -ROLLBACK

If an instance transitions from being a root primary to a secondary, it must be brought up as a secondary of the current root primary instance until a mupip replic -source -losttncomplete command has been run implicitly or explicitly on that instance. Not doing so causes a PRIMARYNOTROOT error to be issued by any mupip journal -rollback -fetchresync command that is run on this former primary instance trying to start up as a tertiary. The source server on the propagating primary instance that the rollback communicates with also issues the same error. Note that once a mupip replic -source -losttncomplete command has been successfully run on the instance, the rollback command can be run to enable bringing up the instance as a tertiary instance.

MUPIP BACKUP

With GT.M V5.1-000, mupip backup has the ability to take a consistent online or offline backup of the replication instance file through a new optional qualifier -replinstance=<destination-file-or-directory>.

If a destination directory is specified, the instance file is backed up in the destination directory with the same name as the instance file. This qualifier can be specified along with a list of database regions to back up, in which case a consistent backup of the instance file as well as all the database regions specified is obtained as of the start of the backup. This qualifier can also be specified without any database regions, in which case only the replication instance file is backed up. In this case, none of the other mupip backup qualifiers is applicable.
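
For scripts that need to locate the backed-up instance file afterwards, the destination rule can be expressed as follows. This is an illustrative sketch, not part of MUPIP; the function name and the file name multisite.repl are hypothetical.

```python
import os

# Hypothetical sketch: where the instance file backup ends up for a given
# -replinstance destination, following the rule described in the text.


def replinstance_backup_path(instance_file, destination):
    """Return the expected path of the backed-up replication instance file."""
    if os.path.isdir(destination):
        # A directory destination keeps the instance file's own name.
        return os.path.join(destination, os.path.basename(instance_file))
    # Otherwise the destination is taken to be the backup file itself.
    return destination


print(replinstance_backup_path("/gtm/repl/multisite.repl", os.getcwd()))
```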

Rolling Upgrade

From a GT.M Dual-site version to a GT.M Multi-site version

Before using any features of the multi-site version of GT.M, both the primary and secondary in the existing dual-site configuration must be upgraded to the new version of GT.M that supports multi-site replication. This GT.M version upgrade can be done using a rolling software upgrade procedure that provides continuous application availability during the upgrade as discussed in the documentation and as summarized below.

Assuming that Site A and Site B are the two sites and that Site A is the primary before the upgrade, here are the steps to perform the rolling upgrade:

  • Site A continues to operate normally.

  • On Site B:

    • Shut down application processes, and the receiver & source servers.

    • Run down the database and make a backup copy.

    • Upgrade GT.M to the new version that supports multi-site replication.

    • If this is a version upgrade from GT.M V4 to GT.M V5, the database also needs to be upgraded using the procedure in the GT.M Database Migration Technical Bulletin. Note that for the sake of simplicity, Fidelity recommends separating the two upgrades.

    • Recreate the replication instance file using mupip replic -instance_create from the new GT.M version. This is required as the format of the instance file has changed. Note that an instance name now has to be specified while creating a replication instance file.

    • Bring up the secondary on Site B. The receiver server should be started with the -updateresync qualifier.

  • At this point, dual-site replication is restored with Site B running the new GT.M multi-site version and Site A running the old GT.M dual-site version. Operate in this mode as needed to verify correct operation.

  • Perform a controlled failover. Make Site A the secondary and Site B the primary.

  • Site B continues to operate normally as the new primary.

  • Upgrade Site A following the same procedure used to upgrade Site B.

  • At this point, dual-site replication is restored with both sites running the new GT.M multi-site version. Operate in this mode as needed to verify correct operation.

  • Perform a controlled failover. Make Site A the primary and Site B the secondary. Operate in this mode as needed to verify correct operation.

  • You can now add additional secondaries and tertiaries as appropriate.

Error Messages

ACTIVATEFAIL

Failed to activate passive source server for secondary instance name xxxx

Severity:

Error

MUPIP Error:

This error is issued by a mupip replic -source -activate command when it tries to activate a passive source server and at the same time switch the instance from being a secondary (propagating primary) to a root primary. It is possible to switch an instance from being a secondary to a root primary by activating an already running passive source server on that instance while specifying the -rootprimary qualifier. However, no process other than that one passive source server may be running on the instance. Otherwise, this error is issued.

Action:

Shut down all processes, other than the passive source server, that are running on that replication instance (source servers, receiver server, update process, GT.M processes, etc.) and are accessing the journal pool, and then reissue the mupip replic -source -activate command.

PRIMARYISROOT

Attempted operation not valid on root primary instance xxxx

Severity:

Error

MUPIP Error:

If a replication instance is a root primary (the journal pool already exists and was created by a source server command that specified rootprimary), issuing a source server command with the start, activate or deactivate qualifiers that has the propagateprimary qualifier explicitly specified (or implicitly assumed) or attempting to start a receiver server on this instance will cause this error to be issued.

Action:

Do not start receiver server on a root primary instance. Use rootprimary qualifier instead of propagateprimary in the source server command.

PRIMARYNOTROOT

Attempted operation not valid on non-root primary instance xxxx

Severity:

Error

MUPIP Error:

If a replication instance is not a root primary (the journal pool already exists and was created by a source server command that specified propagateprimary), issuing a source server command with the start or deactivate qualifiers that has the rootprimary qualifier explicitly specified (or implicitly assumed) on this instance will cause this error to be issued. This error can also be issued by the receiver server or mupip rollback if the instance that the source server is running on is not a root primary and it connects to a receiver server or a mupip journal -rollback -fetchresync running on an instance that was formerly a root primary and has not yet had a mupip replic -source -losttncomplete command run either explicitly or implicitly on it.

Action:

Use propagateprimary qualifier instead of rootprimary in the source server command. If this error is issued by the receiver server or fetchresync rollback, the secondary instance has to be brought up as the secondary of a root primary since it was a root primary immediately before this. The rule is that any instance that was previously a root primary should be brought up as a secondary of the new root primary. This will create a lost transaction file that needs to be applied on the new root primary. Once that is done, a mupip replic -source -losttncomplete command should be run either explicitly or implicitly on this instance before trying to bring this up as a secondary of a propagating primary.

REPLINSTDBMATCH

Replication instance file xxxx has seqno xxxx while database has a different seqno yyyy

Severity:

Error

MUPIP Error:

This error is issued by the first source server that is started on a replication instance, or by a mupip journal -rollback command, if the journal sequence number stored in the instance file does not match that stored in the database file header. This is possible if the database was recreated or refreshed from a backup on another instance without correspondingly recreating the instance file.

Action:

If this instance is not the root primary, this error can be handled by restoring both the database and the instance file from a previous backup (consistent backup of the instance file AND database files taken together at the same time) and restarting the instance. Subsequent to such a restore, all transactions since the last backup will be sent across from this instance's primary. Alternatively, this can be handled by shipping a copy of the database from any other instance (either the primary or any other secondary/tertiary), recreating the instance file and starting this instance as a secondary with the -updateresync qualifier. In either case, this procedure has to be repeated on all tertiary instances etc. that descend from this instance ensuring that for every primary-secondary instance pair, the secondary is not ahead of the primary in terms of journal sequence number. If this instance is the root primary, restoring from a prior backup may not be viable as it may mean loss of transactions that occurred after the backup. The alternative way to handle this error is to recreate the instance file on the root primary, ship a copy of the database from the primary and recreate instance files on ALL secondaries (tertiaries etc.) and restart the secondaries with the -updateresync qualifier. In addition, report this occurrence to Fidelity GT.M Support.

REPLINSTFMT

Format error encountered while reading replication instance file xxxx. Expected yyyy. Found zzzz.

Severity:

Error

Runtime Error:

This error is issued by GT.M or MUPIP whenever it tries to open the replication instance file and finds that it was created with a format that the current version of GT.M cannot interpret.

Action:

Recreate the instance file using the mupip replic -instance_create command with the current version of GT.M.

[Note]

The REPLINSTCORRV message that was displayed in V5.0-000 has now been replaced by REPLINSTFMT.

REPLINSTNMLEN

Replication instance name xxxx should be 1 to 15 characters long

Severity:

Error

MUPIP Error:

This error is issued by the mupip replic -instance_create command if the instance name, specified either through the -name qualifier or through the environment variable gtm_repl_instname, is longer than 15 characters or is the empty string.

Action:

Specify a valid instance name that is 1 to 15 characters long.

REPLINSTNMSAME

Primary and Secondary instances have the same replication instance name xxxx

Severity:

Error

MUPIP Error:

This error is issued by any source server command where the -instsecondary qualifier specifies a secondary instance name that matches the name of the primary instance the command is started from.

Action:

Two instances should never have the same name. Recreate the instance file on the secondary with a different name and restart the receiver server with the -updateresync qualifier.

REPLINSTNMUNDEF

Replication instance name not defined

Severity:

Error

MUPIP Error:

This error is issued by the mupip replic -instance_create command if the -name qualifier was not specified and if the environment variable gtm_repl_instname is not defined either.

Action:

Specify the instance name using the -name qualifier.

REPLINSTNOHIST

History record for xxxx not found in replication instance file yyyy

Severity:

Error or Warning

MUPIP Error:

The source server or receiver server issues this message as an error, while mupip rollback issues it as a warning, when scanning the replication instance file for a history record corresponding to a journal sequence number that is less than the earliest sequence number or greater than the latest sequence number stored in the instance file. This means that the replication instance files on the primary and secondary have differing levels of history detail (possible if the instance file was later recreated on one instance) and that it is no longer possible to determine the sync point (resync seqno) between the two instances.

Action:

If mupip rollback issues this message, it truncates the replication instance file history. This means that if this instance is a secondary, it should be brought up with the -updateresync qualifier. If the source or receiver server issues this error, handle it by ensuring the primary and secondary databases are in sync (by shipping a copy of the database from the primary to the secondary if not already done), recreating the instance file on the secondary (if not already done), and starting the receiver server on the secondary with the -updateresync qualifier.

REPLINSTSECLEN

Secondary replication instance name xxxx should be 1 to 15 characters long

Severity:

Error

MUPIP Error:

This error is issued by any mupip replic -source command that specifies a secondary instance name. This error is issued if the secondary instance name was specified either through the -instsecondary qualifier or through the environment variable gtm_repl_instsecondary and if the name was longer than 15 characters or was the empty string.

Action:

Specify a valid secondary instance name that is 1 to 15 characters long.

REPLINSTSECMTCH

Secondary replication instance name xxxx sent by receiver does not match yyyy specified at source server startup

Severity:

Error

MUPIP Error:

This error is issued by a source server that connects to a receiver server on the secondary and finds that the secondary instance name sent by the receiver does not match the secondary instance name specified (with the -instsecondary qualifier) when the source server was started. The source server terminates after issuing this error.

Action:

Restart the source server with the correct -instsecondary qualifier value. Also make sure that the instance name in the -instsecondary qualifier and the host/port information in the -secondary qualifier of the source server startup command correspond to each other.

REPLINSTSECNONE

No slot found for secondary instance xxxx in instance file yyyy

Severity:

Error

MUPIP Error:

This error is issued by any mupip replic -source command that specifies a secondary instance name (except for the one that specifies -start). The instance file has 16 slots, which store information pertaining to up to 16 different secondary instances. When a secondary instance name is specified, an attempt is made to find, in the replication instance file, either the slot already used by the specified secondary instance or an unused slot. If no such slot is available, this error is issued.

Action:

Ensure the primary instance does not connect to more than 16 secondaries for the life of the instance file.
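The slot search described for REPLINSTSECNONE can be sketched as follows. This is an illustration under stated assumptions, not the GT.M implementation: find_slot is a hypothetical name, the instance file slots are modeled as a Python list with None marking an unused slot, and the exception the text makes for -start is not modeled.

```python
from typing import List, Optional

NUM_SLOTS = 16  # the text states the instance file holds 16 secondary slots


def find_slot(slots: List[Optional[str]], secondary: str) -> int:
    """Return the index of the slot for secondary: the slot it already
    uses if there is one, otherwise the first unused slot, which is then
    assigned to it. Raise LookupError when neither exists."""
    if secondary in slots:
        return slots.index(secondary)
    if None in slots:
        index = slots.index(None)
        slots[index] = secondary  # the slot stays assigned from now on
        return index
    raise LookupError(
        f"REPLINSTSECNONE: No slot found for secondary instance {secondary}")
```

Because an assigned slot is never released in this sketch, the seventeenth distinct secondary name fails, which matches the advice to stay within 16 secondaries for the life of the instance file.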

REPLINSTSECUNDF

Secondary replication instance name not defined

Severity:

Error

MUPIP Error:

This error is issued by any mupip replic -source command that requires a secondary instance name to be specified. The source server commands that require this name are those that specify any of -activate, -changelog, -deactivate, -needrestart, -start, -statslog or -stopsourcefilter. The secondary name can be specified either through the -instsecondary qualifier or through the environment variable gtm_repl_instsecondary. If neither is specified, this error is issued.

Action:

Specify the secondary instance name using the -instsecondary qualifier or the environment variable gtm_repl_instsecondary.
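The name checks behind REPLINSTSECUNDF and REPLINSTSECLEN can be sketched together. This is a hedged illustration only: resolve_secondary_name is a hypothetical function, and giving the qualifier precedence over the environment variable is an assumption, since the text says only that either source may supply the name.

```python
import os
from typing import Mapping, Optional

MAX_INSTNAME_LEN = 15  # from the message text: 1 to 15 characters long


def resolve_secondary_name(qualifier_value: Optional[str],
                           env: Mapping[str, str] = os.environ) -> str:
    """Pick the secondary instance name and validate its length."""
    # Assumption: the qualifier, when present, wins over the environment.
    name = qualifier_value
    if name is None:
        name = env.get("gtm_repl_instsecondary")
    if name is None:
        raise ValueError(
            "REPLINSTSECUNDF: Secondary replication instance name not defined")
    if not 1 <= len(name) <= MAX_INSTNAME_LEN:
        raise ValueError(
            f"REPLINSTSECLEN: Secondary replication instance name {name} "
            "should be 1 to 15 characters long")
    return name
```

Passing env explicitly, as in resolve_secondary_name(None, {"gtm_repl_instsecondary": "SITEB"}), makes the function easy to exercise without touching the real environment.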

REPLINSTSEQORD

ssss has seqno xxxx which is less than last record seqno yyyy in replication instance file zzzz

Severity:

Error

MUPIP Error:

This error is issued in one of two scenarios. The instance file consists of a sequence of history records that should correspond to increasing ranges of sequence numbers, so their starting sequence numbers must be in increasing order. In the first scenario, an attempt is made to append a history record whose starting sequence number is less than that of the last history record currently in the instance file; the source or receiver server issues this error, and ssss is the string New history record. In the second scenario, at journal pool creation time, the source server notices that the current seqno in the instance file header is less than the starting seqno of the last history record in the instance file; in this case ssss is the string Instance file header.

Action:

If this instance is not the root primary, this error can be handled by restoring both the database and the instance file from a previous backup (a consistent backup of the instance file and the database files taken together at the same time) and restarting the instance. After such a restore, all transactions since the last backup are sent across from this instance's primary. Alternatively, ship a copy of the database from any other instance (either the primary or any other secondary or tertiary), recreate the instance file, and start this instance as a secondary with the -updateresync qualifier. In either case, repeat the procedure on all instances (tertiary and so on) that descend from this instance, ensuring that for every primary-secondary instance pair, the secondary is not ahead of the primary in terms of journal seqno.

If this instance is the root primary, restoring from a prior backup may not be viable, as it may mean losing transactions that occurred after the backup. The alternative is to recreate the instance file on the root primary, ship a copy of the database from the primary, recreate the instance files on ALL secondaries (tertiaries and so on), and restart the secondaries with the -updateresync qualifier. In addition, report this occurrence to Fidelity GT.M Support.
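The two ordering checks described for REPLINSTSEQORD can be sketched as follows. This is a minimal model, not GT.M code: append_history and check_header are hypothetical names, and the instance file is reduced to a list of the starting seqnos of its history records.

```python
from typing import List


def _seqord(what: str, seqno: int, last: int, filename: str) -> ValueError:
    # Message layout follows the REPLINSTSEQORD text quoted in this section.
    return ValueError(
        f"REPLINSTSEQORD: {what} has seqno {seqno} which is less than "
        f"last record seqno {last} in replication instance file {filename}")


def append_history(start_seqnos: List[int], new_start: int,
                   filename: str = "example.repl") -> None:
    """First scenario: refuse a new record that starts before the last one."""
    if start_seqnos and new_start < start_seqnos[-1]:
        raise _seqord("New history record", new_start,
                      start_seqnos[-1], filename)
    start_seqnos.append(new_start)


def check_header(start_seqnos: List[int], header_seqno: int,
                 filename: str = "example.repl") -> None:
    """Second scenario: the header seqno must not precede the last record."""
    if start_seqnos and header_seqno < start_seqnos[-1]:
        raise _seqord("Instance file header", header_seqno,
                      start_seqnos[-1], filename)
```

The file name example.repl is a placeholder chosen for the sketch.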

REPLINSTSTNDALN

Could not get exclusive access to replication instance file xxxx

Severity:

Error

MUPIP Error:

This error is issued by MUPIP REPLIC -INSTANCE_CREATE if it finds that the replication instance file it is attempting to create already exists and is being used (the journal pool for that instance exists) by GT.M and/or MUPIP process(es).

Action:

Shut down all GT.M and/or MUPIP processes that are using the replication instance file and reissue the command. If it fails even though you are certain that no other GT.M or MUPIP process is accessing the replication instance file, delete the instance file and reissue the command.

REPLREQROLLBACK

Replication instance file xxxx indicates abnormal shutdown. Run MUPIP JOURNAL ROLLBACK first.

Severity:

Error

MUPIP Error:

This error is issued by MUPIP REPLIC -SOURCE -START if it is about to create the journal pool and finds that the replication instance file header indicates the journal pool was not cleanly shut down previously. In that case, the instance file may not correspond to the database and/or journals.

Action:

Run MUPIP JOURNAL -ROLLBACK to clean up the instance file, database and journal files before starting a source server on this instance.

REPLUPGRADEPRI

Attempted operation requires primary instance xxxx to support multi-site replication

Severity:

Error

MUPIP Error:

This error is issued if an attempt is made to start an active source server or activate a passive source server on a propagating primary instance while the receiver server on that instance is connected to a primary that has not yet been upgraded to the multi-site version of GT.M. This error is also issued when the receiver server on a propagating primary finds that the primary it connects to does not support multi-site replication and that there is at least one active source server running on the instance at that time.

Action:

An active source server cannot be running on a secondary instance at the same time that the receiver server on this instance is connected to a primary that does not support multi-site functionality. Upgrade the primary instance identified in the message to the version of GT.M that supports multi-site replication functionality and then start active source servers.

REPLUPGRADESEC

Attempted operation requires secondary instance xxxx to support multi-site replication

Severity:

Error

MUPIP Error:

This error is issued in three cases. 1) If a source server is currently connected to a dual-site secondary (i.e. a secondary running on a version of GT.M that does not support multi-site functionality), starting additional source servers issues this error. 2) If a source server finds more than one source server (active or passive) running on the same instance at the time it connects to a dual-site secondary, it issues this error. 3) On a propagating primary instance, a source server that connects to a dual-site tertiary instance issues this error at connection time.

Action:

Upgrade the secondary instance identified in the message to the version of GT.M that supports multi-site replication functionality and then start multiple source servers.
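The three REPLUPGRADESEC cases can be read as two checks, one at source server startup and one at connection time. The sketch below is one interpretation of the text, not the GT.M implementation; the SourceServer class and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceServer:
    secondary: str                       # name of the instance it serves
    connected_to_dual_site: bool = False


def _upgrade_error(name: str) -> RuntimeError:
    return RuntimeError(
        f"REPLUPGRADESEC: Attempted operation requires secondary instance "
        f"{name} to support multi-site replication")


def check_start(running: List[SourceServer]) -> None:
    """Case 1: no additional source server may start while an existing one
    is connected to a dual-site secondary."""
    for server in running:
        if server.connected_to_dual_site:
            raise _upgrade_error(server.secondary)


def check_connect(running: List[SourceServer], connecting: SourceServer,
                  secondary_is_dual_site: bool,
                  on_propagating_primary: bool) -> None:
    """Cases 2 and 3, evaluated when connecting (already in running)
    learns which kind of instance it has reached."""
    if not secondary_is_dual_site:
        return
    if on_propagating_primary or len(running) > 1:
        raise _upgrade_error(connecting.secondary)
    connecting.connected_to_dual_site = True
```

In this reading, a dual-site secondary is acceptable only when exactly one source server runs on a root primary, which is the configuration the bulletin supports during a rolling upgrade.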

SRCSRVEXISTS

Source server for secondary instance xxxx is already running with pid yyyy

Severity:

Error

MUPIP Error:

This error is issued by a source server startup command if there is already a source server up and running for the secondary instance name specified in the command.

Action:

Do not start multiple source servers for the same secondary instance.

SRCSRVNOTEXIST

Source server for secondary instance xxxx is not alive

Severity:

Error

MUPIP Error:

This error is issued by a mupip replic -source command that specifies any one of -activate, -changelog, -checkhealth, -deactivate, -shutdown, -statslog or -stopsourcefilter if it finds no source server up and running for the secondary instance name specified in the command.

Action:

Make sure the source server for the specified secondary instance name is up and running before attempting any of the above commands.

SRCSRVTOOMANY

Cannot start more than xxxx source servers in primary instance file yyyy

Severity:

Error

MUPIP Error:

A maximum of 16 active and/or passive source servers are allowed at any point in time per instance. If 16 source servers are already running and another source server startup is attempted, it will issue this error.

Action:

Shut down any active or passive source server to allow the new source server to start up.
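The startup limits behind SRCSRVEXISTS and SRCSRVTOOMANY can be sketched as follows. This is an illustration under assumptions: register_source_server is a hypothetical name, the running source servers are modeled as a dictionary from secondary instance name to pid, and the order of the two checks is a guess.

```python
from typing import Dict

MAX_SOURCE_SERVERS = 16  # limit stated in the SRCSRVTOOMANY description


def register_source_server(running: Dict[str, int], secondary: str,
                           pid: int, instance_file: str = "example.repl") -> None:
    """Record a new source server, enforcing one server per secondary
    instance and at most 16 servers per primary instance."""
    if secondary in running:
        raise RuntimeError(
            f"SRCSRVEXISTS: Source server for secondary instance {secondary} "
            f"is already running with pid {running[secondary]}")
    if len(running) >= MAX_SOURCE_SERVERS:
        raise RuntimeError(
            f"SRCSRVTOOMANY: Cannot start more than {MAX_SOURCE_SERVERS} "
            f"source servers in primary instance file {instance_file}")
    running[secondary] = pid
```

Removing an entry from the dictionary corresponds to shutting down a source server, after which a new one can register.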

JNLPOOLBADSLOT

Source server slot for secondary instance xxxx is in an inconsistent state. Pid = pppp, State = ssss, SlotIndex = iiii

Severity:

Warning

MUPIP Warning:

This is a debugging message sent to the syslog (operator log) whenever a source server startup or showbacklog command finds a structure in the journal pool holding inconsistent information.

Action:

Forward the information to GT.M Support. No other action is necessary. The source server command automatically fixes the inconsistency in that structure.

REPLINSTOPEN

Error opening replication instance file xxxx

Severity:

Error

Runtime Error:

There was an error when GT.M or MUPIP tried to open the replication instance file. The error detail accompanies this message.

Action:

Look at the accompanying error detail. Possible causes are file permissions, system quotas, etc. Fix the cause if possible. If not, report the error to GT.M Support along with the error detail.
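When the accompanying error detail is not at hand, the underlying system error can often be reproduced by opening the file directly as the same user. The sketch below is a generic diagnostic, not part of GT.M; diagnose_open is a hypothetical helper, and opening the file read-write is an assumption about the access the servers need.

```python
import errno
import os


def diagnose_open(path: str) -> str:
    """Try to open path read-write and describe the outcome. On failure,
    return the symbolic errno name and the system's message for it."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return f"{name}: {exc.strerror}"
    os.close(fd)
    return "ok"
```

A result beginning with EACCES points at file permissions, while ENOENT means the path does not resolve to an existing file.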

REPLINSTCLOSE

Error closing replication instance file xxxx

Severity:

Error

Runtime Error:

There was an error when GT.M or MUPIP tried to close the replication instance file. The error detail accompanies this message.

Action:

Look at the accompanying error detail. Possible causes are file permissions, system quotas, etc. Fix the cause if possible. If not, report the error to GT.M Support along with the error detail.

REPLINSTREAD

Error reading xxxx bytes at offset yyyy from replication instance file ffff

Severity:

Error

Runtime Error:

There was an error when GT.M or MUPIP tried to read from the replication instance file. The error detail accompanies this message.

Action:

Look at the accompanying error detail. Possible causes are file permissions, system quotas, etc. Fix the cause if possible. If not, report the error to GT.M Support along with the error detail.

REPLINSTWRITE

Error writing xxxx bytes at offset yyyy from replication instance file ffff

Severity:

Error

Runtime Error:

There was an error when GT.M or MUPIP tried to write to the replication instance file. The error detail accompanies this message.

Action:

Look at the accompanying error detail. Possible causes are file permissions, system quotas, etc. Fix the cause if possible. If not, report the error to GT.M Support along with the error detail.

REPLINSTFSYNC

Error fsyncing replication instance file xxxx

Severity:

Error

Runtime Error:

There was an error when GT.M or MUPIP tried to sync the replication instance file (using the fsync() call). The error detail accompanies this message.

Action:

Look at the accompanying error detail. Possible causes are file permissions, system quotas, etc. Fix the cause if possible. If not, report the error to GT.M Support along with the error detail.

REPLINSTCREAT

Error creating replication instance file xxxx

Severity:

Error

Runtime Error:

There was an error when GT.M or MUPIP tried to create the replication instance file. The error detail accompanies this message.

Action:

Look at the accompanying error detail. Possible causes are file permissions, system quotas, etc. Fix the cause if possible. If not, report the error to GT.M Support along with the error detail.

Typographical Conventions

Command Syntax: UNIX syntax (i.e., lowercase text and "-" for flags/qualifiers) is used throughout this document. OpenVMS accepts both lowercase and uppercase text; flags/qualifiers on OpenVMS should be preceded with "/".

Reference Number: The reference numbers used to track software enhancements and customer support requests appear in parentheses ( ).

Platform Identifier: If a new feature or software enhancement does not apply to all platforms, the relevant platform or platforms appear in brackets [ ].

For more information, see the GT.M web site.