Oracle Application Server 10g High Availability Guide 10g (9.0.4) Part Number B10495-01
Disaster recovery refers to how a system recovers from catastrophic site failures caused by natural or unnatural disasters. Examples of catastrophic failures include earthquakes, tornadoes, floods, or fire. Additionally, disaster recovery can also refer to how a system is managed for planned outages. For most disaster recovery situations, the solution involves replicating an entire site, not just pieces of hardware or subcomponents. This also applies to the Oracle Application Server Disaster Recovery (OracleAS Disaster Recovery) solution.
This chapter describes the OracleAS Disaster Recovery solution, how to configure and set up its environment, and how to manage the solution for high availability. The discussion involves both OracleAS middle and Infrastructure tiers in two sites: production and standby. The standby site is configured identically to the production site. Under normal operation, the production site actively services requests. The standby site is maintained to mirror the applications and content hosted by the production site.
The sites are synchronized using the Oracle Application Server Backup and Recovery Tool (for configuration files in the file system) and Oracle Data Guard (for the Infrastructure database). The following table provides a summary of the OracleAS Disaster Recovery strategy:
In addition to the recovery strategies, configuration and installation of both sites are discussed. For these tasks, two different ways of naming the middle tier nodes are covered as well as two ways of resolving hostnames intra-site and inter-site.
With OracleAS Disaster Recovery, planned outages of the production site can be performed without interruption of service by switching over to the standby site. Unplanned outages are managed by failing over to the standby site. Procedures for switchover and failover are covered in this chapter.
This chapter is organized into the following main sections:
The Oracle Application Server Disaster Recovery solution consists of two identically configured sites - one primary (production/active) and one secondary (standby). Both sites have the same number of middle tier and Infrastructure nodes and the same number and types of components installed. In other words, the installations on both sites, middle tier and Infrastructure are identical. Both sites are usually dispersed geographically, and if so, they are connected via a wide area network.
This section describes the overall layout of the solution, the major components involved, and the configuration of these components. It has the following sections:
Before describing the OracleAS Disaster Recovery solution in detail, several terms used in this chapter require clear definition so that the concepts described here can be understood properly.
For the purpose of discussion in this chapter, a differentiation is made between the terms physical hostname and logical hostname. Physical hostname is used to refer to the "internal name" of the current machine. In UNIX, this is the name returned by the command hostname.
Physical hostname is used by Oracle Application Server components that are installed on the current machine to reference the local host. During the installation of these components, the installer retrieves the physical hostname from the current machine and stores it in Oracle Application Server configuration metadata on disk.
Logical hostname is a name assigned to an IP address, either through the /etc/hosts file (in UNIX) or through DNS resolution. This name is visible on the network to which the host it refers to is connected. Often, the logical hostname and the physical hostname are literally identical. However, their usage in the OracleAS Disaster Recovery solution requires that they be clearly distinguished.
Virtual hostname is used to refer to the name for the Infrastructure host that is specified in the Specify High Availability screen of the OracleAS installer. The virtual hostname is used by the middle tier and Infrastructure components to access the Infrastructure, regardless of whether the Infrastructure is a single node installation or part of the OracleAS Cold Failover Cluster solution. Virtual hostname, as used in this chapter, applies only to the Infrastructure host(s).
To ensure that your implementation of the OracleAS Disaster Recovery solution performs as designed, the following requirements need to be adhered to:
Figure 6-1 depicts the topology of the OracleAS Disaster Recovery solution.
The procedures and steps for configuring and operating the OracleAS Disaster Recovery solution support from 1 to n middle tier installations in the production site. The same number of middle tier installations must exist in the standby site, and the middle tiers must mirror each other in the production and standby sites.
For the Infrastructure, a uniform number of installations is not required between the production and standby sites. For example, the Oracle Application Server Cold Failover Clusters solution can be deployed in the production site, and a single node installation of the Infrastructure can be deployed in the standby site. This way, the production site's Infrastructure has protection from host failure using an OracleAS Cold Failover Cluster. Refer to the section "Oracle Application Server Cold Failover Cluster" in Chapter 3 for more information on OracleAS Cold Failover Clusters.
The following are important characteristics of the OracleAS Disaster Recovery solution:
Prior to the installation of OracleAS software for the OracleAS Disaster Recovery solution, a number of system level configurations are required. The tasks that accomplish these configurations are:
This section covers the steps needed to perform these tasks.
Before performing the steps to set up the physical and logical hostnames, plan the physical and logical hostnames you wish to use with respect to the entire OracleAS Disaster Recovery solution. The overall approach to planning and assigning hostnames is to meet the following goals:
For example, if a middle tier component in the production site uses the name "asmid1" to reach a host in the same site, the same component in the standby site can use the same name to reach asmid1's equivalent peer in the standby site.
Although the physical hostnames in the production and standby sites must remain uniform between the two sites, the resolution of these physical hostnames to the correct hosts can be different. The section "Configuring Hostname Resolution" explains more on hostname resolution.
Note:
To illustrate what should be done to plan and assign hostnames, let us use an example as shown in Figure 6-2.
In Figure 6-2, two middle tier nodes exist in the production site. The Infrastructure can be a single node or an OracleAS Cold Failover Cluster solution (represented by a single virtual hostname and a virtual IP, as for a single node Infrastructure). The common names in the two sites are the physical hostnames of the middle tier nodes and the virtual hostname of the Infrastructure. Table 6-2 below details what the physical, logical, and virtual hostnames are in the example:
Physical Hostnames | Virtual Hostname | Logical Hostnames
---|---|---
asmid1 | - | prodmid1, standbymid1
asmid2 | - | prodmid2, standbymid2
- | infra | prodinfra, standbyinfra
If the hosts in the production site are running non-OracleAS applications and you wish to co-host OracleAS on the same hosts, changing the physical hostnames of these hosts may break those applications. In such a case, you can keep these hostnames in the production site and instead change the physical hostnames of the standby hosts to match those in the production site. The non-OracleAS applications can then also be installed on the standby hosts so that they can act in a standby role for these applications.
As explained in the section "Terminology", physical, logical, and virtual hostnames have differing purposes in the OracleAS Disaster Recovery solution. They are also set up differently. Information on how the three types of hostnames are set up follows.
The naming of hosts in both the production and standby sites requires changing the physical hostname of each host.
In Solaris, to change the physical hostname of a host:

Check the current physical hostname:

prompt> hostname

Use a text editor, such as vi, to edit the name in /etc/nodename to your planned physical hostname.
The logical hostnames used in the OracleAS Disaster Recovery solution are defined in DNS. These hostnames are visible in the network that the solution uses and are resolved through DNS to the appropriate hosts via the assigned IP address in the DNS system. You need to add these logical hostnames and their corresponding IP addresses to the DNS system.
Using the example in Figure 6-2, the following should be the additions made to the DNS system serving the entire network that encompasses the production and standby sites:
prodmid1.oracle.com IN A 123.1.2.333
prodmid2.oracle.com IN A 123.1.2.334
prodinfra.oracle.com IN A 123.1.2.111
standbymid1.oracle.com IN A 213.2.2.443
standbymid2.oracle.com IN A 213.2.2.444
standbyinfra.oracle.com IN A 213.2.2.210
As defined in the Terminology section, the virtual hostname applies to the Infrastructure only. It is specified during installation of the Infrastructure. When you run the Infrastructure installation type, a screen called "Specify High Availability" appears with a text box in which to enter the virtual hostname of the Infrastructure being installed. Refer to the Oracle Application Server 10g Installation Guide for more details.
For the example in Figure 6-2, when you install the production site's Infrastructure, enter its virtual hostname, "infra", when you see the Specify High Availability screen. Enter the same virtual hostname when you install the standby site's Infrastructure.
In the Oracle Application Server Disaster Recovery solution, one of two ways of hostname resolution can be configured to resolve the hostnames you planned and assigned in the previous section. These are:
In UNIX, the order of the method of name resolution can be specified using the "hosts" parameter in the file /etc/nsswitch.conf. The following is an example of the hosts entry:
hosts: files dns nis
In the above statement, local hostnaming file resolution is preferred over DNS and NIS (Network Information Service) resolution. When a hostname needs to be resolved to an IP address, the /etc/hosts file is consulted first. In the event that a hostname cannot be resolved using local hostnaming resolution, DNS is used. (NIS resolution is not used for the OracleAS Disaster Recovery solution.) Refer to your UNIX system's documentation if you wish to find out more about /etc/nsswitch.conf.
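To confirm which resolution order a host is actually using, the hosts line can be extracted from /etc/nsswitch.conf with a one-line awk filter. The sketch below runs against a sample file written to a temporary location so it is self-contained (an assumption for illustration); on a real host, point CONF at /etc/nsswitch.conf instead.

```shell
#!/bin/sh
# Sketch: report the hostname-resolution order configured in an nsswitch.conf file.
# A sample file is created in a temp location so the sketch is self-contained;
# on a real host, set CONF=/etc/nsswitch.conf instead.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
passwd: files
hosts:  files dns nis
EOF
# Strip the "hosts:" key and surrounding whitespace, keeping only the source list.
order=$(awk '$1 == "hosts:" { sub(/^hosts:[ \t]*/, ""); print }' "$CONF")
echo "resolution order: $order"
rm -f "$CONF"
```

For the OracleAS Disaster Recovery solution with local hostnaming resolution, the first word reported should be "files".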
This method of hostname resolution relies on a local hostnaming file to contain the requisite hostname-to-IP address mappings. In UNIX, this file is /etc/hosts.
To use the local hostnaming file to resolve hostnames for the OracleAS Disaster Recovery solution in UNIX, for each middle tier and Infrastructure host in both the production and standby sites, perform the following:
Use a text editor, such as vi, to edit the /etc/nsswitch.conf file. With the "hosts:" parameter, specify "files" as the first choice for hostname resolution.

Edit the /etc/hosts file to include the following:
For example, if you are editing the /etc/hosts file of a middle tier node in the production site, enter all the middle tier physical hostnames and their IP addresses in the production site, beginning the list with the current host. (Note that you should also include fully qualified hostnames in addition to the abbreviated hostnames. See Table 6-3.)
For example, if you are editing the /etc/hosts of a middle tier node in the standby site, enter the virtual hostname, fully qualified and abbreviated, and IP address of the Infrastructure host in the standby site.
For the example in Figure 6-2, on the asmid1 host, use the following commands in UNIX:
ping asmid1
The returned IP address should be 123.1.2.333.
ping asmid2
The returned IP address should be 123.1.2.334.
ping infra
The returned IP address should be 123.1.2.111.
Using the example in Figure 6-2, Table 6-3 contains the required entries in the /etc/hosts file of each host.
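The /etc/hosts entries for one site can be generated mechanically from the planned name-to-IP mapping. The sketch below uses the production-site names and IPs from the Figure 6-2 example (including the oracle.com domain from the chapter's DNS example) and writes to a temporary file rather than /etc/hosts itself, so it can be run safely anywhere.

```shell
#!/bin/sh
# Sketch: generate production-site /etc/hosts entries for the Figure 6-2 example.
# Writes to a temp file rather than /etc/hosts so the sketch can be run safely;
# the IPs and the oracle.com domain come from the chapter's example network.
DOMAIN=oracle.com
HOSTS_FILE=$(mktemp)
cat > "$HOSTS_FILE" <<EOF
123.1.2.333   asmid1.$DOMAIN  asmid1
123.1.2.334   asmid2.$DOMAIN  asmid2
123.1.2.111   infra.$DOMAIN   infra
EOF
cat "$HOSTS_FILE"
```

Note that each line carries both the fully qualified and the abbreviated hostname, as required by Table 6-3. On the standby hosts, the same names would be mapped to the standby-site IP addresses instead.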
To set up the OracleAS Disaster Recovery solution to use DNS hostname resolution, site-specific DNS servers must be set up in the production and standby sites in addition to the overall corporate DNS servers (usually more than one DNS server exists in a corporate network for redundancy). Figure 6-3 provides an overview of this setup.
See Also:
Appendix A, "Setting Up a DNS Server" for instructions on how to set up a DNS server in UNIX |
For the above topology to work, the following requirements and assumptions are made:
The /etc/hosts file in each host does not contain any entries for the physical, logical, or virtual hostnames of any host in either site.
To set up the OracleAS Disaster Recovery solution for DNS resolution:
prodmid1.oracle.com IN A 123.1.2.333
prodmid2.oracle.com IN A 123.1.2.334
prodinfra.oracle.com IN A 123.1.2.111
standbymid1.oracle.com IN A 213.2.2.443
standbymid2.oracle.com IN A 213.2.2.444
standbyinfra.oracle.com IN A 213.2.2.210
Assume "oracleas" is used as the domain name for the two sites in Figure 6-2. The high level corporate domain name is oracle.com.
For the example in Figure 6-2, the entries are as follows:
For the production site's DNS:
asmid1.oracleas IN A 123.1.2.333
asmid2.oracleas IN A 123.1.2.334
infra.oracleas IN A 123.1.2.111
For the standby site's DNS:
asmid1.oracleas IN A 213.2.2.443
asmid2.oracleas IN A 213.2.2.444
infra.oracleas IN A 213.2.2.210
Note:
If you are using the OracleAS Cold Failover Cluster solution for the Infrastructure in either site, enter the cluster's virtual hostname and virtual IP address, as in the previous step above.
Because Oracle Data Guard technology is used to synchronize the production and standby Infrastructure databases, the production Infrastructure must be able to reference the standby Infrastructure and vice versa.
For this to work, the IP address of the standby Infrastructure host must be entered in the production site's DNS server with a unique hostname with respect to the production site. Similarly, the IP address of the production Infrastructure host must be entered in the standby site's DNS server with the same hostname. The reason for these DNS entries is that Oracle Data Guard uses TNS Names to direct requests to the production and standby Infrastructures. Hence, the appropriate entries must be made to the tnsnames.ora file as well.
Using the example in Figure 6-2 and assuming that the selected name for the remote Infrastructure is "remoteinfra", the entries in the DNS server in the production site are:
asmid1.oracleas IN A 123.1.2.333
asmid2.oracleas IN A 123.1.2.334
infra.oracleas IN A 123.1.2.111
remoteinfra.oracleas IN A 213.2.2.210
And, for the standby site, its DNS server should have the following entries:
asmid1.oracleas IN A 213.2.2.443
asmid2.oracleas IN A 213.2.2.444
infra.oracleas IN A 213.2.2.210
remoteinfra.oracleas IN A 123.1.2.111
Oracle Data Guard sends redo data across the network to the standby system using Oracle Net. SSH tunneling should be used with Oracle Data Guard as an integrated way to encrypt and compress the redo data before it is transmitted by the production system, and to decrypt and uncompress it when it is received by the standby system.
This section provides an overview of the steps for installing the OracleAS Disaster Recovery solution. After following the instructions in the previous section to set up the environment for the solution, go through this section for an overview of the installation process. Thereafter, follow the detailed instructions in the Oracle Application Server 10g Installation Guide to install the solution.
The following is the overall sequence for installing the OracleAS Disaster Recovery solution:
Note the following important points when you perform the installation:
The asmid1 host in the standby site must use the same ports as the asmid1 host in the production site. Utilize a static ports definition file for this purpose (see note above and the next point).
For each middle tier host, use the following syntax:
runInstaller oracle.iappserver.iapptop:s_staticPorts=staticports.ini
For each Infrastructure host, use the following syntax:
runInstaller oracle.iappserver.infrastructure:s_staticPorts=staticports.ini
For OracleAS Disaster Recovery purposes, the metadata information maintained within the Infrastructure database is kept in synchronization by utilizing Oracle Data Guard technology. This technology propagates all database changes at the production site to the standby site for disaster tolerance.
Note that for OracleAS Disaster Recovery, archive logs are shipped from the production Infrastructure database to the standby Infrastructure database but are not applied. The application of these logs has to be done in coordination with the synchronization of file system configuration information, which is discussed in the section "Backing Up Configuration Files (Infrastructure and Middle Tier)".
The setup of Oracle Data Guard for OracleAS Disaster Recovery involves the following steps:
By default, the production database does not have ARCHIVELOG mode enabled. However, it needs to be in ARCHIVELOG mode in order to ship archive logs to the standby database. The default destination directory for archive logs is:
<INFRA_ORACLE_HOME>/dbs/arch/
To enable ARCHIVELOG mode:
Ensure that the ORACLE_HOME and ORACLE_SID (the default is asdb) environment variables are properly set.
<ORACLE_HOME>/bin/emctl stop iasconsole
<ORACLE_HOME>/opmn/bin/opmnctl stopall
<ORACLE_HOME>/bin/emctl status iasconsole
Verify that ARCHIVELOG mode is not yet enabled:
<ORACLE_HOME>/bin/sqlplus /nolog
SQL> connect sys/<password> as sysdba
SQL> archive log list
Database log mode              No Archive Mode
Automatic archival             Disabled
Archive destination            /private/oracle/oracleas/dbs/arch
Oldest online log sequence     4
Current log sequence           6
SQL> shutdown immediate
SQL> startup mount;
Enable ARCHIVELOG mode:
SQL> alter database archivelog;
SQL> alter system set log_archive_start=true scope=spfile;
SQL> alter system set LOG_ARCHIVE_DEST_1='LOCATION=/private/oracle/oracleas/oradata MANDATORY' SCOPE=BOTH;
SQL> shutdown
SQL> connect sys/<password> as sysdba
SQL> startup
Verify that the database is now in ARCHIVELOG mode. Execute the following command and verify that Database log mode is Archive Mode and Automatic archival is Enabled.
SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /private/oracle/oracleas/oradata
Oldest online log sequence     4
Next log sequence to archive   6
Current log sequence           6
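A scripted check of the archive log list output can catch a database that is still in No Archive Mode. This sketch parses the sample output shown above (inlined as a string so the sketch runs standalone); in practice you would pipe spooled SQL*Plus output into the same checks.

```shell
#!/bin/sh
# Sketch: verify ARCHIVELOG mode from "archive log list" output.
# The sample output from the text is inlined; pipe real sqlplus output in instead.
OUT='Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /private/oracle/oracleas/oradata'
# "No Archive Mode" has "No" between the key and "Archive", so this pattern
# (spaces only between them) matches only the enabled state.
if echo "$OUT" | grep -Eq 'Database log mode +Archive Mode' &&
   echo "$OUT" | grep -Eq 'Automatic archival +Enabled'; then
  STATUS=ok
else
  STATUS=not-archivelog
fi
echo "archivelog check: $STATUS"
```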
On the production database, query the V$DATAFILE view to list the files that will be used to create the physical standby database, as follows:
SQL> SELECT NAME FROM V$DATAFILE;

NAME
--------------------------------------------------------------------------------
/private/oracle/oracleas/oradata/asdb/system01.dbf
/private/oracle/oracleas/oradata/asdb/undotbs01.dbf
/private/oracle/oracleas/oradata/asdb/drsys01.dbf
/private/oracle/oracleas/oradata/asdb/dcm.dbf
/private/oracle/oracleas/oradata/asdb/portal.dbf
.
.
.
24 rows selected.
On the production database, perform the following steps to make a closed backup copy of the production database:
SQL> SHUTDOWN IMMEDIATE;
Copy the datafiles to the standby directory using the cp command (ORACLE_HOME is /private/oracle/oracleas):
> mkdir /private/standby
> cp /private/oracle/oracleas/oradata/asdb/system01.dbf /private/standby/system01.dbf
> cp /private/oracle/oracleas/oradata/asdb/undotbs01.dbf /private/standby/undotbs01.dbf
> cp /private/oracle/oracleas/oradata/asdb/drsys01.dbf /private/standby/drsys01.dbf
> cp /private/oracle/oracleas/oradata/asdb/dcm.dbf /private/standby/dcm.dbf
> cp /private/oracle/oracleas/oradata/asdb/portal.dbf /private/standby/portal.dbf
.
.
.
SQL> STARTUP;
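Rather than issuing one cp per datafile, the copy can be looped over every file the V$DATAFILE query returned. The sketch below uses temporary stand-in directories and empty placeholder .dbf files so it can be run anywhere; on the production host you would substitute the real /private/oracle/oracleas/oradata/asdb and /private/standby paths and drop the placeholder files.

```shell
#!/bin/sh
# Sketch: copy all datafiles to a standby staging directory, failing fast on error.
# SRC/DST are temp stand-ins for /private/oracle/oracleas/oradata/asdb and
# /private/standby; the .dbf files are empty placeholders, not real datafiles.
SRC=$(mktemp -d)
DST=$(mktemp -d)
touch "$SRC/system01.dbf" "$SRC/undotbs01.dbf" "$SRC/dcm.dbf"
for f in "$SRC"/*.dbf; do
  cp "$f" "$DST/" || { echo "copy failed: $f" >&2; exit 1; }
done
COPIED=$(ls "$DST" | wc -l)
echo "copied $COPIED datafiles"
```

Failing fast matters here: a partially copied set of datafiles would produce an unusable standby database, and the error message identifies exactly which file was missed.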
On the production database, create the control file for the standby database, as shown in the following example:
SQL> alter database create standby controlfile as '/private/standby/asdb.ctl';
The filename for the newly created standby control file must be different from the filename of the current control file of the production database. In the example above, it is created in the new temporary directory for the standby. The control file must also be created after the last time stamp for the backup datafiles.
Create a traditional text initialization parameter file from the server parameter file used by the production database; a traditional text initialization parameter file can be copied to the standby location and modified. For example:
SQL> create pfile='/private/standby/initasdb.ora' from spfile;
Later, in the section "Create a Server Parameter File for the Standby Database" and thereafter, you will convert this file back to a server parameter file after it is modified to contain the parameter values appropriate for use with the physical standby database.
On the production system, use an operating system copy utility to copy the following binary files from the production system to the standby system:
On the standby system, shut down the OracleAS processes using opmnctl and emctl as specified in the section "Enable ARCHIVELOG Mode for Production Database".
standby> rm /private/oracle/oracleas/dbs/spfileasdb.ora
standby> rm /private/oracle/oracleas/dbs/orapwasdb
standby> rm /private/oracle/oracleas/oradata/asdb/*.*
standby> ls -l initasdb.ora
lrwxrwxrwx 1 nedcias svrtech 54 Nov 10 09:25 initasdb.ora -> /private/oracle/oracleas/admin/asdb/pfile/initasdb.ora
standby> rm /private/oracle/oracleas/admin/asdb/pfile/initasdb.ora
production> cd /private/standby
production> cp initasdb.ora /net/standby/private/oracle/oracleas/admin/asdb/pfile/initasdb.ora
production> cp asdb.ctl /net/standby/private/oracle/oracleas/oradata/asdb/control01.ctl
production> cp asdb.ctl /net/standby/private/oracle/oracleas/oradata/asdb/control02.ctl
production> cp asdb.ctl /net/standby/private/oracle/oracleas/oradata/asdb/control03.ctl
production> cp *.dbf /net/standby/private/oracle/oracleas/oradata/asdb/.
Although most of the initialization parameter settings in the text initialization parameter file that you copied from the production system are also appropriate for the physical standby database, some modifications need to be made.
The following steps detail the parameters to modify or add to the standby initialization parameter file:
Edit the initasdb.ora file that was copied over from the production system, modifying or adding the following parameters:
*.standby_archive_dest - Specify the location of the archived redo logs that will be received from the production database.

*.standby_file_management - Set to AUTO.

*.remote_archive_enable - Set to TRUE.
For example:
*.standby_archive_dest='/private/oracle/oracleas/standby/'
*.standby_file_management=AUTO
*.remote_archive_enable=TRUE
Create the directory specified by the standby_archive_dest parameter:
Standby> mkdir /private/oracle/oracleas/standby
Ensure that the ORACLE_HOME and ORACLE_SID (the default is asdb) environment variables are properly set. For example:
Standby> setenv ORACLE_HOME /private/oracle/oracleas
Standby> setenv ORACLE_SID asdb
If the standby system is running on a Windows system, use the ORADIM utility to create a Windows Service. For example:
WINNT> oradim -NEW -SID payroll2 -STARTMODE manual
A new password file has to be created on the standby system. The following is an example:
standby> cd /private/oracle/oracleas/dbs
standby> /private/oracle/oracleas/bin/orapwd file=orapwasdb password=<passwd>
On both the production and standby sites, Oracle Net Manager can be used to configure a listener for the respective databases. This task is completed during the installation of the Infrastructure, when the listeners are configured and started. The following is the configuration file that maintains the listener configuration information:
<ORACLE_HOME>/network/admin/listener.ora
Any modifications to this file require that the listeners be restarted using the following commands:
% lsnrctl stop
% lsnrctl start
Enable dead connection detection by setting the SQLNET.EXPIRE_TIME
parameter to 2 in the SQLNET.ORA
parameter file on the standby system. For example:
SQLNET.EXPIRE_TIME=2
On both the production and standby systems, use Oracle Net Manager to create a network service name for the production and standby databases that is to be used by log transport services.
The Oracle Net service name must resolve to a connect descriptor that uses the same protocol, host address, port, and SID that are specified in the listener configuration file listener.ora
. The connect descriptor must also specify that a dedicated server be used.
The following steps illustrate how the above should be set up:
On each node, add an entry to point at the local database as well as the remote copy. Execute the TNSPING command on both nodes to confirm that the entries in the TNSNAMES.ORA file are correct.
The following is an example entry in the TNSNAMES.ORA file on the production database host that points to the standby database (the standby hostname in this example is standby.oracle.com):
ASDB_REMOTE =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = standby.oracle.com)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = asdb.oracle.com)
    )
  )
TNSPING the remote host to verify that it can be reached:
production> /private/oracle/oracleas/bin/tnsping asdb_remote
The following is an example entry in the TNSNAMES.ORA file on the standby host that points to the production database (the production host in this example is assumed to be production.oracle.com):
ASDB_REMOTE =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = production.oracle.com)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = asdb.oracle.com)
    )
  )
TNSPING the production host to verify that it can be reached:
standby> /private/oracle/oracleas/bin/tnsping asdb_remote
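The two tnsping checks can be wrapped in a small reachability test that reports success or failure explicitly instead of relying on a visual read of the output. In the sketch below, tnsping is replaced by a stub shell function so the sketch is runnable anywhere; on a real host, delete the stub so the actual $ORACLE_HOME/bin/tnsping binary on the PATH is invoked.

```shell
#!/bin/sh
# Sketch: fail fast if an Oracle Net alias is unreachable via tnsping.
# The stub function below stands in for the real $ORACLE_HOME/bin/tnsping binary
# so the sketch is self-contained; remove it to test a real listener.
tnsping() { echo "OK (10 msec)"; return 0; }
ALIAS=asdb_remote
if tnsping "$ALIAS" >/dev/null 2>&1; then
  RESULT=reachable
else
  RESULT=unreachable
fi
echo "$ALIAS: $RESULT"
```

Running this on both the production and standby hosts before proceeding confirms that log transport services will be able to connect in both directions.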
On an idle standby database, use the SQLPLUS CREATE
statement to create a server parameter file for the standby database from the text initialization parameter file that was edited in the section "Set Initialization Parameters for the Physical Standby Database". For example:
Ensure that the ORACLE_HOME and ORACLE_SID (the default is asdb) environment variables are properly set. For example:
Standby> setenv ORACLE_HOME /private/oracle/oracleas
Standby> setenv ORACLE_SID asdb
<ORACLE_HOME>/bin/sqlplus /nolog
SQL> connect sys/<password> as sysdba
SQL> create spfile='/private/oracle/oracleas/dbs/spfileasdb.ora' from pfile='/private/oracle/oracleas/dbs/initasdb.ora';
On the standby database, issue the following SQLPLUS statements to start and mount the database in standby mode:
SQL> STARTUP NOMOUNT;
SQL> ALTER DATABASE MOUNT STANDBY DATABASE;
This section describes the minimum amount of work you must do on the production database to set up and enable archiving to the physical standby database.
To configure archive logging from the production database to the standby site, the LOG_ARCHIVE_DEST_n and LOG_ARCHIVE_DEST_STATE_n parameters must be defined. The service name used must be the same as that set up in the "Create Oracle Net Service Names" section.
The following statements executed on the production database set the initialization parameters needed to enable archive logging to the standby site:
SQL> alter system set log_archive_dest_2='SERVICE=asdb_remote' scope=both;
SQL> alter system set log_archive_dest_state_2=enable scope=both;
Archiving of redo logs to the remote standby location does not occur until after a log switch. A log switch occurs, by default, when an online redo log becomes full.
To force the current redo logs to be archived immediately, use the SQLPLUS ALTER SYSTEM statement on the production database:
SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;
Once you create the physical standby database and set up log transport services, you should verify that database modifications are being successfully shipped from the production database to the standby database.
To see the new archived redo logs that were received on the standby database, you should first identify the existing archived redo logs on the standby database, archive a few logs on the production database, and then check the standby database again. The following steps illustrate how to perform these tasks:
On the standby database, query the V$ARCHIVED_LOG view to identify existing archived redo logs:
SQL> SELECT SEQUENCE#, FIRST_TIME, NEXT_TIME FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

 SEQUENCE# FIRST_TIME         NEXT_TIME
---------- ------------------ ------------------
         8 11-JUL-02 17:50:45 11-JUL-02 17:50:53
         9 11-JUL-02 17:50:53 11-JUL-02 17:50:58
        10 11-JUL-02 17:50:58 11-JUL-02 17:51:03

3 rows selected.
On the production database, archive the current log using the following SQLPLUS statement:
SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;
On the standby database, query the V$ARCHIVED_LOG view to verify that the redo log has been received, using the following statement:
SQL> SELECT SEQUENCE#, FIRST_TIME, NEXT_TIME FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

 SEQUENCE# FIRST_TIME         NEXT_TIME
---------- ------------------ ------------------
         8 11-JUL-02 17:50:45 11-JUL-02 17:50:53
         9 11-JUL-02 17:50:53 11-JUL-02 17:50:58
        10 11-JUL-02 17:50:58 11-JUL-02 17:51:03
        11 11-JUL-02 17:51:03 11-JUL-02 18:34:11

4 rows selected.
The logs have now been shipped and are available on the standby database. To confirm that they are there, list the contents of the directory <ORACLE_HOME>/standby.
On the standby database, query the V$ARCHIVED_LOG view to verify that the archived redo log has not been applied:
SQL> SELECT SEQUENCE#, APPLIED FROM V$ARCHIVED_LOG ORDER BY SEQUENCE#;

 SEQUENCE# APP
---------- ---
         8 NO
         9 NO
        10 NO
        11 NO

4 rows selected.
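Since OracleAS Disaster Recovery intentionally leaves the shipped logs unapplied, a scripted count of APPLIED=NO rows makes a convenient sanity check. The sketch below parses the sample rows from the query above (inlined via a here-document); in practice, the spooled SQL*Plus output would be fed in instead.

```shell
#!/bin/sh
# Sketch: count unapplied archived logs in V$ARCHIVED_LOG query output.
# The sample rows from the text are inlined; feed spooled sqlplus output instead.
UNAPPLIED=$(awk '$2 == "NO" { n++ } END { print n+0 }' <<'EOF'
 SEQUENCE# APP
---------- ---
         8 NO
         9 NO
        10 NO
        11 NO
EOF
)
echo "unapplied logs: $UNAPPLIED"
```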
Once Oracle Data Guard has been set up between the production and standby sites, the procedure for synchronizing the two sites can be carried out. An initial synchronization should be done, before the production site is used, in order to obtain a baseline snapshot of the post-installation production site onto the standby site. This baseline can then be used to recover the production site configuration on the standby site if needed later.
In order to obtain a consistent point-in-time snapshot of the production site, the information stored in the Infrastructure database and the Oracle Application Server-related configuration files in the middle tier and Infrastructure hosts must be synchronized at the same time. Synchronization of the configuration files can be done by backing up the files and restoring them on the standby hosts using the Oracle Application Server Backup and Recovery Tool. For the Infrastructure database, synchronization is done using Oracle Data Guard by shipping the archive logs to the standby Infrastructure and applying these logs in coordination with the restoration of the configuration files.
The sequence of steps for the baseline synchronization (which can also be used for future synchronizations) is:
These steps are detailed in the following two main sections.
The main strategy and approach to synchronizing configuration information between the production and standby sites is to synchronize the backup of Infrastructure and middle tier configuration files with the application of log information on the standby Infrastructure database.
For Oracle Application Server, not all the configuration information is in the Infrastructure database. The backup of the database files needs to be kept synchronized with the backup of the middle tier and Infrastructure configuration files. Due to this, log-apply services should not be enabled on the standby database. The log files from the production Infrastructure are shipped to the standby Infrastructure but are not applied.
The backup process of the production site involves backing up the configuration files in the middle tier and Infrastructure nodes. Additionally, the archive logs for the Infrastructure database are shipped to the standby site.
The procedures to perform the backups and the log ship are discussed in the following sections:
At a minimum, the backup and restoration steps discussed in this section and the "Restoring to Standby Site" section should be performed whenever there is any administrative change in the production site (including changes to the Infrastructure database and configuration files on the middle tier and Infrastructure nodes). In addition, scheduled regular backups and restorations should be done (for example, on a daily or twice-weekly basis). See the Oracle Application Server 10g Administrator's Guide for more backup and restore procedures.
Note:
After installing the OracleAS Disaster Recovery solution, Oracle Data Guard should have been installed in both the production and standby databases. The steps for shipping the archive logs from the production Infrastructure database to the standby Infrastructure database involve configuring Oracle Data Guard and executing several commands for both the production and standby databases. Execute the following steps to ship the logs for the Infrastructure database:
On the standby Infrastructure database, cancel managed recovery (if it is running) so that shipped logs are not applied automatically:

SQL> alter database recover managed standby database cancel;
On the production Infrastructure database, force a log switch so that the current online redo log is archived and shipped to the standby:

SQL> alter system switch logfile;
Still on the production database, obtain the SCN of the current log:

SQL> select first_change# from v$log where status='CURRENT';
An SCN, or sequence number, is returned, which essentially represents the timestamp of the transported log.
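The returned SCN should be recorded so it can be supplied later to the "recover ... until change &lt;SCN&gt;" statement at the standby. A minimal sketch, assuming the query output has been spooled to a file from SQL*Plus (the spool file and SCN value here are simulated for the demo):

```shell
# Sketch: capture the SCN from a spooled copy of the v$log query so it can
# be reused during the restore. The spool content below is simulated.
spool=/tmp/oas_scn_demo.lst
cat > "$spool" <<'EOF'
FIRST_CHANGE#
-------------
      1761544
EOF

# Keep only the purely numeric line of the spool output.
scn=$(awk '/^[[:space:]]*[0-9][0-9]*[[:space:]]*$/ { print $1 }' "$spool")
echo "$(date '+%Y-%m-%d %H:%M') SCN=$scn" >> /tmp/oas_scn_history_demo.txt
echo "recorded SCN $scn"
```

Keeping a timestamped history of recorded SCNs makes it easy to match a configuration-file backup with the database state it was taken against.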
Continue on to the next section to back up the configuration files on the middle tier host(s) and Infrastructure host.
Use the instructions in this section to back up the configuration files. The instructions require the use of the Oracle Application Server Backup and Recovery Tool. They assume you have installed and configured the tool on each OracleAS installation (middle tier and Infrastructure) as it needs to be customized for each installation. Refer to Oracle Application Server 10g Administrator's Guide for more details about that tool, including installation and configuration instructions.
For each middle tier and Infrastructure installation, perform the following steps (the same instructions can be used for the middle tier and Infrastructure configuration files):
The oracle_home, log_path, and config_backup_path variables in the tool's configuration file, config.inp, should have the appropriate values. Also, the following command should have been run to complete the configuration:

perl bkp_restore.pl -m configure_nodb

If you have not completed these tasks, do so before continuing with the ensuing steps.
Run the tool to back up the configuration files:

perl bkp_restore.pl -v -m backup_config
This command creates a directory in the location specified by the config_backup_path variable in the config.inp file. The directory name includes the time of the backup, for example: config_bkp_2003-09-10_13-21.
The tool writes its log files to the location specified by the log_path variable in the config.inp file. Check the log files for any errors that may have occurred during the backup process.
Copy the backup directory (the location specified by config_backup_path) from the current node to its equivalent in the standby site. Ensure that the path structure on the standby node is identical to that on the current node.
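The copy step above can be scripted. The following sketch uses a throwaway path in place of the real config_backup_path from config.inp; directory names and the standby host name are invented for the demo. It relies on the fact that backup directory names embed the backup time in a lexically sortable form, so a plain sort finds the newest backup:

```shell
# Demo stand-in for the config_backup_path value from config.inp.
config_backup_path=/tmp/oas_bkp_demo
mkdir -p "$config_backup_path/config_bkp_2003-09-10_13-21" \
         "$config_backup_path/config_bkp_2003-09-11_02-05"

# The timestamped names sort chronologically, so the last entry is newest.
latest=$(ls -d "$config_backup_path"/config_bkp_* | sort | tail -1)
echo "latest backup: $latest"

# Ship it to the standby peer with an identical path structure, for example:
# scp -r "$latest" standby-host:"$config_backup_path"/
```

The scp line is left commented because host names and credentials are site-specific; any transfer mechanism that preserves the directory structure will do.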
After backing up the configuration files of the middle tier Oracle Application Server instances and the Infrastructure, together with the Infrastructure database, restore the files and database at the standby site using the instructions in this section, which consists of the following sub-sections:
Restoring the backed up files from the production site requires the Oracle Application Server Backup and Recovery Tool that was used for the backup. The instructions in this section assume you have installed and configured the tool on each OracleAS installation in the standby site, both in the middle tier and Infrastructure nodes. Refer to Oracle Application Server 10g Administrator's Guide for instructions on how to install the tool.
For each middle tier and Infrastructure installation in the standby site, perform the following steps (the same instructions can be used for the middle tier and Infrastructure configuration files):
Shut down the OracleAS processes in the Oracle home:

opmnctl stopall
Check that all relevant processes are no longer running. On UNIX, use the following command:

ps -ef | grep <ORACLE_HOME>
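This check can be wrapped in a small script; a minimal sketch, with a placeholder path standing in for the real Oracle home:

```shell
# Sketch: confirm nothing is still running out of the Oracle home after
# "opmnctl stopall". The path below is a placeholder, not a real Oracle home.
ORACLE_HOME=/tmp/no_such_oracle_home_demo
# "grep -v grep" drops the grep command itself from the process listing.
leftover=$(ps -ef | grep "$ORACLE_HOME" | grep -v grep | wc -l)
if [ "$leftover" -eq 0 ]; then
    echo "no processes running from $ORACLE_HOME"
else
    echo "WARNING: $leftover process(es) still running from $ORACLE_HOME" >&2
fi
```

A non-zero count means the restoration should not proceed until the remaining processes are stopped.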
Ensure the Backup and Recovery Tool is configured for this Oracle home. This can be accomplished either by running the tool's configuration option or by copying the backup configuration file, config.inp, from the production site peer. Below is an example of running the Backup and Recovery Tool configuration option:
perl bkp_restore.pl -v -m configure_nodb
Restore the configuration files using one of the following commands. To restore the most recent backup:

perl bkp_restore.pl -v -m restore_config

To restore a specific backup:

perl bkp_restore.pl -v -m restore_config -t <backup_directory>
where <backup_directory> is the name of the directory containing the backup files copied from the production site, for example: config_bkp_2003-09-10_13-21.
Check the log files in the location specified by the log_path variable in config.inp for any errors that may have occurred during the restoration process.
During the backup phase, you executed several instructions to ship the database log files from the production site to the standby site, up to the SCN that you recorded as instructed. To restore the standby database to that SCN, apply the log files to the standby Infrastructure database using the following SQLPLUS statement:
SQL> alter database recover automatic from '/private/oracle/oracleas/standby/' standby database until change <SCN>;
With this statement executed and the configuration files restored on each middle tier and Infrastructure installation, the standby site is now synchronized with the production site. However, two common problems can occur during the application of the log files: errors caused by an incorrect specification of the path, and gaps in the log files transported to the standby site.
The following are methods of resolving these problems:
On the standby Infrastructure database, determine the location and number of received archive logs using the following SQLPLUS statement:
SQL> show parameter standby_archive_dest

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
standby_archive_dest                 string      /private/oracle/oracleas/standby/
At the standby Infrastructure database, perform the following:
standby> cd /private/oracle/oracleas/standby
standby> ls
1_13.dbf  1_14.dbf  1_15.dbf  1_16.dbf  1_17.dbf  1_18.dbf  1_19.dbf
At the production Infrastructure database, execute the following SQLPLUS statement:
SQL> show parameter log_archive_dest_1

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_1                   string      LOCATION=/private/oracle/oracleas/oradata MANDATORY
log_archive_dest_10                  string
production> cd /private/oracle/oracleas/oradata
production> ls
1_10.dbf  1_12.dbf  1_14.dbf  1_16.dbf  1_18.dbf  asdb
1_11.dbf  1_13.dbf  1_15.dbf  1_17.dbf  1_19.dbf
In the above example, note the discrepancy: the standby Infrastructure is missing files 1_10.dbf through 1_12.dbf. Since this gap in the log files occurred in the past, it could be due to a problem with the historical setup of the network used for log transport. That problem has evidently been corrected, and subsequent logs have been shipped. To correct the gap, copy (FTP) the missing log files to the corresponding directory on the standby Infrastructure database host and re-attempt the SQLPLUS recovery statement shown earlier in this section.
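Comparing the two directory listings by eye is error-prone; a small script can report holes in the archived log sequence directly. A sketch, assuming log names of the form <thread>_<sequence>.dbf as in the listings above (the directory and file names here are invented for the demo):

```shell
# Sketch: detect gaps in the archived log sequence on the standby host.
standby_dir=/tmp/oas_standby_demo
mkdir -p "$standby_dir"
touch "$standby_dir/1_13.dbf" "$standby_dir/1_14.dbf" "$standby_dir/1_16.dbf"

# Extract the sequence numbers, sort numerically, and report any holes.
missing=$(ls "$standby_dir" | sed -n 's/^1_\([0-9][0-9]*\)\.dbf$/\1/p' | sort -n |
  awk 'NR > 1 && $1 > prev + 1 { for (i = prev + 1; i < $1; i++) print "1_" i ".dbf" }
       { prev = $1 }')
echo "missing logs: $missing"
# Copy (FTP/scp) each missing file from the production archive destination,
# then re-run the recovery statement.
```

Note the script only detects gaps between the logs it can see; logs missing before the first or after the last received sequence still need to be checked against the production listing.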
Scheduled outages are planned outages. They are required for regular maintenance of the technology infrastructure supporting the business applications and include tasks such as hardware maintenance, repair and upgrades, software upgrades and patching, application changes and patching, and changes to improve performance and manageability of systems. Scheduled outages can occur either for the production or standby site. Descriptions of scheduled outages that impact the production or standby site are:
The entire site where the current production resides is unavailable. Examples of site-wide maintenance are scheduled power outages, site maintenance, and regular planned switchovers.
This is scheduled downtime of the OracleAS Cold Failover Cluster for hardware maintenance. The scope of this downtime is the whole hardware cluster. Examples of cluster-wide maintenance are repair of the cluster interconnect and upgrade of the cluster management software.
For scheduled outages, a site switchover has to be performed, which is explained in the following section.
A site switchover is performed for planned outages of the production site. Both the production and standby sites have to be available during the switchover. The application of the database redo logs is synchronized to match the backup and restoration of the configuration files for the middle tier and Infrastructure installations.
During site switchover, considerations must be made to avoid long periods of cached DNS information. Modifications to the site's DNS information, specifically the time-to-live (TTL), have to be performed. Refer to "Manually Changing DNS Names" for instructions.
To switchover from the production to standby site, perform the following:
On the standby Infrastructure database, start managed recovery to apply the remaining shipped logs:

SQL> alter database recover managed standby database disconnect from session;
Stop the OracleAS processes at the production site:

opmnctl stopall
To stop the CJQ0 process, run the following statement on the production and standby databases:
SQL> ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
To stop the QMN0 process, run the following statement on the production and standby databases:
SQL> ALTER SYSTEM SET AQ_TM_PROCESSES=0;
(The changes effected by the above statements do not require a database restart.)
On the production database, perform the following:
On the current production database, query the SWITCHOVER_STATUS
column of the V$DATABASE
fixed view on the production database to verify that it is possible to perform a switchover operation. For example:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

SWITCHOVER_STATUS
-----------------
TO STANDBY
1 row selected
The TO STANDBY
value in the SWITCHOVER_STATUS
column indicates that it is possible to switch the production database to the standby role. If the TO STANDBY
value is not displayed, then verify that the Data Guard configuration is functioning correctly (for example, verify that all LOG_ARCHIVE_DEST_n
parameter values are specified correctly).
To transition the current production database to a physical standby database role, use the following SQLPLUS statement on the production database:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;
After this statement completes, the production database is converted into a standby database. The current control file is backed up to the current SQLPLUS session trace file before the switchover operation. This makes it possible to reconstruct a current control file, if necessary.
Shut down the former production instance and restart it without mounting the database:
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP NOMOUNT;
SQL> ALTER SYSTEM SET STANDBY_ARCHIVE_DEST='/private/oracle/oracleas/standby/' SCOPE=BOTH;
SQL> ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT='auto' SCOPE=BOTH;
Mount the database as a physical standby database:
SQL> ALTER DATABASE MOUNT STANDBY DATABASE;
Create the standby archive destination directory:
mkdir /private/oracle/oracleas/standby/
At this point in the switchover process, both databases are configured as standby databases.
On the original standby database, perform the following:
Verify the switchover status in the V$DATABASE view. After you transition the production database to the physical standby role and the switchover notification is received by the standby databases in the configuration, verify that the notification was processed by the original standby database by querying the SWITCHOVER_STATUS column of the V$DATABASE fixed view on the original standby database.
For example:
SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

SWITCHOVER_STATUS
-----------------
SWITCHOVER PENDING
1 row selected
The SWITCHOVER PENDING value of the SWITCHOVER_STATUS column indicates the standby database is about to switch from the standby role to the production role. If the TO PRIMARY value is displayed instead, all redo has been received and applied, and the standby is now a candidate for switchover to the production role. If neither value is displayed, verify that the Data Guard configuration is functioning correctly (for example, verify that all LOG_ARCHIVE_DEST_n parameter values are specified correctly).
You can switch a physical standby database from the standby role to the production role when the standby database instance is either mounted in managed recovery mode or open for read-only access. It must be mounted in one of these modes so that the production database switchover operation request can be coordinated.
The SQL ALTER DATABASE
statement used to perform the switchover automatically creates online redo logs if they do not already exist. Use the following SQLPLUS statements on the physical standby database that you want to transition to the production role:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP MOUNT;
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_2='SERVICE=asdb_remote' SCOPE=BOTH;
SQL> ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2='enable' SCOPE=BOTH;
SQL> ALTER DATABASE OPEN;
Shut down the original standby instance and restart it using the appropriate initialization parameters for the production role:
SQL> SHUTDOWN;
SQL> STARTUP;
The original physical standby database is now transitioned to the production database role.
Issue the following statement on the new production database:
SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;
Start the OracleAS processes at the new production site:

opmnctl startall
An unplanned outage that impacts either or both the production and standby sites can be one of the following:
Unplanned outages warrant failing over the production site to the standby site. Restoration of the configuration files and an Oracle Data Guard failover operation are required. Failover restores the Oracle Application Server environment to the point of the last successful backup.
A site failover is performed for unplanned outages of the production site. Failover operations require restoring, on the standby site, the last backup of the configuration files of all hosts and applying the equivalent point-in-time redo logs (using the correct SCN) to the standby database in a synchronized manner.
To failover the production site to the standby site:
Apply the shipped log files to the standby Infrastructure database up to the last recorded SCN:

SQL> alter database recover automatic from '/private/oracle/oracleas/standby/' standby database until change <SCN>;
Once the statement in the previous step completes successfully, the standby database is recovered to a point consistent with the configuration files that were restored in step 1.
Execute the following statements to transform the standby database to the production role:
SQL> connect sys/internal as sysdba
SQL> select OPEN_MODE, STANDBY_MODE, DATABASE_ROLE from v$database;

OPEN_MODE  STANDBY_MOD DATABASE_ROLE
---------- ----------- ----------------
MOUNTED    UNPROTECTED PHYSICAL STANDBY

SQL> alter database activate standby database;

Database altered.

SQL> alter database mount;

Database altered.

SQL> select OPEN_MODE, STANDBY_MODE, DATABASE_ROLE from v$database;

OPEN_MODE  STANDBY_MOD DATABASE_ROLE
---------- ----------- ---------------
MOUNTED    UNPROTECTED PRIMARY

SQL> alter database open resetlogs;
alter database open resetlogs
*
ERROR at line 1:
ORA-01139: RESETLOGS option only valid after an incomplete database recovery
To complete the failover operation, you need to shut down the new production database and restart it in read/write mode using the proper traditional initialization parameter file (or server parameter file) for the production role:
SQL> SHUTDOWN IMMEDIATE;
SQL> STARTUP;
ORACLE instance started.

Total System Global Area  143427356 bytes
Fixed Size                   280348 bytes
Variable Size              92274688 bytes
Database Buffers           50331648 bytes
Redo Buffers                 540672 bytes
Database mounted.
Database opened.
SQL> select OPEN_MODE, STANDBY_MODE, DATABASE_ROLE from v$database;

OPEN_MODE  STANDBY_MOD DATABASE_ROLE
---------- ----------- ----------------
READ WRITE UNPROTECTED PRIMARY
After starting the new production database, a new standby site needs to be created. The steps for performing this are documented in this chapter starting from the section "Setting Up Oracle Data Guard" to the section "Backing Up Configuration Files (Infrastructure and Middle Tier)".
Once a new standby site has been established, a planned switchover may be performed to migrate production quality processing to the correct geographical site. Perform the steps in the section "Site Switchover Operations".
In order for client requests to be directed to the entry point of a production site, DNS resolution is used. When a site switchover or failover is performed, client requests have to be redirected transparently to the new site playing the production role. To accomplish this redirection, the wide area DNS that resolves requests to the production site has to be switched over to the standby site. The DNS switchover can be accomplished in one of the following two ways:
Note: A hardware load balancer is assumed to be front-ending each site. Check http://metalink.oracle.com for supported load balancers.
When a wide area load balancer (global traffic manager) is deployed in front of the production and standby sites, it provides fault detection services and performance-based routing redirection for the two sites. Additionally, the load balancer can provide authoritative DNS name server equivalent capabilities.
During normal operations, the wide area load balancer can be configured with the production site's load balancer name-to-IP mapping. When a DNS switchover is required, this mapping in the wide area load balancer is changed to map to the standby site's load balancer IP. This allows requests to be directed to the standby site, which should have been brought up and now has the production role.
This method of DNS switchover works for both site switchover and failover. One advantage of using a wide area load balancer is that the time for a new name-to-IP mapping to take effect can be almost immediate. The downside is that an additional investment needs to be made for the wide area load balancer.
This method of DNS switchover involves the manual change of the name-to-IP mapping that is originally mapped to the IP address of the production site's load balancer. The mapping is changed to map to the IP address of the standby site's load balancer. Follow these instructions to perform the task:
This method of DNS switchover works for planned site switchovers only. The TTL value set in step 2 should be a reasonably short period so that clients do not continue to resolve the old address for long after the switchover. Modifying the TTL effectively changes the caching semantics of the address resolution from a long period to a short one. Due to the shortened caching period, an increase in DNS requests can be observed.
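As an illustration of the manual approach, a hypothetical BIND zone fragment might evolve as follows; the host name, addresses, and TTL values are invented for this sketch and are not defaults:

```
; Normal operation: long TTL on the production site entry.
$TTL 86400
portal    IN  A  138.1.2.3        ; production site load balancer

; Well before the planned switchover, republish with a short TTL so that
; cached answers expire quickly:
; portal  300  IN  A  138.1.2.3

; At switchover time, point the name at the standby site load balancer:
; portal  300  IN  A  140.5.6.7

; After the switchover has settled, restore the long TTL.
```

Remember to increment the zone serial number and reload the name server at each change so secondaries pick up the new records.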
Copyright © 2003 Oracle Corporation. All Rights Reserved.