Oracle® Application Server 10g High Availability Guide
10g (9.0.4) Part No. B10495-02
This section provides instructions for managing your Infrastructure’s high availability environment, including stopping, starting, and recovering from scheduled and unplanned outages. The following two high availability solutions are discussed:
Oracle Application Server Cold Failover Cluster
Oracle Application Server Active Failover Cluster (UNIX)
Note: For details on installing for high availability, refer to the Oracle Application Server 10g Installation Guide.
The instructions in this section detail the steps for starting and stopping the OracleAS Infrastructure in an OracleAS Cold Failover Cluster.
Use the following steps to start the Infrastructure in an OracleAS Cold Failover Cluster:
Set the ORACLE_HOME environment variable to the Infrastructure’s Oracle home.
Set the ORACLE_SID environment variable to the metadata repository’s system identifier.
Set the PATH environment variable to include the Infrastructure’s $ORACLE_HOME/bin directory.
Important: Specify the path of the working Oracle home as the first entry in the PATH environment variable if several Oracle homes are installed on the machine. Also, ensure that the full paths of the executables you use are specified.
Enable volume management software and mount the file system (if necessary).
Enable the virtual IP address.
Start the metadata repository listener.
$ORACLE_HOME/bin/lsnrctl start
Start the metadata repository.
Start OPMN and all OPMN-managed processes for each OracleAS instance locally.
If the OPMN daemon is not running, start both the OPMN daemon and OPMN-managed processes:
$ORACLE_HOME/opmn/bin/opmnctl startall
If the OPMN daemon is running, start all OPMN-managed processes collectively:
$ORACLE_HOME/opmn/bin/opmnctl startproc
Alternatively, to start OPMN-managed processes individually:
Start Oracle HTTP Server:
$ORACLE_HOME/opmn/bin/opmnctl startproc ias-component=HTTP_Server
Start Oracle Internet Directory:
$ORACLE_HOME/opmn/bin/opmnctl startproc ias-component=OID
Start the Delegated Administration Services instance:
$ORACLE_HOME/opmn/bin/opmnctl startproc ias-component=OC4J instancename=OC4J_SECURITY
Check the status of the OPMN-managed processes using the following command:
$ORACLE_HOME/opmn/bin/opmnctl status
Start the Application Server Console. Use one of the following commands:
$ORACLE_HOME/bin/emctl start iasconsole
Or:
$ORACLE_HOME/bin/emctl startifdown iasconsole
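The start sequence above can be collected into one script. The following is a minimal sketch rather than an Oracle-supplied script: the names `run`, `DRY_RUN`, and `start_infra_cfc` are assumptions introduced for illustration, and the platform-specific steps (volume manager, virtual IP address, metadata repository startup) are left as comments because their commands depend on your environment. With `DRY_RUN=1`, the default in this sketch, the commands are only printed.

```shell
#!/bin/sh
# Sketch: start the OracleAS Infrastructure in an OracleAS Cold Failover Cluster.
# DRY_RUN, run, and start_infra_cfc are illustrative names, not Oracle tools.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

start_infra_cfc() {
    if [ -z "${ORACLE_HOME:-}" ]; then
        echo "ORACLE_HOME is not set" >&2
        return 1
    fi
    # Put the working Oracle home first in PATH.
    PATH="$ORACLE_HOME/bin:$PATH"
    export ORACLE_HOME ORACLE_SID PATH

    # Site-specific: enable volume management, mount the file system,
    # and enable the virtual IP address here.

    run "$ORACLE_HOME/bin/lsnrctl" start || return 1

    # Site-specific: start the metadata repository here.

    run "$ORACLE_HOME/opmn/bin/opmnctl" startall || return 1
    run "$ORACLE_HOME/opmn/bin/opmnctl" status
    run "$ORACLE_HOME/bin/emctl" startifdown iasconsole
}
```

To preview the commands, set ORACLE_HOME and ORACLE_SID to your own values and call `start_infra_cfc`. Set `DRY_RUN=0` only after the site-specific steps have been filled in.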
Use the following steps to stop the OracleAS Infrastructure in an OracleAS Cold Failover Cluster:
Set the ORACLE_HOME environment variable to the Infrastructure’s Oracle home.
Set the ORACLE_SID environment variable to the metadata repository’s system identifier.
Stop OPMN and all OPMN-managed processes for each OracleAS instance locally.
To shut down the OPMN daemon and all OPMN-managed processes:
$ORACLE_HOME/opmn/bin/opmnctl stopall
To shut down all OPMN-managed processes but leave the OPMN daemon running:
$ORACLE_HOME/opmn/bin/opmnctl stopproc
Alternatively, to shut down OPMN-managed processes individually:
Stop the Delegated Administration Services instance:
$ORACLE_HOME/opmn/bin/opmnctl stopproc ias-component=OC4J instancename=OC4J_SECURITY
Stop Oracle Internet Directory:
$ORACLE_HOME/opmn/bin/opmnctl stopproc ias-component=OID
Stop Oracle HTTP Server:
$ORACLE_HOME/opmn/bin/opmnctl stopproc ias-component=HTTP_Server
Stop the metadata repository.
Stop the metadata repository listener.
$ORACLE_HOME/bin/lsnrctl stop
Stop the Application Server Console.
$ORACLE_HOME/bin/emctl stop iasconsole
Disable volume management software and unmount the file system (if necessary).
Disable the virtual IP address.
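When components are stopped individually, the order matters. The sketch below stops them in the order given in the steps above. The names `run`, `DRY_RUN`, and `stop_infra_components` are assumptions made for this example, and in dry-run mode (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch: stop OPMN-managed Infrastructure components one at a time.
# DRY_RUN, run, and stop_infra_components are illustrative names.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

stop_infra_components() {
    opmnctl="$ORACLE_HOME/opmn/bin/opmnctl"
    # Delegated Administration Services first, then Oracle Internet
    # Directory, then Oracle HTTP Server.
    run "$opmnctl" stopproc ias-component=OC4J instancename=OC4J_SECURITY || return 1
    run "$opmnctl" stopproc ias-component=OID || return 1
    run "$opmnctl" stopproc ias-component=HTTP_Server || return 1
}
```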
Note: Check OracleMetalink (http://metalink.oracle.com) for the most current certification status of this feature or consult your Oracle sales representative before deploying this feature in a production environment.
The instructions in this section detail the steps for starting and stopping the Infrastructure in the OracleAS Active Failover Cluster high availability solution.
For an OracleAS Active Failover Cluster-enabled Infrastructure, each node in the cluster is functionally equivalent to the other nodes. All nodes access a common repository. The database instance and the individual OracleAS processes need to be started on each node of the cluster. At any given time, the load balancer should be configured to direct traffic to only the active nodes. The order of starting up Infrastructure instances on all nodes is:
On each node, start the global services daemon:
$ORACLE_HOME/bin/gsd
Start the database instances and listeners on all nodes with the following command (can be run from any node in the cluster):
$ORACLE_HOME/bin/srvctl start -p <database_name>
The global services daemon on each node ensures that the local database processes on each node are started.
Start OPMN and all OPMN-managed processes.
If the OPMN daemon is not running, start both the OPMN daemon and OPMN-managed processes (the following command needs to be run only once):
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> startall
Note: For more efficient startup of the OracleAS Active Failover Cluster nodes, you can configure each node’s operating system to start up the OPMN daemon whenever the node starts up. The procedures for performing this task are specific to each operating system. For example, in UNIX, the rc scripts can be configured by the system administrator for this purpose.
If the OPMN daemon is running, you can start all OPMN-managed processes on all nodes (the following command needs to be run only once):
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> startproc
For example, assuming there are two nodes in the cluster:
$ORACLE_HOME/opmn/bin/opmnctl @instance:infra_node1:infra_node2 startproc
Alternatively, to start OPMN-managed processes individually on all nodes (the following commands need to be run only once):
Start Oracle HTTP Server:
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> startproc ias-component=HTTP_Server
Start Oracle Internet Directory:
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> startproc ias-component=OID
Start the Delegated Administration Services instance:
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> startproc ias-component=OC4J instancename=OC4J_SECURITY
Configure the load balancer and enable traffic to the current node.
Check the status of OPMN-managed processes on all nodes using the following command:
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> status
Start the Application Server Console. Run one of the following commands on each node in the cluster:
$ORACLE_HOME/bin/emctl start iasconsole
Or:
$ORACLE_HOME/bin/emctl startifdown iasconsole
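The `@instance:` scope argument used in the commands above can be assembled from a list of instance names. This is a small sketch: `opmn_scope` is a hypothetical helper name, and the sketch assumes that the colon-separated pattern shown for two nodes extends to additional nodes.

```shell
#!/bin/sh
# Sketch: build the @instance:<name1>:<name2>... scope argument for opmnctl.
# opmn_scope is an illustrative helper, not part of OPMN.
opmn_scope() {
    if [ "$#" -eq 0 ]; then
        echo "opmn_scope: at least one instance name is required" >&2
        return 1
    fi
    scope="@instance"
    for name in "$@"; do
        scope="$scope:$name"
    done
    printf '%s\n' "$scope"
}

# Print (rather than run) the status command for a two-node cluster.
printf '%s\n' "\$ORACLE_HOME/opmn/bin/opmnctl $(opmn_scope infra_node1 infra_node2) status"
```

Keeping the instance names in one place avoids typing the scope string by hand for every startproc, stopproc, and status call.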
The OracleAS Active Failover Cluster-enabled Infrastructure provides better availability since individual Infrastructure instances can be shut down while others continue to be available. The order of shutting down an instance is:
Configure the load balancer and disable traffic to the current node.
Stop all OPMN and OPMN-managed processes.
To stop all OPMN-managed processes but leave the OPMN daemon running, run the following command once:
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> stopproc
To stop OPMN-managed processes individually on all cluster nodes, run the following commands once:
Stop the Delegated Administration Services instance:
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> stopproc ias-component=OC4J instancename=OC4J_SECURITY
Stop Oracle Internet Directory:
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> stopproc ias-component=OID
Stop Oracle HTTP Server:
$ORACLE_HOME/opmn/bin/opmnctl @instance:<instancename_on_node1>:<instancename_on_node2> stopproc ias-component=HTTP_Server
Stop the database instances and listeners on all nodes with the following command (can be run from any node in the cluster):
$ORACLE_HOME/bin/srvctl stop -p <database_name>
The global services daemon on each node ensures that the local database processes on each node are stopped.
Stop the Application Server Console. Run the following command on each node in the cluster:
$ORACLE_HOME/bin/emctl stop iasconsole
Monitoring the OracleAS Active Failover Cluster-enabled Infrastructure is similar to monitoring any regular Infrastructure deployment. The only special consideration is monitoring the load balancer and ensuring that it is directing traffic to the active nodes. Contact your load balancer vendor for information on monitoring the availability of the load balancer and the validity of its configuration.
The OracleAS Active Failover Cluster-enabled Infrastructure provides continued availability under many scheduled and unplanned outages. The outages handled automatically by this solution are listed in Table 5-1 below.
Table 5-1 Outages handled automatically by OracleAS Active Failover Cluster solution
Outage Type | Outages |
---|---|
Scheduled | Node hardware and operating system maintenance; database instance maintenance; Infrastructure software maintenance; fault tolerant load balancer maintenance |
Unplanned | Node failure; database instance failure; Infrastructure process failure; fault tolerant load balancer partial failure |
For the outages in Table 5-2, there may be a small downtime. Having a disaster recovery site can mitigate but not eliminate this downtime. A standby site can be activated while the production site experiences an outage. Refer to the section "Oracle Application Server 10g Disaster Recovery Solution" for more information.
Table 5-2 Outages not handled automatically by OracleAS Active Failover Cluster solution
Outage Type | Outages |
---|---|
Scheduled | Cluster maintenance; database maintenance |
Unplanned | Cluster failure; data error; user error; fault tolerant load balancer complete failure |
System behavior under these outages is as follows:
When a node fails:
All processes in the node are unavailable.
Middle tier and Infrastructure processes from other nodes connected to the database instance on the failed node lose their database session.
A surviving instance on another node begins instance recovery. Requests that were directed to the failed instance experience a brief interruption in service. If they are disconnected, they can retry the connections again until successful.
The processes connected to the active instance performing the recovery experience uninterrupted service. These processes may experience a lag.
If only one node is available to service requests and it fails, connections that are retrying will only succeed when at least one instance becomes available.
When a database instance fails, the following need to be performed:
The load balancer should be notified of the failure. It should stop directing non-Oracle Net traffic to the node with the failed instance.
All non-database processes on the node with the failed instance should be brought down. This includes Oracle HTTP Server, OPMN, Application Server Console, and Oracle Internet Directory processes.
Middle tier and Infrastructure processes from other nodes connected to the failed instance lose their database session.
A surviving instance on another node begins instance recovery.
Middle tier requests that were directed to the failed instance experience a brief interruption in service. If they are disconnected, they can retry the connections again until successful.
The processes connected to the active instance performing the recovery experience uninterrupted service.
All new middle tier requests are directed to the surviving instance.
Restoration of resiliency post-outage involves the addition of an Oracle Application Server 10g instance to the current set of active Oracle Application Server 10g instances. The primary steps involved are:
Fix the problem that caused the outage.
Start up an Oracle database instance on the node.
Start up the Oracle Application Server 10g Infrastructure processes on the node. Refer to Oracle Application Server 10g Administrator's Guide for information on how to start and stop the Infrastructure.
Configure the load balancer to direct traffic to the currently started node.
Oracle Application Server Active Failover Cluster Runtime Control Utility (afcctl)
Each node of an OracleAS Active Failover Cluster keeps, in its file system, configuration files that are part of the Infrastructure but are not stored in the OracleAS Metadata Repository. These files are likely to change when administration operations such as the following are performed on a node:
Manual changes made to these configuration files
Application Server Console-based changes to these configuration files
Association of a middle tier instance with the Infrastructure
A primary requirement for an OracleAS Active Failover Cluster deployment is that all nodes in the cluster are configured similarly. The configuration files should be similar, if not identical.
In order to maintain a consistent Infrastructure across all nodes in the OracleAS Active Failover Cluster, a command line utility is provided to synchronize these configuration files across the nodes. This utility is called Oracle Application Server Active Failover Cluster Runtime Control Utility and can be invoked using the command afcctl. Synchronize files with this utility at least every time an administration change is made to Oracle Application Server.
This section describes how to download and install the afcctl utility and perform initial configuration. The overall steps are:
The afcctl utility is available on the utility CD that comes with your Oracle Application Server 10g product. The file containing the utility is <mount-point>/utilities/ha/afcctl.zip (where <mount-point> is the mount point of the CD-ROM drive). This file is installed along with the Oracle Application Server 10g Backup and Recovery Tool. Hence, you need to install the latter tool first. Refer to Oracle Application Server 10g Administrator's Guide for instructions on installing this tool.
After the Oracle Application Server 10g Backup and Recovery Tool has been installed on a node, create the directory <ORACLE_HOME>/afcctl/ on that node, and copy afcctl.zip to the new directory. Do this for every node in the OracleAS Active Failover Cluster.
Note: You can also create a directory to store afcctl.zip outside of ORACLE_HOME. The instructions in this section assume you have afcctl in <ORACLE_HOME>/afcctl/.
Change to the <ORACLE_HOME>/afcctl/ directory and unzip afcctl.zip. It should contain the following files:
afcctl
afcctl.pl
afcctl.jar
afcctl_exclude.inp
README
In UNIX, run the following command to enable execute permissions:
> chmod 755 afcctl
The afcctl utility relies on the Oracle Application Server 10g Backup & Recovery Tool being available and installed. Make sure that the latter tool has been installed and that the .inp files of the tool are accessible by the afcctl utility.
Run the afcctl utility on any node in the OracleAS Active Failover Cluster to perform the following tasks:
Synchronizing Files From a Node to Other Nodes in an OracleAS Active Failover Cluster
Listing Modified Files on a Node Since the Last Synchronization
Excluding Specific Configuration Files from Synchronization
Note: Ensure that ORACLE_HOME is set before running afcctl.
Immediately after installation of OracleAS and afcctl, you should set the default configuration timestamp across the OracleAS Active Failover Cluster. This baseline timestamp marks the default configuration after installation.
After this baseline is set, the next synchronization performed using afcctl with the sync option synchronizes only those configuration files that have changed between the baseline and the time the afcctl sync command is run.
To create a timestamp baseline, use the following command:
afcctl createbase -p <dbname>|-r <host1>,<host2>,...,<hostN> [-c <cp_exec>]
where:
<dbname> is the name of the Infrastructure database
<host1>,<host2>,...,<hostN> is a comma-separated list of remote hosts in the OracleAS Active Failover Cluster
<cp_exec> is the full path to a remote copy utility to be used to copy files from the current node to other nodes in the cluster. By default, afcctl uses /usr/bin/rcp, or /usr/local/bin/scp if the former (rcp) is not found. If neither of these is present, or if you want scp to precede rcp in the invocation order, use the -c <cp_exec> option to specify the copy utility to be used.
Note: Running afcctl with the createbase option is highly recommended right after installation of OracleAS Active Failover Cluster software.
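The lookup order for the copy utility can be checked before running afcctl. The following sketch picks the first executable from a list of candidate paths so that the result can be passed with -c. The helper name `first_executable` and the extra candidate /usr/bin/scp are assumptions made for this example; only /usr/bin/rcp and /usr/local/bin/scp come from the description above.

```shell
#!/bin/sh
# Sketch: choose a remote copy utility to pass to afcctl with -c.
# first_executable is an illustrative helper; it is not part of afcctl.
first_executable() {
    for candidate in "$@"; do
        # Accept only executable files, not directories.
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Prefer scp over rcp, the reverse of the default order.
# /usr/bin/scp is an assumed extra location, not one afcctl is documented to check.
if cp_exec="$(first_executable /usr/local/bin/scp /usr/bin/scp /usr/bin/rcp)"; then
    echo "copy utility: $cp_exec (pass it as: -c $cp_exec)"
else
    echo "no remote copy utility found; specify one with -c <cp_exec>" >&2
fi
```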
After the initial configuration baseline is set using the createbase option, you can synchronize any configuration changes across the cluster using the sync option. This option synchronizes changed configuration by copying only the modified configuration files from the current node to all nodes in the OracleAS Active Failover Cluster.
The command syntax for invoking a synchronization is:
afcctl sync -p <dbname>|-r <host1>,<host2>,...,<hostN> -f <filename>|<file_list_dir> [-c <cp_exec>] [-l <hostname>]
where:
<dbname> is the name of the Infrastructure database
<host1>,<host2>,...,<hostN> is a comma-separated list of remote hosts in the OracleAS Active Failover Cluster
<filename> is the name of the file to be synchronized
<file_list_dir> is the name of the directory where the .inp files reside
<cp_exec> is the full path to a remote copy utility to be used to copy files from the current node to other nodes in the cluster. By default, afcctl uses /usr/bin/rcp, or /usr/local/bin/scp if the former (rcp) is not found. If neither of these is present, or if you want scp to precede rcp in the invocation order, use the -c <cp_exec> option to specify the copy utility to be used.
<hostname> is the hostname of the local host at installation time of the Infrastructure
Take note of the following when using the above command line:
The -p option can only be used if the gsd process has been started and is running. This option also requires that the Infrastructure database and its instances have been registered with the srvm repository, which is the case by default. The -p option automatically determines the nodes of the OracleAS Active Failover Cluster deployment and propagates the necessary files to the other nodes.
The -r option can be used in lieu of -p to explicitly specify the hostnames of the nodes that need to be synchronized with the node the utility is run on. No other process dependencies exist in this case. Ensure that the hostname(s) specified are valid nodes of the OracleAS Active Failover Cluster deployment.
<file_list_dir> for the -f option is the directory where the .inp files of the Oracle Application Server 10g Backup and Recovery Tool exist. The afcctl utility references these files for the list of configuration files that need to be synchronized.
<filename> for the -f option is used only when a single file needs to be synchronized across the cluster.
Before using afcctl, set up user equivalence so that the rcp and scp copy utilities can be used without further authentication for the remote hosts. User equivalence is also required for the OracleAS Active Failover Cluster installation. Refer to the high availability chapter in Oracle Application Server 10g Installation Guide for instructions on how to set up user equivalence.
-l is optional. It is used to specify the host where afcctl is run and should be specified only in cases where the local hostname during Infrastructure installation is different from the default hostname.
It is strongly recommended that the utility is installed on all nodes of the OracleAS Active Failover Cluster deployment. The utility can be invoked from any node of the cluster on which it has been installed. However, as a best practice, designate one node as an administration node and perform all administrative operations and subsequent synchronizations from it.
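As an illustration of the -r and -f options, the sketch below assembles an afcctl sync command line from a target (a single file or the .inp directory) and a list of remote hosts. The helper names `join_hosts` and `afcctl_sync_cmd` are assumptions made for this example. The command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/sh
# Sketch: assemble an "afcctl sync" command line with the -r and -f options.
# join_hosts and afcctl_sync_cmd are illustrative helper names.

# Join host names with commas, as the -r option expects.
join_hosts() {
    joined=""
    for host in "$@"; do
        if [ -z "$joined" ]; then joined="$host"; else joined="$joined,$host"; fi
    done
    printf '%s\n' "$joined"
}

# Usage: afcctl_sync_cmd <file_or_inp_dir> <host1> [<host2> ...]
afcctl_sync_cmd() {
    if [ "$#" -lt 2 ]; then
        echo "usage: afcctl_sync_cmd <file_or_inp_dir> <host1> [<host2> ...]" >&2
        return 1
    fi
    target="$1"
    shift
    printf '%s\n' "afcctl sync -r $(join_hosts "$@") -f $target"
}

# Prints: afcctl sync -r hasun26 -f ./backup_scripts
afcctl_sync_cmd ./backup_scripts hasun26
```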
To find out which configuration files on a node in the OracleAS Active Failover Cluster have changed since the last synchronization, use the following command line syntax:
afcctl list -f <filename>|<file_list_dir>
where:
<filename> is the name of a file that is to be checked for any updates since the last synchronization.
<file_list_dir> is the name of the directory where the .inp files of the Oracle Application Server 10g Backup and Recovery Tool exist. Files in that directory which have been changed since the last synchronization are listed.
A text file containing a list of files that have changed since the last synchronization is created in the /tmp directory. See an example in the section "Example".
The syntax above can be used on any node of the OracleAS Active Failover Cluster deployment. It displays the files that have changed, since the last synchronization, on the node it is executed on. The returned list of files can be different depending on the site. To synchronize the listed file(s) individually, the -f <filename> option of the afcctl sync command can be used after determining which version of the file is the latest.
Oracle recommends that all nodes are configured similarly. If, however, some configuration files need to be different, their names can be added to the exclude file, afcctl_exclude.inp, so that they are not synchronized across the cluster when afcctl is run. afcctl_exclude.inp is found in the same directory where you uncompressed afcctl.zip.
Excluding files may be necessary in situations such as when you want to turn debugging on for only a particular node or change a configuration file temporarily to measure the impact on the system. Since the files listed in the exclude file are not synchronized, any changes to them have to be propagated manually to the equivalent files on the other nodes until they are removed from the exclude file. If you do not want the exclusions to be permanent, remove the filenames after performing synchronization.
Note: You can also include custom files to be synchronized across the OracleAS Active Failover Cluster nodes each time the afcctl utility is run. The file config.inp contains rules for this task.
After any administrative operation on Oracle Application Server 10g (through Application Server Console or DCM) that can change any of the configuration files, do the following on each node of the OracleAS Active Failover Cluster:
Set the ORACLE_HOME environment variable. For example, in a Bourne shell environment, type:
$ export ORACLE_HOME=/home/oracle/test1
Invoke afcctl with the list option to display the configuration files that have changed on the current node.
$ afcctl list -f ./br_inp_dir
Oracle Application Server Active Failover Cluster Run Time Control Utility
Copyright (c) 2002, 2003 Oracle Corporation. All rights reserved.
Last Sync up time was Mon Sep 8 11:09:11 2003
Check the following for list of files that have changed since last sync
work/Files_to_Change_and_Copy.23123
Please look at log/afcctl.log for more information.
Exiting....
The file work/Files_to_Change_and_Copy.23123 is created to contain the list of configuration files that have changed since the last synchronization.
View the created file to validate that the list of files in it are the ones you want to propagate to the other nodes in the OracleAS Active Failover Cluster. For example:
$ cat work/Files_to_Change_and_Copy.23123
/home/oracle/test1/Apache/Apache/conf/ssl.wlt/default/ewallet.p12
/home/oracle/test1/ldap/admin/oidpwdlldap1
/home/oracle/test1/ldap/admin/oidpwdrgit11
Invoke afcctl with the sync option to synchronize files from one node in the OracleAS Active Failover Cluster to another.
$ afcctl sync -r hasun26 -f ./backup_scripts
Oracle Application Server Active Failover Cluster Run Time Control Utility
Copyright (c) 2002, 2003 Oracle Corporation. All rights reserved.
Files to massage & copy are listed in work/Files_to_Change_and_Copy.22339
Files to copy are listed in work/Files_to_Copy.22339
Do you want to sync up files from hasun25.us.oracle.com to hasun26.us.oracle.com (y/n) ? y
Syncing up files .......................................................................................................................!
Syncing completed
Do you want to update the dcm repository with configuration files from "hasun25.us.oracle.com" (y/n) ? y
DCM update repository started
DCM update repository completed
Please look at log/afcctl.log for more information.
Exiting....
Note: Typing afcctl without any options displays a list of all options available.
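When the list step is scripted, the path of the generated file list has to be read from the afcctl output before it can be reviewed. The following is a minimal sketch under the assumption that the output always names the file in the form shown in the example above. `changed_list_path` is a hypothetical helper name, and the pattern is derived only from that sample output.

```shell
#!/bin/sh
# Sketch: extract the path of the generated file list from "afcctl list"
# output so that it can be reviewed before running "afcctl sync".
# changed_list_path is an illustrative helper; the pattern is based only
# on the sample output in this section.
changed_list_path() {
    sed -n 's/.*\(work\/Files_to_Change_and_Copy\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Sample output taken from the example in this section.
sample='Last Sync up time was Mon Sep 8 11:09:11 2003
Check the following for list of files that have changed since last sync
work/Files_to_Change_and_Copy.23123
Please look at log/afcctl.log for more information.'

list_file="$(printf '%s\n' "$sample" | changed_list_path)"
echo "review before syncing: $list_file"
```

In practice the sample text would be replaced by the captured output of the afcctl list command run on the node being checked.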
Avoid making manual changes to configuration files as far as possible. Use Application Server Console or the automated configuration tools (dcmctl) to make any changes to the Infrastructure configuration.
If you have to make changes manually, designate a node as the administration node and perform all changes on it. Then, run the afcctl tool from this node to propagate the changes to other nodes in the OracleAS Active Failover Cluster.
Preferably, perform administration changes and the subsequent synchronization when all nodes of the OracleAS Active Failover Cluster are up. If this is not possible, remember to synchronize the nodes that were down after they are up again.
Use the list option regularly on all nodes of the cluster to verify that nothing has been changed locally since the last synchronization. Reconcile any changes across the nodes, if required.
If sharing .inp files with the Oracle Application Server 10g Backup & Recovery Tool, keep in mind that any changes to the exclude and personalization files of the tool impact the utility as well. See the Oracle Application Server 10g Administrator's Guide