Oracle® Fusion Middleware Release Notes 11g Release 1 (11.1.1) for HP-UX Itanium Part Number E14773-11
This chapter describes issues associated with Oracle Fusion Middleware high availability and enterprise deployment. It includes the following topics:
This section describes general issues and workarounds. It includes the following topics:
Section 6.1.2, "mod_wl Not Supported for OHS Routing to Managed Server Cluster"
Section 6.1.4, "SOA Composer Generates Error During Failover"
Section 6.1.5, "Accessing Web Services Policies Page in Cold Failover Environment"
Section 6.1.6, "Considerations for Oracle Identity Federation HA in SSL mode"
It is highly recommended that the application tier in the SOA Enterprise Deployment topology and the WebCenter Enterprise Deployment topology be protected against anonymous RMI connections. To prevent RMI access to the middle tier from outside the configured subnet, follow the steps in "Configure connection filtering" in the Oracle WebLogic Server Administration Console Online Help. Execute all of the steps, except as noted in the following:
Do not execute the substep for configuring the default connection filter. Execute the substep for configuring a custom connection filter.
In the Connection Filter Rules field, add the rules that will allow all protocol access to servers from the middle tier subnet while allowing only http(s) access from outside the subnet, as shown in the following example:
nnn.nnn.0.0/nnn.nnn.0.0 * * allow
0.0.0.0/0 * * deny t3 t3s
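As a rough illustration of how rules of this kind are evaluated, the following Python sketch applies the first matching rule and permits the connection when no rule matches. It assumes that the second rule denies t3 and t3s from all other addresses, in line with the stated intent of allowing only http(s) access from outside the subnet. The subnet 10.20.0.0/255.255.0.0 and the function names are hypothetical, and the sketch ignores the localAddress and localPort fields. It is not the WebLogic implementation.

```python
import ipaddress

def parse_rule(line):
    """Split 'target localAddress localPort action [protocols...]' into a dict."""
    target, _local_addr, _local_port, action, *protocols = line.split()
    return {
        "network": ipaddress.ip_network(target, strict=False),
        "action": action,
        "protocols": set(protocols),  # an empty set means "any protocol"
    }

def is_allowed(rules, client_ip, protocol):
    """Return True if the first matching rule allows the connection."""
    address = ipaddress.ip_address(client_ip)
    for rule in rules:
        if address not in rule["network"]:
            continue
        if rule["protocols"] and protocol not in rule["protocols"]:
            continue
        return rule["action"] == "allow"
    return True  # no rule matched, so the connection is permitted

# Hypothetical middle tier subnet; replace with the real subnet and mask.
RULES = [parse_rule(text) for text in (
    "10.20.0.0/255.255.0.0 * * allow",
    "0.0.0.0/0 * * deny t3 t3s",
)]
```

With these two rules, a t3 request from 10.20.3.4 is accepted, a t3 request from an address outside the subnet is rejected, and an https request from outside the subnet falls through both rules and is accepted.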
Oracle Fusion Middleware supports only mod_wls_ohs and does not support mod_wl for Oracle HTTP Server routing to a cluster of managed servers.
For Oracle Fusion Middleware high availability deployments, Oracle strongly recommends following only the configuration procedures documented in the Oracle Fusion Middleware High Availability Guide and the Oracle Fusion Middleware Enterprise Deployment Guides.
During failover, if you are in a SOA Composer dialog box and the connected server is down, you will receive an error, such as Target Unreachable, 'messageData' returned null.
To continue working in the SOA Composer, open a new browser window and navigate to the SOA Composer.
In a Cold Failover Cluster (CFC) environment, the following exception is displayed when the Web Services policies page is accessed in Fusion Middleware Control:
Unable to connect to Oracle WSM Policy Manager. Cannot locate policy manager query/update service. Policy manager service look up did not find a valid service.
To avoid this, implement one of the following options:
Create a virtual hostname aliased SSL certificate and add it to the key store.
Add "-Dweblogic.security.SSL.ignoreHostnameVerification=true" to the JAVA_OPTIONS parameter in the startWeblogic.sh or startWeblogic.cmd file.
In a high availability environment with two (or more) Oracle Identity Federation servers mirroring one another and a load balancer at the front-end, there are two ways to set up SSL:
Configure SSL on the load balancer, so that the SSL connection is between the user and the load balancer. In that case, the keystore/certificate used by the load balancer has a CN referencing the address of the load balancer.
The communication between the load balancer and the WLS/Oracle Identity Federation can be clear or SSL (and in the latter case, Oracle WebLogic Server can use any keystore/certificates, as long as these are trusted by the load balancer).
Configure SSL on the Oracle Identity Federation servers, so that the SSL connection is between the user and the Oracle Identity Federation server. In this case, the CN of the keystore/certificate from the Oracle WebLogic Server/Oracle Identity Federation installation needs to reference the address of the load balancer, because the user connects using the hostname of the load balancer, and the certificate CN needs to match the load balancer's address.
In short, the keystore/certificate of the SSL endpoint connected to the user (load balancer or Oracle WebLogic Server/Oracle Identity Federation) needs to have its CN set to the hostname of the load balancer, since it is the address that the user will use to connect to Oracle Identity Federation.
This section describes configuration issues and their workarounds. It includes the following topics:
Section 6.2.1, "jca.retry.count Doubled in a Clustered Environment"
Section 6.2.3, "WebLogic Server Restart after Abrupt Machine Failure"
Section 6.2.5, "Fusion Middleware Control May Display Incorrect Status"
Section 6.2.6, "Accumulated BPEL Instances Cause Performance Decrease"
Section 6.2.7, "Extra Message Enqueue when One Cluster Server is Brought Down and Back Up"
Section 6.2.8, "Duplicate Unrecoverable Human Workflow Instance Created with Oracle RAC Failover"
Section 6.2.10, "No High Availability Support for SOA B2B TCP/IP"
Section 6.2.11, "WebLogic Administration Server on Machines with Multiple Network Cards"
Section 6.2.12, "Additional Parameters for SOA and Oracle RAC Data Sources"
Section 6.2.13, "Message Sequencing and MLLP Not Supported in Oracle B2B HA Environments"
Section 6.2.14, "Access Control Exception After Expanding Cluster Against an Extended Domain"
In a clustered environment, each node maintains its own in-memory HashMap for inbound retry. If the jca.retry.count property is set to 3 for the inbound retry feature, each node retries three times. As a result, the total retry count becomes 6 in a clustered environment with two nodes.
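The arithmetic can be sketched as follows. The function name is hypothetical, and the sketch only restates the multiplication described above: because every node keeps its own counter, the effective retry count scales with the number of nodes.

```python
def total_inbound_retries(jca_retry_count, node_count):
    """Each node keeps its own in-memory retry counter, so retries multiply."""
    return jca_retry_count * node_count

# jca.retry.count=3 on a two-node cluster gives six attempts in total.
two_node_total = total_inbound_retries(3, 2)
```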
All the machines in a cluster must be in the same time zone. WAN clusters are not supported by Oracle Fusion Middleware high availability. Even machines in the same time zone may have issues when started by command line. Oracle recommends using Node Manager to start the servers.
If Oracle WebLogic Server does not restart after an abrupt machine failure when JMS messages and transaction logs are stored on an NFS-mounted directory, the following errors may appear in the server log files:
<MMM dd, yyyy hh:mm:ss a z> <Error> <Store> <BEA-280061> <The persistent store "_WLS_server_soa1" could not be deployed:
weblogic.store.PersistentStoreException: java.io.IOException: [Store:280021]There was an error while opening the file store file "_WLS_SERVER_SOA1000000.DAT"
weblogic.store.PersistentStoreException: java.io.IOException: [Store:280021]There was an error while opening the file store file "_WLS_SERVER_SOA1000000.DAT"
        at weblogic.store.io.file.Heap.open(Heap.java:168)
        at weblogic.store.io.file.FileStoreIO.open(FileStoreIO.java:88)
If an abrupt machine failure occurs, WebLogic Server restart or whole server migration may fail if the transaction logs or JMS persistence store directory is mounted using NFS. WebLogic Server maintains locks on files used for storing JMS data and transaction logs to protect from potential data corruption if two instances of the same WebLogic Server are accidentally started. The NFS protocol is stateless, and the storage device does not become aware of the machine failure, so the storage device does not release the locks. As a result, after an abrupt machine failure followed by a restart, any subsequent attempt by WebLogic Server to acquire locks on the previously locked files may fail. Refer to your storage vendor documentation for additional information on the locking of files stored in NFS-mounted directories on the storage device.
Use one of the following two solutions to unlock the logs and data files.
Solution 1
Manually unlock the logs and JMS data files and start the servers by creating a copy of the locked persistence store file and using the copy for subsequent operations. To create a copy of the locked persistence store file, rename the file, and then copy it back to its original name. The following sample steps assume that transaction logs are stored in the /shared/tlogs directory and JMS data is stored in the /shared/jms directory.
cd /shared/tlogs
mv _WLS_SOA_SERVER1000000.DAT _WLS_SOA_SERVER1000000.DAT.old
cp _WLS_SOA_SERVER1000000.DAT.old _WLS_SOA_SERVER1000000.DAT
cd /shared/jms
mv SOAJMSFILESTORE_AUTO_1000000.DAT SOAJMSFILESTORE_AUTO_1000000.DAT.old
cp SOAJMSFILESTORE_AUTO_1000000.DAT.old SOAJMSFILESTORE_AUTO_1000000.DAT
mv UMSJMSFILESTORE_AUTO_1000000.DAT UMSJMSFILESTORE_AUTO_1000000.DAT.old
cp UMSJMSFILESTORE_AUTO_1000000.DAT.old UMSJMSFILESTORE_AUTO_1000000.DAT
With this solution, the WebLogic file locking mechanism continues to provide protection from accidental data corruption if multiple instances of the same server are accidentally started. However, the servers must be restarted manually after abrupt machine failures. File stores create multiple consecutively numbered .DAT files when they are used to store large amounts of data. All files may need to be copied and renamed when this occurs.
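When a store has several numbered .DAT files, the rename-and-copy steps of Solution 1 can be repeated for every file in a directory. The following Python sketch shows one way to do that. The function name and the default pattern are hypothetical, and the sketch assumes that the servers using the store are stopped and that the directory paths are the sample paths used above.

```python
import shutil
from pathlib import Path

def recreate_store_files(store_dir, pattern="*.DAT"):
    """Rename each store file to <name>.old, then copy it back to its original name."""
    recreated = []
    # sorted() materializes the file list before any file is renamed.
    for dat_file in sorted(Path(store_dir).glob(pattern)):
        backup = dat_file.with_name(dat_file.name + ".old")
        dat_file.rename(backup)         # roughly: mv X.DAT X.DAT.old
        shutil.copy(backup, dat_file)   # roughly: cp X.DAT.old X.DAT
        recreated.append(dat_file.name)
    return recreated
```

For example, calling recreate_store_files("/shared/tlogs") and then recreate_store_files("/shared/jms") would cover the same files as the sample commands, plus any additional numbered .DAT files in those directories.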
Solution 2
Disable WebLogic file locking by disabling the native I/O wlfileio2 driver. The following sample steps move the shared object for the driver to a backup location, effectively removing it.
cd WL_HOME/server/native/platform/cpu_arch
mv libwlfileio2.so /shared/backup
With this solution, because WebLogic file locking is disabled, automated server restarts and failovers succeed. However, this may result in performance degradation. Be very cautious when using this solution. Always configure the database-based leasing option, which enforces an additional locking mechanism using database tables and prevents automated restart of more than one instance of the same WebLogic Server. Additional procedural precautions must be implemented to avoid human error and to ensure that one and only one instance of a server is manually started at any given point in time. Similarly, extra precautions must be taken to ensure that no two domains have a store with the same name that references the same directory.
Cookie persistence on the load balancer is not required for an Oracle Portal active-active setup. Inadvertently setting cookie persistence to 'active cookie insert' on certain hardware load balancers for Portal deployments on Windows results in intermittent timeouts while accessing Oracle Portal.
In some instances, Oracle WebLogic Fusion Middleware Control may display the incorrect status of a component immediately after the component has been restarted or failed over.
In a scaled out clustered environment, if a large number of BPEL instances are accumulated in the database, it causes the database's performance to decrease, and the following error is generated: MANY THREADS STUCK FOR 600+ SECONDS.
To avoid this error, remove old BPEL instances from the database.
In a non-XA environment, MQSeries Adapters do not guarantee once-and-only-once delivery of messages from inbound adapters to the endpoint in the case of a local transaction. In this scenario, if an inbound message is published to the endpoint and the SOA server is brought down before the transaction is committed, the inbound message is rolled back, and the same message is dequeued again and published to the endpoint. This creates an extra message in the outbound queue.
In an XA environment, MQ Messages are actually not lost but held by Queue Manager due to an inconsistent state. To retrieve the held messages, restart the Queue Manager.
As soon as Oracle Human Workflow commits its transaction, control passes back to BPEL, which almost instantaneously commits its own transaction. If the Oracle RAC instance goes down within this window, the message is retried on failover, which can cause duplicate tasks. The duplicate task can show up in two ways: either a duplicate task appears in worklistapp, or an unrecoverable BPEL instance is created. This BPEL instance appears in BPEL Recovery. It is not possible to recover this BPEL instance as consumer, because the task has already completed.
The following information refers to Chapter 10, "Managing the Topology," of the Oracle Fusion Middleware Enterprise Deployment Guide for Oracle SOA Suite.
When performing a planned stop of the Administration Server's node (rebooting or shutting down the Administration Server's machine), the OS NFS service may be disabled before the Administration Server itself is stopped. Depending on the configuration of services at the OS level, this can cause the detection of missing files in the Administration Server's domain directory and trigger their deletion in the domain directories on other nodes. This can result in the framework deleting some of the files under domain_dir/fmwconfig/. This behavior is typically not observed for unplanned downtimes, such as machine panic, power loss, or machine crash. To avoid this behavior, shut down the Administration Server before performing reboots or, alternatively, use the appropriate OS configuration to order the services so that the NFS service is disabled after the Administration Server's process is stopped. See your OS administration documentation for the required configuration of the service order.
High availability failover support is not available for the SOA B2B TCP/IP protocol. This primarily affects deployments using HL7 over MLLP. For inbound communication in a clustered environment, all B2B servers are active and the address exposed for inbound traffic is a load balancer virtual server. Also, in an outage scenario where an active managed server is no longer available, the persistent TCP/IP connection is lost and the client is expected to reestablish the connection.
When installing Oracle WebLogic Server on a server with multiple network cards, always specify a Listen Address for the Administration Server. The address used should be the DNS Name/IP Address of the network card you wish to use for Administration Server communication.
To set the Listen Address:
In the Oracle WebLogic Server Administration Console, select Environment, and then Servers from the domain structure menu.
Click the Administration Server.
Click Lock and Edit from the Change Center to allow editing.
Enter a Listen Address.
Click Save.
Click Activate Changes in the Change Center.
In some deployments of SOA with Oracle RAC, you may need to set parameters in addition to the out-of-the-box configuration of the individual data sources in an Oracle RAC configuration. The additional parameters are:
Add property oracle.jdbc.ReadTimeout=300000 (300000 milliseconds) for each data source.
The actual value of the ReadTimeout parameter may differ based on additional considerations.
If the network is not reliable, then it is difficult for a client to detect the frequent disconnections when the server is abruptly disconnected. By default, a client running on Linux takes 7200 seconds (2 hours) to sense the abrupt disconnections. This value is equal to the value of the tcp_keepalive_time property. To configure the application to detect the disconnections faster, set the value of the tcp_keepalive_time, tcp_keepalive_interval, and tcp_keepalive_probes properties to a lower value at the operating system level.
Note:
Setting a low value for the tcp_keepalive_interval property leads to frequent probe packets on the network, which can make the system slower. Therefore, the value of this property should be set appropriately based on system requirements.
For example, set tcp_keepalive_time=600 at the system running the WebLogic Server managed server.
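To see how the three properties interact, the following Python sketch estimates how long an idle connection to a vanished peer can remain open before the operating system reports it as dead. It assumes the usual TCP keepalive behavior: the first probe is sent after tcp_keepalive_time seconds of idleness, and the connection is dropped after tcp_keepalive_probes unanswered probes sent tcp_keepalive_interval seconds apart. The function name is hypothetical, and the interval of 75 seconds and probe count of 9 are the common Linux defaults, not values taken from this document.

```python
def seconds_until_dead_peer_detected(keepalive_time, keepalive_interval, keepalive_probes):
    """Idle time before the first probe plus the time spent on unanswered probes."""
    return keepalive_time + keepalive_interval * keepalive_probes

# Assumed Linux defaults for interval (75 s) and probes (9).
default_detection = seconds_until_dead_peer_detected(7200, 75, 9)
tuned_detection = seconds_until_dead_peer_detected(600, 75, 9)
```

Under these assumptions, lowering tcp_keepalive_time from 7200 to 600 reduces the estimate from 7875 seconds to 1275 seconds. Lowering the interval or probe count shortens it further, at the cost of more probe traffic.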
Also, you must specify the ENABLE=BROKEN parameter in the DESCRIPTION clause in the connection descriptor. For example:
jdbc:oracle:thin:@(DESCRIPTION=(enable=broken)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node1-vip.mycompany.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=orcl.us.oracle.com)(INSTANCE_NAME=orcl1)))
As a result, the data source configuration appears as follows:
<url>jdbc:oracle:thin:@(DESCRIPTION=(enable=broken)(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node1-vip.us.oracle.com)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=orcl.us.oracle.com)(INSTANCE_NAME=orcl1)))</url>
<driver-name>oracle.jdbc.xa.client.OracleXADataSource</driver-name>
<properties>
  <property>
    <name>oracle.jdbc.ReadTimeout</name>
    <value>300000</value>
  </property>
  <property>
    <name>user</name>
    <value>jmsuser</value>
  </property>
  <property>
    <name>oracle.net.CONNECT_TIMEOUT</name>
    <value>10000</value>
  </property>
</properties>
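Because each Oracle RAC instance needs its own data source URL, it can help to generate the connection descriptor from its parts rather than editing the nested parentheses by hand. The following Python sketch is one way to do that. The function name is hypothetical, and the host, port, service, and instance values are the sample values from the connection descriptor example above.

```python
def rac_thin_url(host, port, service_name, instance_name):
    """Build a thin-driver URL whose DESCRIPTION clause includes enable=broken."""
    return (
        "jdbc:oracle:thin:@(DESCRIPTION=(enable=broken)"
        f"(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port})))"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})(INSTANCE_NAME={instance_name})))"
    )

# Sample values from the example descriptor.
url = rac_thin_url("node1-vip.mycompany.com", 1521, "orcl.us.oracle.com", "orcl1")

# Cheap sanity check that the parentheses are balanced.
balanced = url.count("(") == url.count(")")
```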
Message sequencing and MLLP are not supported in Oracle B2B high availability (HA) environments.
The Oracle Identity Federation server has been observed to fail due to access control exceptions under the following circumstances:
You create a domain with no Identity Management components on host1.
On host2, you extend that domain in clustered mode, select all Identity Management components, and select Create Schema.
On host1, you expand the cluster and select all components.
Due to a bug, the file DOMAIN_HOME/config/fmwconfig/system-jazn-data.xml on host1 is overwritten so that the <grant> element is removed, which causes the access control exceptions when the Oracle Identity Federation server is started.
To restore the <grant> element, use the WLST grantPermission command.
On Linux, enter the following three commands at the bash prompt. Type each command on one line.
When typing the commands, replace ORACLE_COMMON_HOME with the path to the Oracle Common Home folder, located in the Middleware Home. When prompted for information to connect to WebLogic, enter the WLS Administrator Credentials and the location of the WebLogic Administration Server.
ORACLE_COMMON_HOME/common/bin/wlst.sh ORACLE_COMMON_HOME/modules/oracle.jps_11.1.1/common/wlstscripts/grantPermission.py -codeBaseURL file:\${domain.home}/servers/\${weblogic.Name}/tmp/_WL_user/OIF_11.1.1.2.0/- -permClass oracle.security.jps.service.credstore.CredentialAccessPermission -permTarget context=SYSTEM,mapName=OIF,keyName=* -permActions read
ORACLE_COMMON_HOME/common/bin/wlst.sh ORACLE_COMMON_HOME/modules/oracle.jps_11.1.1/common/wlstscripts/grantPermission.py -codeBaseURL file:\${domain.home}/servers/\${weblogic.Name}/tmp/_WL_user/OIF_11.1.1.2.0/- -permClass oracle.security.jps.service.credstore.CredentialAccessPermission -permTarget credstoressp.credstore -permActions read
ORACLE_COMMON_HOME/common/bin/wlst.sh ORACLE_COMMON_HOME/modules/oracle.jps_11.1.1/common/wlstscripts/grantPermission.py -codeBaseURL file:\${domain.home}/servers/\${weblogic.Name}/tmp/_WL_user/OIF_11.1.1.2.0/- -permClass oracle.security.jps.service.credstore.CredentialAccessPermission -permTarget credstoressp.credstore.OIF.* -permActions read
On Windows, enter the following three commands at the command prompt. Type each command on one line.
When typing the commands, replace ORACLE_COMMON_HOME with the path to the Oracle Common Home folder, located in the Middleware Home. When prompted for information to connect to WebLogic, enter the WLS Administrator Credentials and the location of the WebLogic Administration Server.
ORACLE_COMMON_HOME\common\bin\wlst.cmd ORACLE_COMMON_HOME\modules\oracle.jps_11.1.1\common\wlstscripts\grantPermission.py -codeBaseURL file:${domain.home}/servers/${weblogic.Name}/tmp/_WL_user/OIF_11.1.1.2.0/- -permClass oracle.security.jps.service.credstore.CredentialAccessPermission -permTarget context=SYSTEM,mapName=OIF,keyName=* -permActions read
ORACLE_COMMON_HOME\common\bin\wlst.cmd ORACLE_COMMON_HOME\modules\oracle.jps_11.1.1\common\wlstscripts\grantPermission.py -codeBaseURL file:${domain.home}/servers/${weblogic.Name}/tmp/_WL_user/OIF_11.1.1.2.0/- -permClass oracle.security.jps.service.credstore.CredentialAccessPermission -permTarget credstoressp.credstore -permActions read
ORACLE_COMMON_HOME\common\bin\wlst.cmd ORACLE_COMMON_HOME\modules\oracle.jps_11.1.1\common\wlstscripts\grantPermission.py -codeBaseURL file:${domain.home}/servers/${weblogic.Name}/tmp/_WL_user/OIF_11.1.1.2.0/- -permClass oracle.security.jps.service.credstore.CredentialAccessPermission -permTarget credstoressp.credstore.OIF.* -permActions read
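The three commands differ only in the value of -permTarget. As a minimal sketch, the following Python code builds the argument list for each command from that single varying value. The function name, the sample path /u01/app/oracle/product/fmw/oracle_common, and the use of the Linux wlst.sh launcher are assumptions for illustration. Building a list rather than a single string means that, if the list were passed to a process launcher without a shell, the ${domain.home} and ${weblogic.Name} placeholders would reach the script unchanged, with no escaping needed.

```python
# The three -permTarget values from the commands above.
PERM_TARGETS = [
    "context=SYSTEM,mapName=OIF,keyName=*",
    "credstoressp.credstore",
    "credstoressp.credstore.OIF.*",
]

def grant_permission_args(oracle_common_home, perm_target):
    """Return the argument list for one grantPermission.py invocation."""
    return [
        f"{oracle_common_home}/common/bin/wlst.sh",
        f"{oracle_common_home}/modules/oracle.jps_11.1.1/common/wlstscripts/grantPermission.py",
        "-codeBaseURL",
        "file:${domain.home}/servers/${weblogic.Name}/tmp/_WL_user/OIF_11.1.1.2.0/-",
        "-permClass",
        "oracle.security.jps.service.credstore.CredentialAccessPermission",
        "-permTarget",
        perm_target,
        "-permActions",
        "read",
    ]

# Hypothetical Oracle Common Home path; replace with the real location.
COMMANDS = [
    grant_permission_args("/u01/app/oracle/product/fmw/oracle_common", target)
    for target in PERM_TARGETS
]
```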
This section describes documentation errata. It includes the following topic:
This section contains Documentation Errata for Oracle Fusion Middleware High Availability Guide.
Several manuals in the Oracle Fusion Middleware 11g documentation set have information on Oracle Fusion Middleware system requirements, prerequisites, specifications, and certification information.
The latest information on Oracle Fusion Middleware system requirements, prerequisites, specifications, and certification information can be found in the following documents on Oracle Technology Network:
http://www.oracle.com/technology/software/products/ias/files/fusion_certification.html
This document contains information related to hardware and software requirements, minimum disk space and memory requirements, and required system libraries, packages, or patches.
Oracle Fusion Middleware Certification information at:
http://www.oracle.com/technology/software/products/ias/files/fusion_certification.html
This document contains information related to supported installation types, platforms, operating systems, databases, JDKs, and third-party products.
In step 12 of section 10.2.3.7.2, "Transforming Oracle Reports for Cold Failover Clusters," in the Oracle Fusion Middleware High Availability Guide, the following directory path is incorrect:
DOMAIN_HOME/config/fmwconfig/servers/WLS_DISCO/applications/discoverer_11.1.1.2.0/configuration
The correct directory path is as follows:
DOMAIN_HOME/config/fmwconfig/servers/WLS_REPORTS/applications/reports_11.1.1.2.0/configuration
This section contains Documentation Errata for Oracle Fusion Middleware Enterprise Deployment Guide for Oracle WebCenter.
In Section 8.1, "Configuring the Discussion Forum Connection" of the Oracle Fusion Middleware Enterprise Deployment Guide for Oracle WebCenter, the link to section 8.1.3, "Creating a Discussions Server Connection for WebCenter From EM" is missing.
In section 6.14, "Converting Discussions Forum from Multicast to Unicast" of the Oracle Fusion Middleware Enterprise Deployment Guide for Oracle WebCenter, the following information is missing from Step 3:
Step 3: Repeat steps 1 and 2 for WLS_Services2, swapping WCHost1 for WCHost2, and WCHost2 for WCHost1 as follows:
-Dtangosol.coherence.wka1=WCHost2 -Dtangosol.coherence.wka2=WCHost1 -Dtangosol.coherence.localhost=WCHost2 -Dtangosol.coherence.wka1.port=8089 -Dtangosol.coherence.wka2.port=8089
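The swap described in Step 3 can be sketched as follows: each server lists itself as wka1 and localhost, and its peer as wka2. The function name is hypothetical, and the arguments for WCHost1 are inferred from the instruction to swap the host names, not quoted from the guide.

```python
def coherence_wka_args(local_host, peer_host, port=8089):
    """Return the unicast well-known-address arguments for one server."""
    return [
        f"-Dtangosol.coherence.wka1={local_host}",
        f"-Dtangosol.coherence.wka2={peer_host}",
        f"-Dtangosol.coherence.localhost={local_host}",
        f"-Dtangosol.coherence.wka1.port={port}",
        f"-Dtangosol.coherence.wka2.port={port}",
    ]

# Arguments for the second server, matching the values shown for Step 3.
host2_args = " ".join(coherence_wka_args("WCHost2", "WCHost1"))
# Arguments for the first server, inferred by swapping the host names.
host1_args = " ".join(coherence_wka_args("WCHost1", "WCHost2"))
```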
For additional Discussions Server connection properties associated with the procedure in Section 8.1.3 "Creating a Discussions Server Connection for WebCenter From EM" of the Oracle Fusion Middleware Enterprise Deployment Guide for Oracle WebCenter, refer to section 12.3.1, "Registering Discussions Servers Using Fusion Middleware Control," in the Oracle Fusion Middleware Administrator's Guide for Oracle WebCenter.