Oracle® Fusion Middleware Platform Developer's Guide for Oracle Real-Time Decisions 11g Release 1 (11.1.1) Part Number E16630-04
Oracle RTD Batch Framework is a set of components that can be used to provide batch facilities in an Inline Service. This enables the Inline Service to be used not just for processing interactive Integration Point requests, but also for running a batch of operations of any kind. Typically, a batch will read a set of input rows from a database table, flat file, or spreadsheet, process each input row in turn, and optionally write one or more rows to an output table for each input row.
The following examples describe in outline form how you can use Oracle RTD batch processing facilities in a non-interactive setting:
Create a "learning" batch to train models to learn from historical data about the effectiveness of offers previously presented to customers.
Create an "offer selection" batch which starts with a set of customers, and selects the best product to offer to each customer.
Create a "customer selection" batch which starts with a single product, and selects the best customers to whom to offer the product.
Create a batch set of e-mails where Oracle RTD selects the right content for each e-mail.
Within an Inline Service, the Inline Service developer defines one or more Java classes implementing the BatchJob interface, with one BatchJob for each named batch that the Inline Service wishes to support. In the Inline Service, each of the BatchJob implementations is registered with the Oracle RTD Batch framework, making the job types available to be started by an external batch administration application. External applications may start, stop, and query the status of registered batch jobs through a BatchAdminClient class provided by the Batch Framework. The Batch Console, released with Oracle RTD, is a command-line utility that enables you to perform these batch-related administrative tasks.
Note:
The following terms are referenced throughout the Oracle RTD documentation:
RTD_HOME: This is the directory into which the Oracle RTD client-side tools are installed.
RTD_RUNTIME_HOME: This is the application server specific directory in which the application server runs Oracle RTD.
For more information, see the chapter "About the Oracle RTD Run-Time Environment" in Oracle Fusion Middleware Administrator's Guide for Oracle Real-Time Decisions.
This section presents an overview of the components of the batch framework architecture and shows how batch facilities can be used across cluster servers.
The following diagram shows the components of the batch framework architecture on a single Oracle RTD instance.
The main batch framework components and their functions are:
Batch Admin Client
The Batch Admin Client provides a set of Java APIs that can be used by Java client applications to manage batches registered on remote Real-Time Decision Servers. This includes starting and stopping batches, and obtaining batch status information.
Customers may create their own batch client application using the APIs provided in the Batch Admin Client.
The Batch Console is a client side command line utility that manages batches registered on remote Real-Time Decision Servers. Internally, the Batch Console uses the APIs provided by the Batch Admin Client.
Batch Manager
This is a cluster-wide singleton service that executes batch commands issued by client code through the Batch Admin Client. The Batch Manager manages each Batch Agent in the cluster.
The Batch Manager also executes commands from the Batch Console.
Batch Agent
The batch agent is the interface between a batch job and the batch framework. It is a service that registers batches with the Batch Manager when the batch-enabled Inline Service is deployed, and executes batch commands on behalf of the Batch Manager.
In a clustered environment, all the batch framework components appear in each Oracle RTD instance. However, the Batch Manager is only active in one of the instances, and that active Batch Manager controls all the Batch Admin Client and Batch Agent requests in the cluster.
The following diagram illustrates an example of the use of the batch framework in a clustered environment.
A batch client application, such as the Batch Console, communicates with the Batch Manager by invoking batch management commands, for example to start, stop, or pause a job.
Developers using Decision Studio can create and deploy Inline Services with batches to any instance where Oracle RTD is installed, such as that on Cluster server 2.
Note:
In a clustered environment, Inline Services are deployed to all servers running the Decision Service.
The diagram shows the Batch Agent on the Cluster server 2 instance registering batches with the Batch Manager.
The Batch Manager can then run batch jobs on any instance, such as that on Cluster server 3, so long as they were previously registered.
This section presents an overview of the runtime object model required to implement batches.
For an Inline Service to be batch-enabled, it must contain one or more batch job Java classes implementing the BatchJob interface, and register them with the batch framework.
Note:
The examples that appear in this section reference the CrossSell Inline Service released with Oracle RTD, which contains the batch job CrossSellSelectOffers.
You start the implementation of a batch job in Decision Studio by creating a Java class that implements the BatchJob interface.
First, you create Java packages and classes under the src branch of the Inline Service.
The following image shows the "batch processing" Java class OfferSelectJob.java declared in the package crosssell.batch:
The easiest way to create the Java classes is to subclass from BatchJobBase, provided with the batch framework.
The principal methods of a batch job are called in the following sequence when the job is started:
init()
Called once by the framework before starting the batch's processing loop.
getNextInput()
Returns the next input row to be processed by the batch.
executeRow()
The BatchJob implements this method to process the input row that was returned by getNextInput. Generally, this is called in a different thread from getNextInput.
flushOutputs()
Called by the framework to allow the BatchJob to flush its output table buffers.
cleanup()
Called by the framework after the batch is finished or is being stopped. Cleans up any resources allocated by the batch job, such as the result set created by its init() method.
For full details of the methods of the BatchJob interface, see the following Javadoc entry:
RTD_HOME\client\Batch\javadocs\com\sigmadynamics\batch\BatchJob.html
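The call sequence described above can be sketched in plain Java. The interface SketchBatchJob and the classes UppercaseJob and LifecycleSketch below are hypothetical stand-ins invented for illustration; they are not the real com.sigmadynamics.batch.BatchJob interface, whose method signatures should be taken from the Javadoc. The sketch also assumes that a null return from getNextInput() signals the end of input, and it runs on a single thread, whereas the framework generally calls executeRow() on a different thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-in for the batch job lifecycle.
// NOT the real BatchJob interface; signatures here are assumptions.
interface SketchBatchJob<R> {
    void init();
    R getNextInput();        // assumption: null means "no more input rows"
    void executeRow(R row);
    void flushOutputs();
    void cleanup();
}

// Toy job: reads names from a list and writes them upper-cased to an in-memory "output table".
class UppercaseJob implements SketchBatchJob<String> {
    final List<String> calls = new ArrayList<>();    // records the order of lifecycle calls
    final List<String> output = new ArrayList<>();   // stands in for an output table
    private final List<String> buffer = new ArrayList<>();
    private final List<String> source;
    private Iterator<String> rows;

    UppercaseJob(List<String> source) { this.source = source; }

    public void init() { calls.add("init"); rows = source.iterator(); }

    public String getNextInput() {
        calls.add("getNextInput");
        return rows.hasNext() ? rows.next() : null;
    }

    public void executeRow(String row) { calls.add("executeRow"); buffer.add(row.toUpperCase()); }

    public void flushOutputs() { calls.add("flushOutputs"); output.addAll(buffer); buffer.clear(); }

    public void cleanup() { calls.add("cleanup"); rows = null; }
}

class LifecycleSketch {
    // Single-threaded driver that mimics the documented call order.
    static <R> void run(SketchBatchJob<R> job) {
        job.init();
        try {
            R row;
            while ((row = job.getNextInput()) != null) {
                job.executeRow(row);
            }
            job.flushOutputs();
        } finally {
            job.cleanup();   // runs whether the loop finished or failed
        }
    }

    public static void main(String[] args) {
        UppercaseJob job = new UppercaseJob(Arrays.asList("alice", "bob"));
        run(job);
        System.out.println(job.calls);
        System.out.println(job.output);
    }
}
```

Placing cleanup() in a finally block is a design choice of this sketch: it mirrors the statement that cleanup() is called both when the batch finishes and when it is being stopped.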
Batch Job Example
An example of a batch job, OfferSelectJob.java, appears in the CrossSell Inline Service released with Oracle RTD. This batch job selects the best offer for a set of customers, and saves the offers to a table.
This section describes how to register the batch jobs with the Oracle RTD batch framework. You must register the Java classes that contain the batch jobs as imported Java classes, then you must explicitly register the batch jobs with the batch framework using the batchAgent.registerBatch method.
This section consists of the following topics:
Section 16.2.2.2, "Registering the Imported Java Classes in the Inline Service"
Section 16.2.2.3, "Registering the Batch Jobs in the Inline Service"
The batch agent is the interface between a batch job and the batch framework. Each batch job must be registered with the batch framework through the batch agent.
An Inline Service can locate its batch agent through a getter in the Logic tab of its Application object. For example, in a context where the Inline Service has access to a session, you can use the following command to access the BatchAgent:
BatchAgent batchAgent = session().getApp().getBatchAgent();
You must register the Java classes in the Inline Service, as follows:
Click the Application object's Advanced button.
In the Imported Java Classes pane, enter one line for each batch job class in the Inline Service, of the form:
<package>.<class>
For example:
crosssell.batch.OfferSelectJob
An Inline Service must register its BatchJob implementations in the Logic tab of the Application, in the Initialization Logic pane, using the batchAgent.registerBatch API.
The Inline Service can locate its batch agent, its interface to the Batch Framework, through a getter in its Application object. Enter a line such as the following:
BatchAgent batchAgent = getBatchAgent();
followed by an invocation of batchAgent.registerBatch for each batch job in the Inline Service.
For full details of the parameters for batchAgent.registerBatch, see the following Javadoc entry:
RTD_HOME\client\Batch\javadocs\com\sigmadynamics\batch\BatchAgent.html
In summary form, the parameters for batchAgent.registerBatch are as follows:
batchName: A short name used to register the batch class in the cluster. It should be unique across the cluster.
batchJobClass: The fully qualified name of the batch's BatchJob implementation class.
description: If non-null, a string describing the purpose of the batch.
parameterDescriptions: An optional set of properties describing the parameters supported by the batch.
parameterDefaults: An optional set of properties providing the default values for parameters supported by the batch.
For example, to register the batch CrossSellSelectOffers, which uses the class crosssell.batch.OfferSelectJob, enter the following in the Initialization Logic for the Application:
BatchAgent batchAgent = getBatchAgent();
batchAgent.registerBatch("CrossSellSelectOffers",
    "crosssell.batch.OfferSelectJob",
    OfferSelectJob.description,
    OfferSelectJob.paramDescriptions,
    OfferSelectJob.paramDefaults);
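The registration call above passes three static members of the job class. The following sketch shows one way such members might be declared, assuming that the parameter descriptions and defaults are java.util.Properties objects (the type that getBatchParameterDescriptions and getBatchParameterDefaults return in Table 16-1). The class name OfferSelectJobMetadata, the description text, and the description strings are hypothetical; the parameter names and default values are the ones used as examples in the Batch Console section of this chapter, not necessarily those of the shipped CrossSell job.

```java
import java.util.Properties;

// Hypothetical stand-in for the static metadata that a batch job class might expose.
// The real OfferSelectJob may declare these members differently.
class OfferSelectJobMetadata {
    // Illustrative wording only
    static final String description = "Selects the best offer for each customer";

    static final Properties paramDescriptions = new Properties();
    static final Properties paramDefaults = new Properties();

    static {
        paramDescriptions.setProperty("sqlCustomers",
                "SQL query that selects the customers to process");
        paramDescriptions.setProperty("rowsBetweenStatusUpdates",
                "Number of rows to process between batch status updates");

        paramDefaults.setProperty("sqlCustomers",
                "SELECT Id FROM Customers WHERE Id < 300");
        paramDefaults.setProperty("rowsBetweenStatusUpdates", "1000");
    }

    public static void main(String[] args) {
        // Print each parameter with its default value and description
        for (String name : paramDefaults.stringPropertyNames()) {
            System.out.println(name + " = " + paramDefaults.getProperty(name)
                    + "  (" + paramDescriptions.getProperty(name) + ")");
        }
    }
}
```

Keeping the same key set in both Properties objects means that every parameter a client can see has both a description and a default.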
The main way to administer batch jobs, for example to start, stop, and query the status of batches, is through the command-line Batch Console utility.
This utility uses the BatchAdminClient Java interface, which also provides methods that external programs can use to start and manage batches.
The BatchAdminClient Java interface provides methods for starting and managing batches for use by external programs.
Table 16-1 lists the methods for the BatchAdminClient interface.
Table 16-1 BatchAdminClient Methods
Return Type | Method | Description |
---|---|---|
int | clearBatchStatuses() | Removes batch status information for all batches that have completed. |
int | clearBatchStatuses(int numToKeep) | Removes batch status information for the oldest batches that have completed. |
int | clearBatchStatuses(java.lang.String batchName) | Removes batch status information for all batches that have completed and have the specified batch name. |
int | clearBatchStatuses(java.lang.String batchName, int numToKeep) | Removes batch status information for all batches that have completed and have the specified batch name. |
BatchStatusBrief[] | getActiveBatches() | Returns an ordered list, possibly empty, of brief status information for all batch jobs currently running, paused, or waiting to run. |
java.lang.String | getBatchDescription(java.lang.String batchName) | Returns a string, possibly empty, describing the purpose of the batch. |
java.lang.String[] | getBatchNames() | Gets a list of batches registered with the batch framework. |
java.util.Properties | getBatchParameterDefaults(java.lang.String batchName) | Gets properties containing the default values of the startup parameters supported by the batch. |
java.util.Properties | getBatchParameterDescriptions(java.lang.String batchName) | Gets properties describing the parameters supported by the batch. |
BatchStatusBrief[] | getJobHistory() | Returns an ordered list, possibly empty, of brief status information for all batch jobs whose status information is still retained by the batch manager -- those descriptions that have not been discarded by clearBatchStatuses. |
BatchStatusBrief[] | getJobHistory(int maxToShow) | Returns an ordered list, possibly empty, of brief status information for all batch jobs whose status information is still retained by the batch manager -- those descriptions that have not been discarded by clearBatchStatuses. |
BatchStatus | getStatus(java.lang.String batchID) | Returns the status of a batch identified by the batchID that was returned when it was submitted by a call to startBatch(). |
void | pauseBatch(java.lang.String batchID) | Stops a batch and does not clean up its resources, so it can be resumed. |
void | restartBatch(java.lang.String batchID) | Restarts a stopped batch. |
void | resumeBatch(java.lang.String batchID) | Continues a paused batch. |
java.lang.String | startBatch(java.lang.String batchName) | Starts a batch in the default concurrency group with default start parameters. |
java.lang.String | startBatch(java.lang.String batchName, BatchRequest startParameters) | Starts a batch in the default concurrency group with the supplied start parameters. |
java.lang.String | startBatch(java.lang.String batchName, java.lang.String concurrencyGroup) | Starts a batch in the specified concurrency group using default start parameters. |
java.lang.String | startBatch(java.lang.String batchName, java.lang.String concurrencyGroup, BatchRequest startParameters) | Starts a batch in the specified concurrency group using the supplied start parameters. |
void | stopBatch(java.lang.String batchID) | Stops a batch and cleans up its resources by calling BatchJob.cleanup(). |
void | stopBatch(java.lang.String batchID, boolean discardSandboxes) | Stops a batch, cleans up its resources (by calling BatchJob.cleanup()), and optionally discards any learning data and output table records generated by the batch since its last checkpoint. |
For full details of the BatchAdminClient interface, see the following Javadoc entry:
RTD_HOME\client\Batch\javadocs\com\sigmadynamics\batch\client\BatchAdminClient.html
The Batch Console is a command-line utility, batch-console.jar. Use the Batch Console to start, stop, and query the status of batches.
To start the Batch Console, run the following commands:
cd BATCH_HOME
Typically, BATCH_HOME is C:\OracleBI\RTD\client\Batch.
java [-Djavax.net.ssl.trustStore="<trust_store_location>"] -jar batch-console.jar -user <batch_user_name> -pw <batch_user_password> [-url <RTD_server_URL>] [-help]
Notes:
You must enter batch user name and password information. If you do not specify values for the -user and -pw parameters, you will be prompted for them.
<RTD_server_URL> (default value http://localhost:8080) is the address of the Decision Service. In a cluster, it is typically the load balancer's virtual address that represents the Decision Service's J2EE cluster.
Use the -Djavax.net.ssl.trustStore="<trust_store_location>" parameter only if SSL is used to connect to the Real-Time Decision Server, where <trust_store_location> is the full path of the truststore file. For example, -Djavax.net.ssl.trustStore="C:\OracleBI\RTD\etc\ssl\sdtrust.store". In this case, <RTD_server_URL> should look like https://localhost:8443.
If you enter -help, with or without other command line parameters, a usage list of all the Batch Console command line parameters, including -help, appears.
To see a list of the interactive commands within Batch Console, enter ? at the command prompt:
command <requiredParam> -- [alias] Description
? -- Show this usage text
help -- Show this usage text
exit -- Terminate this program
quit -- Terminate this program
batchNames -- [bn] Show all registered Batch
batchDesc <batchName> -- [bd] Show Batch Description
paramDesc <batchName> -- [pd] Show a batch's Parameter Descriptions
paramDef <batchName> -- [pdef] Show a batch's Parameter Default values
addProp <key> <value> -- [ap] Add one Property for next job start
removeProp <key> -- [rp] Remove one startup Property
showAddedProps -- [sap] Show all Added startup Properties
removeAddedProps -- [rap] Remove all Added startup Properties
startJob <batchName> -- [start] Start a batch job, returning a jobID
startInGroup <batchName> <groupName> -- [startg] Start a batch job in a Concurrency Group
status <jobID> -- [sts] Show a job's detailed runtime Status
activeJobs -- [jobs] Show brief status of all running, paused, waiting jobs
jobHistory -- [hist] Show brief status of all submitted jobs
stopJob <jobID> -- [stop] Stop a job, without abililty to resume
stopJobDiscardSandbox <jobID> -- [stopds] Stop a job, without abililty to resume, discard learning sandboxes
restartJob <jobID> -- [restart] Restart a batch job
pauseJob <jobID> -- [pause] Pause a job
resumeJob <jobID> -- [resume] Resume a paused job
discardStatusAll -- [dsa] Discard status information for all non-active jobs
discardStatusOld <numToKeep> -- [dso] Discard Status for oldest non-active jobs
discardStatusName <batchName> -- [dsn] Discard Status for non-active jobs of named batch
discardStatusNameOld <batchName> <numToKeep> -- [dsno] Discard Status for oldest non-active jobs of named batch
To get a list of registered batches, enter bn or batchNames.
To get the default parameter values for a batch, enter paramDef <batchName> or pdef <batchName>.
For example, your batch may have the following parameters:
sqlCustomers - to select the customers to process
rowsBetweenStatusUpdates - to control how often to update the batch status
The default values for these parameters could be as follows:
sqlCustomers = SELECT Id FROM Customers WHERE Id < 300
rowsBetweenStatusUpdates = 1000
To supply parameter values for the next batch invocation, use the addProp command, or its alias, ap.
For example, you can override the sqlCustomers parameter to include all customers, with the following command:
ap sqlCustomers SELECT Id FROM Customers
And if you want to update the batch status after every 1500 customers are processed, enter the following command:
ap rowsBetweenStatusUpdates 1500
You can view all such explicitly added parameters with the showAddedProps command, or its alias, sap.
For example, if you used the preceding ap commands, the sap output would be:
Property                  Value
--------                  -----
rowsBetweenStatusUpdates  1500
sqlCustomers              SELECT Id FROM Customers
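The relationship between a batch's default parameter values and the properties added with ap can be modeled with the defaults mechanism of java.util.Properties. This is only an illustration of the override semantics described above, not a description of how the Batch Console or the batch framework is implemented; the class StartParametersSketch and the method effective are hypothetical names.

```java
import java.util.Properties;

class StartParametersSketch {
    // Builds a view in which explicitly added properties take precedence over the defaults.
    static Properties effective(Properties defaults, Properties added) {
        // Keys missing from 'merged' are looked up in 'defaults' by getProperty()
        Properties merged = new Properties(defaults);
        for (String key : added.stringPropertyNames()) {
            merged.setProperty(key, added.getProperty(key));
        }
        return merged;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("sqlCustomers", "SELECT Id FROM Customers WHERE Id < 300");
        defaults.setProperty("rowsBetweenStatusUpdates", "1000");

        // Equivalent of: ap sqlCustomers SELECT Id FROM Customers
        Properties added = new Properties();
        added.setProperty("sqlCustomers", "SELECT Id FROM Customers");

        Properties p = effective(defaults, added);
        System.out.println(p.getProperty("sqlCustomers"));              // overridden value
        System.out.println(p.getProperty("rowsBetweenStatusUpdates"));  // falls back to the default
    }
}
```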
To start a batch, use the startJob command, or its alias, start.
The output will be similar to the following:
batchID=batch-2
The returned batchID, also known as a job-ID, identifies this job instance. You can use it to query the status of the job.
To see the runtime status of the job, pass its batchID value to the status command, or to its alias, sts.
sts batch-2
The output will be similar to the following:
ID       Name         State    Rows   Errors  Restarts
--       ----         -----    ----   ------  --------
batch-2  MyBatchJob1  Running  4,500  0       0
SubmitDateTime     WaitTime  RunTime  Group    Server
--------------     --------  ------   -----    ------
06/24/08-10:25:37  0m, 0s    0m, 0s   Default  RTDServer
If you run the status command later, you can see that the job finished without errors, after processing 50,000 customers in 9 minutes and 44 seconds:
ID       Name         State     Rows    Errors  Restarts
--       ----         -----     ----    ------  --------
batch-2  MyBatchJob1  Finished  50,000  0       0
SubmitDateTime     WaitTime  RunTime  Group    Server
--------------     --------  ------   -----    ------
06/24/08-10:25:37  0m, 0s    9m, 44s  Default  RTDServer
When jobs are submitted to be started, they are assigned to a concurrency group. If no group is specified, the job is assigned to the default concurrency group, named Default.
Jobs in the same concurrency group run sequentially, one at a time, in the sequence that they were submitted to be started. So if you start a second job before the first finishes, the second job will wait to start until after the first one finishes.
This section shows the starting of the batch MyBatchJob1, and then the starting of two other batches, MyBatchJob2 and MyBatchJob3.
Before starting MyBatchJob1, use the sap command to verify that the console has values set for the two parameters, rowsBetweenStatusUpdates and sqlCustomers.
After starting MyBatchJob1, clear these parameters using the removeAddedProps command (rap), so that the next two jobs will use default values for all their parameters.
The jobs command shows a brief status of all running and waiting jobs. It shows the first job running, and the other two waiting.
command: batchNames
MyBatchJob1
MyBatchJob2
MyBatchJob3
MyBatchJob4
MyBatchJob5
command: showAddedProps
Property                  Value
--------                  -----
rowsBetweenStatusUpdates  1500
sqlCustomers              SELECT Id FROM Customers
command: start MyBatchJob1
batchID=batch-3
command: removeAddedProps
command: start MyBatchJob2
batchID=batch-4
command: start MyBatchJob3
batchID=batch-5
command: jobs
ID       Name         State    Group    Server
--       ----         -----    -----    ------
batch-3  MyBatchJob1  Running  Default  RTDServer
batch-4  MyBatchJob2  Waiting  Default  none
batch-5  MyBatchJob3  Waiting  Default  none
The startInGroup command, or its alias, startg, may be used to assign a job to a specific concurrency group. Starting two jobs in different groups allows them to run at the same time.
For example:
command: startg MyBatchJob4 myGroup1
batchID=batch-6
command: startg MyBatchJob5 myGroup2
batchID=batch-7
command: jobs
ID       Name         State    Group     Server
--       ----         -----    -----     ------
batch-6  MyBatchJob4  Running  myGroup1  RTDServer
batch-7  MyBatchJob5  Running  myGroup2  RTDServer
Note:
Jobs assigned to the same concurrency group may run on different servers, but the jobs cannot run concurrently. Only jobs in different groups are allowed to run concurrently.
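The scheduling rule for concurrency groups can be illustrated with a small, single-JVM simulation. This is a sketch of the semantics only: it assumes one single-threaded executor per group, which is a modeling choice of this example and says nothing about how the Batch Manager is implemented or how it distributes jobs across servers. The class ConcurrencyGroupSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrencyGroupSketch {
    private final Map<String, ExecutorService> groups = new ConcurrentHashMap<>();

    // One single-threaded executor per group: jobs in a group run one at a time, in submission order.
    Future<?> start(String group, Runnable job) {
        return groups.computeIfAbsent(group, g -> Executors.newSingleThreadExecutor()).submit(job);
    }

    void shutdown() {
        for (ExecutorService e : groups.values()) e.shutdown();
    }

    // Three jobs in the Default group: the returned log shows that they never overlap.
    static List<String> sameGroupLog() {
        ConcurrencyGroupSketch sketch = new ConcurrencyGroupSketch();
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        List<Future<?>> futures = new ArrayList<>();
        for (String name : new String[] {"MyBatchJob1", "MyBatchJob2", "MyBatchJob3"}) {
            futures.add(sketch.start("Default", () -> {
                log.add(name + " start");
                pause(20);
                log.add(name + " end");
            }));
        }
        waitFor(futures);
        sketch.shutdown();
        return new ArrayList<>(log);
    }

    // Two jobs in different groups: each waits for the other to start, which only succeeds if they overlap.
    static boolean differentGroupsOverlap() {
        ConcurrencyGroupSketch sketch = new ConcurrencyGroupSketch();
        CountDownLatch bothStarted = new CountDownLatch(2);
        AtomicInteger sawOther = new AtomicInteger();
        Runnable job = () -> {
            bothStarted.countDown();
            try {
                if (bothStarted.await(5, TimeUnit.SECONDS)) sawOther.incrementAndGet();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        };
        List<Future<?>> futures = new ArrayList<>();
        futures.add(sketch.start("myGroup1", job));
        futures.add(sketch.start("myGroup2", job));
        waitFor(futures);
        sketch.shutdown();
        return sawOther.get() == 2;
    }

    private static void pause(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ex) { Thread.currentThread().interrupt(); }
    }

    private static void waitFor(List<Future<?>> futures) {
        for (Future<?> f : futures) {
            try { f.get(); } catch (Exception ex) { throw new IllegalStateException(ex); }
        }
    }

    public static void main(String[] args) {
        sameGroupLog().forEach(System.out::println);
        System.out.println("different groups overlap: " + differentGroupsOverlap());
    }
}
```

In the same-group run, each job's "end" entry precedes the next job's "start" entry, which corresponds to the Running and Waiting states shown by the jobs command earlier in this section.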