Oracle® Fusion Middleware Solution Guide for Oracle TopLink 11g Release 1 (11.1.1) Part Number E25034-01
This chapter describes TopLink's performance features and how to monitor and optimize TopLink-enabled applications.
This chapter contains the following sections:
TopLink includes a number of performance features that make it the industry's best performing and most scalable JPA implementation. These features include:
The TopLink cache is an in-memory repository that stores recently read or written objects based on class and primary key values. The cache helps improve performance by holding recently read or written objects and accessing them in-memory to minimize database access.
Caching allows you to:
Set how long cached objects remain valid, or the time of day at which they expire, a process called cache invalidation.
Configure cache types (Weak, Soft, SoftCache, HardCache, Full) on a per entity basis.
Configure cache size on a per entity basis.
Coordinate clustered caches.
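The idea of a cache keyed by class and primary key, with time-based invalidation, can be sketched in plain Java. This is an illustration only, not TopLink's implementation: the class `ExpiringIdentityCache` and its methods are hypothetical names, and the caller passes the current time explicitly so that the expiry rule is easy to follow.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: objects cached by (class, primary key) with a time-to-live. */
final class ExpiringIdentityCache {
    /** A cached object together with the time at which it was stored. */
    private static final class Entry {
        final Object entity;
        final long storedAtMillis;

        Entry(Object entity, long storedAtMillis) {
            this.entity = entity;
            this.storedAtMillis = storedAtMillis;
        }
    }

    // The key is the pair (entity class, primary key value).
    private final Map<List<Object>, Entry> entries = new HashMap<>();
    private final long timeToLiveMillis;

    ExpiringIdentityCache(long timeToLiveMillis) {
        this.timeToLiveMillis = timeToLiveMillis;
    }

    void put(Class<?> type, Object primaryKey, Object entity, long nowMillis) {
        entries.put(Arrays.asList(type, primaryKey), new Entry(entity, nowMillis));
    }

    /** Returns the cached instance, or null if it is absent or has expired. */
    Object get(Class<?> type, Object primaryKey, long nowMillis) {
        List<Object> key = Arrays.asList(type, primaryKey);
        Entry entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMillis - entry.storedAtMillis >= timeToLiveMillis) {
            entries.remove(key); // invalidated: the caller must go back to the database
            return null;
        }
        return entry.entity;
    }
}
```

A hit returns the same instance that was stored, which is how a cache of this kind preserves object identity as well as saving a database read.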
TopLink defines these entity caching annotations:
@Cache
@TimeOfDay
@ExistenceChecking
TopLink also provides a number of persistence unit properties that you can specify to configure the TopLink cache (see "How to Use the Persistence Unit Properties for Caching" in the EclipseLink online documentation, at http://wiki.eclipse.org/Using_EclipseLink_JPA_Extensions_%28ELUG%29#How_to_Use_the_Persistence_Unit_Properties_for_Caching
). These properties might complement or provide an alternative to the use of annotations.
TopLink uses identity maps to cache objects in order to enhance performance, as well as maintain object identity. You can control the cache and its behavior by using the @Cache
annotation in your entity classes. Example 9-1 shows how to implement this annotation.
Example 9-1 Using the @Cache Annotation
@Entity
@Table(name="EMPLOYEE")
@Cache (
    type=CacheType.WEAK,
    isolated=false,
    expiry=600000,
    alwaysRefresh=true,
    disableHits=true,
    coordinationType=INVALIDATE_CHANGED_OBJECTS
)
public class Employee implements Serializable {
    ...
}
For more information on object caching and using the @Cache
annotation, see "Using EclipseLink JPA Extensions for Entity Caching" in the EclipseLink online documentation, at:
The scope of a query, the amount of data returned, and how that data is returned can all affect the performance of a TopLink-enabled application. TopLink's query mechanisms enhance query performance by providing these features:
This section describes how these features improve performance.
Use the eclipselink.read-only query hint (specified with @QueryHint) to retrieve read-only results from a query. On nontransactional read operations, where the requested entity types are stored in the shared cache, you can request that the shared instance be returned instead of a detached copy.
For more information on read-only queries, see the documentation for the read-only hint at:
http://wiki.eclipse.org/Using_EclipseLink_JPA_Extensions_%28ELUG%29#Read_Only
Join Fetching enhances performance by enabling the joining and reading of the related objects in the same query as the source object. Enable Join Fetching by using the @JoinFetch
annotation, as shown in Example 9-2. This example shows how the @JoinFetch
annotation specifies the Employee
field managedEmployees
.
Example 9-2 Enabling JoinFetching
@Entity
public class Employee implements Serializable {
...
@OneToMany(cascade=ALL, mappedBy="owner")
@JoinFetch(value=OUTER)
public Collection<Employee> getManagedEmployees() {
return managedEmployees;
}
...
}
For more details on Join Fetching, see "How to Use the @JoinFetch Annotation" in the EclipseLink online documentation at:
The eclipselink.batch
hint supplies TopLink with batching information so subsequent queries of related objects can be optimized in batches instead of being retrieved one-by-one or in one large joined read. Batch reading is more efficient than joining because it avoids reading duplicate data. Batching is only allowed on queries that have a single object in their select clause.
If you have large queries that return a large number of objects, you can improve performance by reducing the number of database hits required to satisfy the selection criteria. To do this, use the eclipselink.jdbc.fetch-size
hint. This hint specifies the number of rows that should be fetched from the database when more rows are required (depending on the JDBC driver support level). Most JDBC drivers default to a fetch size of 10, so if you are reading 1000 objects, increasing the fetch size to 256 can significantly reduce the time required to fetch the query's results. The optimal fetch size is not always obvious. Usually, a fetch size of one half or one quarter of the total expected result size is optimal. Note that if you are unsure of the result set size, incorrectly setting a fetch size too large or too small can decrease performance.
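The effect of the fetch size on the 1000-object example can be worked out with simple arithmetic. The sketch below assumes a simplified model in which every fetch returns exactly the configured number of rows; actual driver behavior varies. The class and method names are hypothetical.

```java
/** Hypothetical helper: estimates fetch round trips under a simplified model. */
final class FetchSizeEstimate {
    /** Number of fetches needed to read rowCount rows, fetchSize rows at a time. */
    static long roundTrips(long rowCount, int fetchSize) {
        if (rowCount < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("rowCount must be >= 0 and fetchSize > 0");
        }
        return (rowCount + fetchSize - 1) / fetchSize; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(1000, 10));  // prints 100
        System.out.println(roundTrips(1000, 256)); // prints 4
    }
}
```

Under this model, raising the fetch size from 10 to 256 reduces the number of fetches for 1000 rows from 100 to 4.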
Slow paging can result in significant application overhead; however, TopLink includes a variety of solutions for improving paging results; for example, you can:
Configure the first and maximum number of rows to retrieve when executing a query.
Perform a query on the database for all of the ID values that match the criteria and then use these values to retrieve specific sets.
Configure TopLink to return a ScrollableCursor
object from a query by using query hints. This returns a database cursor on the query's result set and allows the client to scroll through the results page by page.
For details on improving paging performance, see "How to use EclipseLink Pagination" in the EclipseLink online documentation, at:
http://wiki.eclipse.org/EclipseLink/Examples/JPA/Pagination#How_to_use_EclipseLink_Pagination
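The first two paging approaches come down to simple index arithmetic, sketched here in plain Java. The class `PagingSketch` and its methods are hypothetical. In a JPA application, the computed offset and page size would typically be passed to the standard `Query.setFirstResult` and `Query.setMaxResults` methods, and the ID sublist would be used in a follow-up query.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: index arithmetic behind two paging approaches. */
final class PagingSketch {
    /** Zero-based offset of the first row on the given one-based page. */
    static int firstResult(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return (pageNumber - 1) * pageSize;
    }

    /** The IDs that belong to the given page; empty when the page is past the end. */
    static List<Long> idsForPage(List<Long> allIds, int pageNumber, int pageSize) {
        int from = Math.min(firstResult(pageNumber, pageSize), allIds.size());
        int to = Math.min(from + pageSize, allIds.size());
        return new ArrayList<>(allIds.subList(from, to));
    }
}
```

For example, with a page size of 25, page 3 starts at offset 50.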
TopLink uses a shared cache mechanism that is scoped to the entire persistence unit. When operations are completed in a particular persistence context, the results are merged back into the shared cache so that other persistence contexts can use them. This happens regardless of whether the entity manager and persistence context are created in Java SE or Java EE. Any entity persisted or removed using the entity manager will always be kept consistent with the cache.
You can specify how the query should interact with the TopLink cache by using the eclipselink.cache-usage
hint. For more information, see "Cache Usage" in the EclipseLink online documentation, at:
http://wiki.eclipse.org/Using_EclipseLink_JPA_Extensions_%28ELUG%29#Cache_Usage
Mapping performance is enhanced by these features:
This section describes these features.
By default, when TopLink retrieves a persistent object, it retrieves all of the dependent objects to which it refers. When you configure indirection (also known as lazy loading, lazy reading, and just-in-time reading) for an attribute mapped with a relationship mapping, TopLink uses an indirection object as a place holder for the referenced object. TopLink defers reading the dependent object until you access that specific attribute. This can result in a significant performance improvement, especially if the application is interested only in the contents of the retrieved object, rather than the objects to which it is related.
TopLink supports a variety of types of indirection, including: value holder indirection, transparent indirect container indirection, and proxy indirection.
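The placeholder idea behind indirection can be illustrated with a small holder that defers loading until the value is first requested. This is a conceptual sketch only: `LazyHolder` is a hypothetical class, not one of TopLink's indirection types, and the loader stands in for the database read.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of a placeholder that loads its value on first access. */
final class LazyHolder<T> {
    private final Supplier<T> loader; // stands in for the deferred database read
    private T value;
    private boolean loaded;

    LazyHolder(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Loads the value on the first call and returns the same value afterwards. */
    synchronized T getValue() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    synchronized boolean isLoaded() {
        return loaded;
    }
}
```

An object that holds its relationships this way costs nothing extra to read until the application actually follows the relationship, and the read happens at most once.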
When you declare a class read-only, clones of that class are neither created nor merged, greatly improving performance. You can declare a class as read-only within the context of a unit of work by using the addReadOnlyClass()
method.
To configure a read-only class for a single unit of work, specify that class as the argument to addReadOnlyClass()
:
myUnitOfWork.addReadOnlyClass(B.class);
To configure multiple classes as read-only, add them to a vector and specify that vector as the argument to addReadOnlyClasses()
:
myUnitOfWork.addReadOnlyClasses(myVectorOfClasses);
For more information on using read-only objects to enhance performance, see "Declaring Read-Only Classes" in the EclipseLink online documentation, at:
http://wiki.eclipse.org/Using_Advanced_Unit_of_Work_API_%28ELUG%29#Declaring_Read-Only_Classes
Weaving is a technique of manipulating the byte-code of compiled Java classes. The TopLink JPA persistence provider uses weaving to enhance both JPA entities and Plain Old Java Object (POJO) classes for such things as lazy loading, change tracking, fetch groups, and internal optimizations. Weaving can be performed either dynamically at runtime, when entities are loaded, or statically at compile time by post-processing the entity .class
files. By default, TopLink uses dynamic weaving whenever possible. This includes inside a Java EE 5/6 application server and in Java SE when the TopLink agent is configured. Dynamic weaving is recommended as it is easy to configure and does not require any changes to a project's build process.
For details on how to use weaving to enhance application performance, see "Weaving" in the EclipseLink online documentation, at:
http://wiki.eclipse.org/EclipseLink/UserGuide/JPA/Advanced_JPA_Development/Performance/Weaving
To optimize performance during data transactions, use change tracking. Change tracking allows you to tune the way TopLink detects changes that occur during a transaction. Choose the strategy based on the usage and data modification patterns of each entity type, because different types may have different access patterns and hence need different settings.
Enable change tracking by using the @ChangeTracking
annotation, as shown in Example 9-3.
Example 9-3 Enabling Change Tracking
@Entity
@Table(name="EMPLOYEE")
@ChangeTracking(OBJECT)
public class Employee implements Serializable {
...
}
For more details on change tracking, see "Using EclipseLink JPA Extensions for Tracking Changes" in the EclipseLink online documentation, at:
Database performance features in TopLink include:
This section describes these features.
Establishing a connection to a data source can be time-consuming, so reusing such connections in a connection pool can improve performance. TopLink uses connection pools to manage and share the connections used by server and client sessions. This feature reduces the number of connections required and allows your application to support many clients.
By default, TopLink sessions use internal connection pools. These pools allow you to optimize the creation of read connections for applications that read data only to display it and only infrequently modify data. They also allow you to use Workbench to configure the default (write) and read connection pools and to create additional connection pools for object identity or any other purpose.
In addition to internal connection pools, you can also configure TopLink to use any of these types of connection pools:
External connection pools; you must use this type of connection pool to integrate with an external transaction controller (JTA).
Default (write) and read connection pools.
Sequence connection pools; use these types of pools when your application requires table sequencing (that is, non-native sequencing) and you are using an external transaction controller.
Application-specific connection pools; these are connection pools that you can create and use for any application purpose, provided you are using internal TopLink connection pools in a session.
For more information on using connection pools with TopLink, see the following topics in the EclipseLink online documentation:
"Connection Pools", at:
http://wiki.eclipse.org/Introduction_to_Data_Access_%28ELUG%29#Connection_Pools
"Introduction to the Internal Connection Pool Creation", at:
Parameterized SQL can prevent the overall length of an SQL query from exceeding the statement length limit that your JDBC driver or database server imposes. Using parameterized SQL along with prepared statement caching can improve performance by reducing the number of times the database SQL engine parses and prepares SQL for a frequently called query.
By default, TopLink enables parameterized SQL but not prepared statement caching. Enable statement caching in TopLink when you use an internal connection pool, or in the data source when you use an external connection pool, and specify a statement cache size appropriate for your application.
To enable parameterized SQL, add this line to the persistence.xml
file that is in the same path as your domain classes:
<property name="eclipselink.jdbc.bind-parameters" value="true"/>
To disable parameterized SQL, change value=
to false
.
For more information on using parameterized SQL and statement caching, see "How to Use Parameterized SQL (Parameter Binding) and Prepared Statement Caching for Optimization" in the EclipseLink online documentation, at:
Batch writing helps optimize transactions with multiple write operations. Batch writing is enabled by using the TopLink JDBC extension batch-writing. Set this property to one of the following values in the session at deployment time:
JDBC; Use JDBC batch writing.
Buffered; Use neither JDBC batch writing nor native platform batch writing.
Oracle-JDBC; Use both JDBC batch writing and Oracle native platform batch writing and use OracleJDBC
in your property map.
None; Disable batch writing.
For more information on batch writing, see "How to Use EclipseLink JPA Extensions for JDBC Connection Communication" in the EclipseLink online documentation, at:
TopLink provides monitoring and optimization tools, as described in Section 9.2, "Monitoring and Optimizing TopLink-Enabled Applications".
The most important challenge to performance tuning is knowing what to optimize. To improve the performance of your application, identify the areas of your application that do not operate at peak efficiency. This section contains information on these subjects:
Task 1: Measure TopLink Performance with the TopLink Profiler
Task 2: Identify Sources of Application Performance Problems
Oracle TopLink provides a diverse set of features to measure and optimize application performance. You can enable or disable most features in the descriptors or session, making any resulting performance gains global. Performance considerations are present at every step of the development cycle. Although this implies an awareness of performance issues in your design and implementation, it does not mean that you should expect to achieve the best possible performance in your first pass.
For example, if optimization complicates the design, leave it until the final development phase. You should still plan for these optimizations from your first iteration, to make them easier to integrate later.
The most important concept associated with tuning your TopLink application is the idea of an iterative approach. The most effective way to tune your application is to do the following tasks:
The TopLink performance profiler helps you identify performance problems by logging performance statistics for every executed query in a given session.
The TopLink performance profiler logs the following information to the log file.
Table 9-1 Information Logged by the TopLink Performance Profiler
Information Logged | Description |
---|---|
Query Class | Query class name. |
Domain Class | Domain class name. |
Total Time | Total execution time of the query, including any nested queries (in milliseconds). |
Local Time | Execution time of the query, excluding any nested queries (in milliseconds). |
Number of Objects | The total number of objects affected. |
Number of Objects Handled per Second | How many objects were handled per second of transaction time. |
Logging | The amount of time spent printing logging messages (in milliseconds). |
SQL Prepare | The amount of time spent preparing the SQL script (in milliseconds). |
SQL Execute | The amount of time spent executing the SQL script (in milliseconds). |
Row Fetch | The amount of time spent fetching rows from the database (in milliseconds). |
Cache | The amount of time spent searching or updating the object cache (in milliseconds). |
Object Build | The amount of time spent building the domain object (in milliseconds). |
Query Prepare | The amount of time spent preparing the query prior to execution (in milliseconds). |
SQL Generation | The amount of time spent generating the SQL script before it is sent to the database (in milliseconds). |
The TopLink performance profiler is an instance of the org.eclipse.persistence.tools.profiler.PerformanceProfiler
class. To enable it, add the following line to the persistence.xml
file:
<property name="eclipselink.profiler" value="PerformanceProfiler"/>
In addition to enabling the TopLink profiler, the PerformanceProfiler
class public API also provides the functionality described in Table 9-2:
Table 9-2 Additional PerformanceProfiler Functionality
To... | Use... |
---|---|
Disable the profiler | |
Organize the profiler log into a summary of all the individual operation profiles, including operation statistics like the shortest time of all the operations that were profiled, the total time of all the operations, the number of objects returned by profiled queries, and the total time that was spent in each kind of operation that was profiled | |
Organize the profiler log into a summary of all the individual operation profiles by query | |
Organize the profiler log into a summary of all the individual operation profiles by class | |
You can see profiling results by opening the profile log in a text reader, such as Notepad.
The profiler output file indicates the health of a TopLink-enabled application.
Example 9-4 shows a sample of the TopLink profiler output.
Example 9-4 Performance Profiler Output
Begin Profile of{
ReadAllQuery(com.demos.employee.domain.Employee)
Profile(ReadAllQuery,# of obj=12, time=139923809,sql execute=21723809,
prepare=49523809, row fetch=39023809, time/obj=11623809,obj/sec=8)
} End Profile
Example 9-4 shows the following information about the query:
ReadAllQuery(com.demos.employee.domain.Employee)
: specific query profiled, and its arguments.
Profile(ReadAllQuery
: start of the profile and the type of query.
# of obj=12
: number of objects involved in the query.
time=139923809
: total execution time of the query (in milliseconds).
sql execute=21723809
: total time spent executing the SQL statement.
prepare=49523809
: total time spent preparing the SQL statement.
row fetch=39023809
: total time spent fetching rows from the database.
time/obj=11623809
: number of nanoseconds spent on each object.
obj/sec=8
: number of objects handled per second.
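When many profiles need to be compared, the statistics in a profile entry can be pulled out programmatically. The following is a minimal sketch that assumes the entry has the `label=number` layout shown in Example 9-4; `ProfileLineParser` is a hypothetical helper, not part of the TopLink API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts label=number pairs from a profiler log entry. */
final class ProfileLineParser {
    // A statistic is a label containing no '=', ',', '(' or ')', then '=' and digits.
    private static final Pattern STAT = Pattern.compile("([^=,()]+)=(\\d+)");

    /** Returns the statistics in the order in which they appear in the entry. */
    static Map<String, Long> parse(String profileEntry) {
        Map<String, Long> stats = new LinkedHashMap<>();
        Matcher matcher = STAT.matcher(profileEntry);
        while (matcher.find()) {
            stats.put(matcher.group(1).trim(), Long.parseLong(matcher.group(2)));
        }
        return stats;
    }
}
```

Applied to the entry in Example 9-4, this yields seven statistics, keyed by labels such as `# of obj`, `time`, and `row fetch`.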
Areas of the application where performance problems could occur include the following:
Identifying General Performance Optimization
Schema
Mappings and Descriptors
Sessions
Cache
Data Access
Queries
Unit of Work
Application Server and Database Optimization
Task 3: Modify Poorly-Performing Application Components provides some guidelines for dealing with problems in each of these areas.
For each source of application performance problems listed in Section 9.2.3, "Task 2: Identify Sources of Application Performance Problems", you can try specific workarounds, as described in this section.
Avoid overriding TopLink default behavior unless your application requires it. Some of these defaults are suitable for a development environment; you should change these defaults to suit your production environment (see "Optimizing for a Production Environment" in the EclipseLink online documentation at http://wiki.eclipse.org/Optimizing_the_EclipseLink_Application_%28ELUG%29#Optimizing_for_a_Production_Environment
).
Use the Workbench rather than manual coding. This tool is not only easy to use: the default configuration it exports to deployment XML (and the code it generates, if required) represents best practices optimized for most applications.
Optimization is an important consideration when you design your database schema and object model. Most performance issues occur when the object model or database schema is too complex, as this can make the database slow and difficult to query. This is most likely to happen if you derive your database schema directly from a complex object model.
To optimize performance, design the object model and database schema together. However, allow each model to be designed optimally: do not require a direct one-to-one correlation between the two.
"Optimizing Schema", in the EclipseLink online documentation (http://wiki.eclipse.org/Optimizing_the_EclipseLink_Application_%28ELUG%29#Optimizing_Schema
) includes four schema optimization scenarios that should help you design a schema that provides the desired performance.
If you find performance bottlenecks in your mapping and descriptors, try these solutions:
Always use indirection (lazy loading). It is not only critical in optimizing database access, but also allows TopLink to make several other optimizations including optimizing its cache access and unit of work processing. See "Configuring Indirection (Lazy Loading)" in the EclipseLink online documentation, at:
http://wiki.eclipse.org/Configuring_a_Mapping_%28ELUG%29#Configuring_Indirection_.28Lazy_Loading.29
Avoid using method access in your TopLink mappings, especially if you have expensive or potentially dangerous side-effect code in your get or set methods; use the default direct attribute access instead. See "Configuring Method or Direct Field Accessing at the Mapping Level" in the EclipseLink online documentation, at:
Avoid using the existence checking option checkCacheThenDatabase on descriptors, unless required by the application. The default existence checking behavior offers better performance. See "Configuring Cache Existence Checking at the Descriptor Level" in the EclipseLink online documentation, at:
Avoid expensive initialization in the default constructor that TopLink uses to instantiate objects. Instead, use lazy initialization or use a TopLink instantiation policy to configure the descriptor to use a different constructor. See "Configuring Instantiation Policy" in the EclipseLink online documentation, at:
http://wiki.eclipse.org/Configuring_a_Descriptor_%28ELUG%29#Configuring_Instantiation_Policy
If you suspect that session performance is hindering your application, try these solutions:
Use a server session in a server environment instead of a database session.
Use the TopLink client session instead of a remote session. A client session is appropriate for most multiuser Java EE application server environments.
Do not pool client sessions. Pooling sessions offers no performance gains.
Increase the size of your session read and write connection pools to the desired number of concurrent threads (for example, 50). You can configure this in TopLink when you are using an internal connection pool or in the data source when you are using an external connection pool.
For a list of additional resources, see "Optimizing Sessions" in the EclipseLink online documentation, at:
http://wiki.eclipse.org/Optimizing_the_EclipseLink_Application_%28ELUG%29#Optimizing_Sessions
You can often improve cache performance by implementing cache coordination. Cache coordination allows multiple, possibly distributed instances of a session to broadcast object changes among each other so that each session's cache can be kept up-to-date. For detailed information on optimizing cache behavior, see "Optimizing Cache" in the EclipseLink online documentation, at:
http://wiki.eclipse.org/Optimizing_the_EclipseLink_Application_%28ELUG%29#Optimizing_Cache
Depending on the type of data source your application accesses, TopLink offers a variety of Login
options that you can use to tune the performance of low level data reads and writes. For optimizing higher-level data reads and writes, "Optimizing Data Access" (in the EclipseLink online documentation at http://wiki.eclipse.org/Optimizing_the_EclipseLink_Application_%28ELUG%29#Optimizing_Data_Access
) offers several techniques to improve data access performance for your application. These techniques show you how to:
Optimize JDBC driver properties.
Optimize data format.
Use batch writing for optimization.
Use Outer-Join Reading with Inherited Subclasses.
Use Parameterized SQL (Parameter Binding) and Prepared Statement Caching for Optimization.
TopLink provides an extensive query API for reading, writing, and updating data. "Optimizing Queries" (in the EclipseLink online documentation at http://wiki.eclipse.org/Optimizing_the_EclipseLink_Application_%28ELUG%29#Optimizing_Queries
) offers several techniques to improve query performance for your application. These techniques show you how to:
Use parameterized SQL and prepared statement caching for optimization.
Use named queries for optimization.
Use batch and join reading for optimization.
Use partial object queries and fetch groups for optimization.
Use read-only queries for optimization.
Use JDBC fetch size for optimization.
Use cursored streams and scrollable cursors for optimization.
Use result set pagination for optimization.
It also includes links to read and write optimization examples.
To obtain optimal performance when using a unit of work, consider the following tips:
Register objects with a unit of work only if objects are eligible for change. If you register objects that will not change, the unit of work needlessly clones and processes those objects.
Avoid the cost of existence checking when you are registering a new or existing object. For more information, see "How to Use Registration and Existence Checking" in the EclipseLink online documentation at:
Avoid the cost of change set calculation on a class you know will not change by telling the unit of work that the class is read-only. For more information, see "Declaring Read-Only Classes" in the EclipseLink online documentation at:
http://wiki.eclipse.org/Using_Advanced_Unit_of_Work_API_%28ELUG%29#Declaring_Read-Only_Classes
Avoid the cost of change set calculation on an object read by a ReadAllQuery
in a unit of work that you do not intend to change by unregistering the object. For more information, see "How to Unregister Working Clones" in the EclipseLink online documentation at:
http://wiki.eclipse.org/Using_Advanced_Unit_of_Work_API_%28ELUG%29#How_to_Unregister_Working_Clones
Before using conforming queries, be sure that they are necessary. For alternatives, see "Using Conforming Queries and Descriptors" in the EclipseLink online documentation at:
Enable weaving and change tracking to greatly improve transactional performance. For more information, see "Optimizing Using Weaving" in the EclipseLink online documentation at:
http://wiki.eclipse.org/Optimizing_the_EclipseLink_Application_%28ELUG%29#Optimizing_Using_Weaving
If your performance measurements show that you have a performance problem during unit of work commit, consider using object level or attribute level change tracking, depending on the type of objects involved and how they typically change. For more information, see "Unit of Work and Change Policy" in the EclipseLink online documentation at:
To optimize the application server and database performance, consider these techniques:
Configuring your application server and database correctly can have a big impact on performance and scalability. Ensure that you correctly optimize these key components of your application in addition to your TopLink application and persistence.
For your application or Java EE server, ensure your memory, thread pool and connection pool sizes are sufficient for your server's expected load, and that your JVM has been configured optimally.
Ensure that your database has been configured correctly for optimal performance and its expected load.
Finally, after identifying possible performance bottlenecks and taking some action on them, rerun your application, again with the profiler enabled (see Section 9.2.2.1, "Enabling the TopLink Profiler"). Review the results and, if more action is required, follow the procedures outlined in Section 9.2.4, "Task 3: Modify Poorly-Performing Application Components".