Oracle Data Mining Java API Reference
10g Release 1 (10.1)

B12276-01

Package oracle.dmt.odm.transformation

This package contains Java classes that support data transformations.


Class Summary
Class                        Description
AttributeDiscretization      The abstract class AttributeDiscretization specifies the discretization values of a particular attribute.
CategoricalDiscretization    Associated with an attribute, CategoricalDiscretization allows a user to specify category groups or to use automated discretization (binning) based on the N most frequent items.
CategoryGroup                An instance of class CategoryGroup allows grouping of similar categories.
DiscretizationSpecification  An instance of DiscretizationSpecification contains discretization details for a single attribute.
NumericalBin                 An entry of class NumericalBin specifies explicit bin boundaries for a numerical mining attribute.
NumericalDiscretization      NumericalDiscretization contains the binning details for a numerical attribute.
Transformation               An instance of Transformation is used to prepare input data for use in data mining operations in ODM.

 

Package oracle.dmt.odm.transformation Description

This package contains Java classes that support data transformations. Discretization (binning) lets the user group related values and significantly reduce attribute cardinality. This in turn reduces the complexity and execution time of mining operations.

Binning has a major impact on model accuracy. The best results are obtained when an expert creates the bins manually, based on knowledge of the data being binned and the problem being solved. Oracle Data Mining provides four ways to bin data:

  1. Automated binning: In cases where there is no clear way to determine how to define optimal bins or where ODM is used to get an initial understanding of the problem, ODM can do the binning. This is the ODM default for binning. (Refer to DataUsageSpecification, DataUsageEntry and DataPreparationStatus.)
  2. Explicit specification of bin boundaries for a given attribute: For a given attribute, a user provides
  3. Top N most frequent items: For categorical attributes (only), the user selects the value of N and the name of the "other" category. Oracle Data Mining automatically determines the N most frequent categories and places all other categories in the "other" category.
  4. Quantile binning: For numerical attributes (only), the user specifies the number of quantiles that the values are to be divided into, after the values are sorted. Oracle Data Mining automatically determines which values belong to which bins.
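
The last two methods in the list above can be illustrated with a small, self-contained sketch. It is written in plain Java (Java 8 or later) and does not use the ODM API. The class name BinningSketch and the methods topNBin and quantileBin are hypothetical names chosen for illustration, and the tie-breaking and rank-based bin assignment are assumptions. Oracle Data Mining may determine its bins differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names belong to the ODM API.
class BinningSketch {

    // Top N binning: keep the n most frequent categories and map every
    // other category to otherName.
    public static List<String> topNBin(List<String> values, int n, String otherName) {
        Map<String, Integer> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        List<String> categories = new ArrayList<>(counts.keySet());
        // Sort by descending frequency. Ties are broken alphabetically
        // (an assumption) so that the result is deterministic.
        categories.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        Set<String> kept =
            new HashSet<>(categories.subList(0, Math.min(n, categories.size())));
        List<String> binned = new ArrayList<>();
        for (String v : values) {
            binned.add(kept.contains(v) ? v : otherName);
        }
        return binned;
    }

    // Quantile binning: sort the values, then assign each value to one of
    // q bins by rank so that the bins hold (nearly) equal numbers of values.
    // Returns a bin index in 0..q-1 for each input position.
    // Because assignment is by rank, equal values can land in adjacent bins.
    public static int[] quantileBin(double[] values, int q) {
        Integer[] order = new Integer[values.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // order[rank] is the input position of the value with that rank.
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));
        int[] bins = new int[values.length];
        for (int rank = 0; rank < order.length; rank++) {
            bins[order[rank]] = (int) ((long) rank * q / order.length);
        }
        return bins;
    }

    public static void main(String[] args) {
        List<String> colors =
            Arrays.asList("red", "blue", "red", "green", "blue", "red", "pink");
        // prints [red, blue, red, other, blue, red, other]
        System.out.println(topNBin(colors, 2, "other"));

        double[] ages = {5.0, 1.0, 9.0, 3.0, 7.0, 11.0};
        // prints [1, 0, 2, 0, 1, 2]
        System.out.println(Arrays.toString(quantileBin(ages, 3)));
    }
}
```

In the first call, "green" and "pink" are not among the two most frequent categories, so both fall into the "other" category. In the second call, the six sorted values are split into three bins of two values each.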

Copyright © 2003 Oracle Corporation. All Rights Reserved.