org.apache.lucene.index
Class IndexWriter

java.lang.Object
  |
  +--org.apache.lucene.index.IndexWriter

public final class IndexWriter
extends Object

An IndexWriter creates and maintains an index. The third constructor argument, create, determines whether a new index is created or an existing index is opened so that new documents can be added. In either case, documents are added with the addDocument method. When all documents have been added, close should be called. If no more documents will be added for a while and optimal search performance is desired, optimize should be called before the index is closed.


Field Summary
 PrintStream infoStream
          If non-null, information about merges will be printed to this stream.
 int maxFieldLength
          The maximum number of terms that will be indexed for a single field in a document.
 int maxMergeDocs
          Determines the largest number of documents ever merged by addDocument().
 int mergeFactor
          Determines how often segment indexes are merged by addDocument().
 
Constructor Summary
IndexWriter(Directory d, Analyzer a, boolean create)
          Constructs an IndexWriter for the index in d.
IndexWriter(File path, Analyzer a, boolean create)
          Constructs an IndexWriter for the index in path.
IndexWriter(String path, Analyzer a, boolean create)
          Constructs an IndexWriter for the index in path.
 
Method Summary
 void addDocument(Document doc)
          Adds a document to this index.
 void addIndexes(Directory[] dirs)
          Merges all segments from an array of indexes into this index.
 void close()
          Flushes all changes to an index, closes all associated files, and closes the directory that the index is stored in.
 int docCount()
          Returns the number of documents currently in this index.
protected  void finalize()
          Releases the write lock, if needed.
 void optimize()
          Merges all segments together into a single segment, optimizing an index for search.
 
Methods inherited from class java.lang.Object
clone, equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

maxFieldLength

public int maxFieldLength
The maximum number of terms that will be indexed for a single field in a document. This limits the amount of memory required for indexing, so that collections with very large files will not crash the indexing process by running out of memory.

By default, no more than 10,000 terms will be indexed for a field.
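
As an illustration of the limit, the sketch below keeps only the first maxFieldLength terms of a field's text and ignores the rest. It is a simplified model, not the IndexWriter implementation: the class MaxFieldLengthSketch and the method indexedTerms are hypothetical, and splitting on whitespace stands in for whatever Analyzer is actually supplied.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the maxFieldLength limit; not part of Lucene.
final class MaxFieldLengthSketch {

    // Returns the terms that would be kept for one field: at most
    // maxFieldLength of them, in their original order.
    // Assumption: terms are produced by splitting on whitespace.
    static List<String> indexedTerms(String fieldText, int maxFieldLength) {
        List<String> terms = new ArrayList<>();
        for (String token : fieldText.trim().split("\\s+")) {
            if (terms.size() >= maxFieldLength) {
                break; // limit reached: the remaining terms are not indexed
            }
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }
}
```

In this model, a field containing five terms and a limit of three yields only the first three terms, so a search for the fourth or fifth term would not match that document.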


mergeFactor

public int mergeFactor
Determines how often segment indexes are merged by addDocument(). With smaller values, less RAM is used while indexing and searches on unoptimized indexes are faster, but indexing is slower. With larger values, more RAM is used while indexing and searches on unoptimized indexes are slower, but indexing is faster. Thus larger values (> 10) are best for batched index creation, and smaller values (< 10) for indexes that are interactively maintained.

This must never be less than 2. The default value is 10.


maxMergeDocs

public int maxMergeDocs
Determines the largest number of documents ever merged by addDocument(). Small values (e.g., less than 10,000) are best for interactive indexing, as this limits the length of pauses while indexing to a few seconds. Larger values are best for batched indexing and speedier searches.

The default value is Integer.MAX_VALUE.
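
The interaction of mergeFactor and maxMergeDocs can be illustrated with a small simulation. It rests on assumptions that are not stated above: each added document starts as a one-document segment, and the newest segments are merged whenever together they hold at least mergeFactor documents, then mergeFactor squared, and so on, as long as that threshold does not exceed maxMergeDocs. The class MergeFactorSketch and the method simulate are hypothetical, and only segment sizes are tracked.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of segment merging; not the IndexWriter implementation.
final class MergeFactorSketch {

    // Returns the document count of each segment, oldest first, after
    // numDocs documents have been added one at a time.
    static List<Integer> simulate(int numDocs, int mergeFactor, int maxMergeDocs) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<Integer> segments = new ArrayList<>();
        for (int i = 0; i < numDocs; i++) {
            segments.add(1); // assumption: a new document is a one-document segment
            long target = mergeFactor;
            while (target <= maxMergeDocs) {
                // Walk back from the newest segment, collecting every
                // segment that is smaller than the current target size.
                int start = segments.size();
                int docs = 0;
                while (start > 0 && segments.get(start - 1) < target) {
                    start--;
                    docs += segments.get(start);
                }
                if (docs < target) {
                    break; // not enough small segments to merge yet
                }
                // Replace the collected segments with one merged segment.
                segments.subList(start, segments.size()).clear();
                segments.add(docs);
                target *= mergeFactor; // try the next, larger level
            }
        }
        return segments;
    }
}
```

Under this model, adding 25 documents with a mergeFactor of 10 leaves segments of sizes 10, 10, 1, 1, 1, 1, 1. Adding 100 documents leaves a single segment of 100 when maxMergeDocs is Integer.MAX_VALUE, but ten segments of 10 when maxMergeDocs is 50, because the larger merge is never attempted.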


infoStream

public PrintStream infoStream
If non-null, information about merges will be printed to this stream.

Constructor Detail

IndexWriter

public IndexWriter(String path,
                   Analyzer a,
                   boolean create)
            throws IOException
Constructs an IndexWriter for the index in path. Text will be analyzed with a. If create is true, a new, empty index is created in path, replacing any index already there. If create is false, the existing index in path is opened so that new documents can be added.

IndexWriter

public IndexWriter(File path,
                   Analyzer a,
                   boolean create)
            throws IOException
Constructs an IndexWriter for the index in path. Text will be analyzed with a. If create is true, a new, empty index is created in path, replacing any index already there. If create is false, the existing index in path is opened so that new documents can be added.

IndexWriter

public IndexWriter(Directory d,
                   Analyzer a,
                   boolean create)
            throws IOException
Constructs an IndexWriter for the index in d. Text will be analyzed with a. If create is true, a new, empty index is created in d, replacing any index already there. If create is false, the existing index in d is opened so that new documents can be added.

Method Detail

close

public final void close()
                 throws IOException
Flushes all changes to an index, closes all associated files, and closes the directory that the index is stored in.

finalize

protected final void finalize()
                       throws IOException
Releases the write lock, if needed.
Overrides:
finalize in class Object

docCount

public final int docCount()
Returns the number of documents currently in this index.

addDocument

public final void addDocument(Document doc)
                       throws IOException
Adds a document to this index.

optimize

public final void optimize()
                    throws IOException
Merges all segments together into a single segment, optimizing an index for search.

addIndexes

public final void addIndexes(Directory[] dirs)
                      throws IOException
Merges all segments from an array of indexes into this index.

This may be used to parallelize batch indexing. A large document collection can be broken into sub-collections. Each sub-collection can be indexed in parallel, on a different thread, process or machine. The complete index can then be created by merging sub-collection indexes with this method.

After this completes, the index is optimized.
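
The split, index in parallel, then merge pattern described above can be sketched without Lucene. In this simplified model each sub-collection is indexed on its own thread into a small in-memory map from term to document names, and the partial maps are then combined, which plays the role of addIndexes. All names here (ParallelIndexSketch, indexSubCollection, merge, indexInParallel) are hypothetical, and the map is only a stand-in for a real index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical model of parallel batch indexing; not part of Lucene.
final class ParallelIndexSketch {

    // Indexes one sub-collection (document name -> text) into term -> names.
    static Map<String, List<String>> indexSubCollection(Map<String, String> docs) {
        Map<String, List<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            for (String term : doc.getValue().toLowerCase().split("\\s+")) {
                if (term.isEmpty()) {
                    continue;
                }
                List<String> names = index.computeIfAbsent(term, k -> new ArrayList<>());
                if (!names.contains(doc.getKey())) {
                    names.add(doc.getKey());
                }
            }
        }
        return index;
    }

    // Combines the partial indexes into one, in the order given.
    static Map<String, List<String>> merge(List<Map<String, List<String>>> parts) {
        Map<String, List<String>> merged = new TreeMap<>();
        for (Map<String, List<String>> part : parts) {
            for (Map.Entry<String, List<String>> e : part.entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).addAll(e.getValue());
            }
        }
        return merged;
    }

    // Indexes every sub-collection on its own thread, then merges the results.
    static Map<String, List<String>> indexInParallel(List<Map<String, String>> subCollections) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, subCollections.size()));
        try {
            List<Future<Map<String, List<String>>>> futures = new ArrayList<>();
            for (Map<String, String> sub : subCollections) {
                Callable<Map<String, List<String>>> task = () -> indexSubCollection(sub);
                futures.add(pool.submit(task));
            }
            List<Map<String, List<String>>> parts = new ArrayList<>();
            for (Future<Map<String, List<String>>> f : futures) {
                parts.add(f.get()); // waits for that sub-collection to finish
            }
            return merge(parts);
        } catch (Exception e) {
            throw new IllegalStateException("parallel indexing failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The sketch assumes that document names are unique across sub-collections, so the merge step only has to concatenate entries. The same assumption applies when a collection is partitioned so that each document is indexed exactly once.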



Copyright © 2000-2002 Apache Software Foundation. All Rights Reserved.