Lucene 1.2 API

Jakarta Lucene API


org.apache.lucene.analysis API and code to convert text into indexable tokens.
org.apache.lucene.analysis.de Support for indexing and searching of German text.
org.apache.lucene.analysis.standard A grammar-based tokenizer constructed with JavaCC.
org.apache.lucene.document The Document abstraction.
org.apache.lucene.index Code to maintain and access indices.
org.apache.lucene.queryParser A simple query parser implemented with JavaCC.
org.apache.lucene.search Search over indices.
org.apache.lucene.store Binary i/o API, for storing index data.
org.apache.lucene.util Some utility classes.


The Jakarta Lucene API is divided into several packages, listed above.

To use Lucene, an application should:
  1. Create Documents by adding Fields;
  2. Create an IndexWriter and add documents to it with addDocument();
  3. Call QueryParser.parse() to build a query from a string; and
  4. Create an IndexSearcher and pass the query to its search() method.
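
The data flow behind these four steps can be sketched with a toy in-memory index written in plain Java. This is only an illustration under stated assumptions: it does not use Lucene, and the names ToyIndexDemo, ToyIndex, document() and parse() are hypothetical stand-ins for Document/Field, IndexWriter, QueryParser and IndexSearcher. The tokenization rule and the default field "contents" are also assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Toy model of the four steps; none of these names are Lucene classes. */
class ToyIndexDemo {

    /** Step 1: a document is a set of named fields. */
    static Map<String, String> document(String path, String contents) {
        Map<String, String> doc = new HashMap<String, String>();
        doc.put("path", path);
        doc.put("contents", contents);
        return doc;
    }

    /** Steps 2 and 4: stores documents and answers single-term lookups. */
    static class ToyIndex {
        final List<Map<String, String>> docs = new ArrayList<Map<String, String>>();
        // "field:term" -> ids of the documents whose field contains the term
        final Map<String, TreeSet<Integer>> postings = new HashMap<String, TreeSet<Integer>>();

        /** Step 2: tokenize every field and record which document holds each term. */
        void addDocument(Map<String, String> doc) {
            int id = docs.size();
            docs.add(doc);
            for (Map.Entry<String, String> field : doc.entrySet()) {
                for (String term : field.getValue().toLowerCase().split("[^a-z0-9]+")) {
                    if (term.isEmpty()) continue;
                    String key = field.getKey() + ":" + term;
                    TreeSet<Integer> ids = postings.get(key);
                    if (ids == null) {
                        ids = new TreeSet<Integer>();
                        postings.put(key, ids);
                    }
                    ids.add(id);
                }
            }
        }

        /** Step 4: look up the parsed query and return the paths of matching documents. */
        List<String> search(String fieldAndTerm) {
            List<String> paths = new ArrayList<String>();
            TreeSet<Integer> ids = postings.get(fieldAndTerm);
            if (ids != null) {
                for (int id : ids) paths.add(docs.get(id).get("path"));
            }
            return paths;
        }
    }

    /** Step 3: turn "term" or "field:term" into a lookup key (default field: contents). */
    static String parse(String query) {
        String q = query.trim().toLowerCase();
        return q.indexOf(':') >= 0 ? q : "contents:" + q;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument(document("recipes/chowder.txt", "Clam chowder from Manhattan"));
        index.addDocument(document("recipes/salad.txt", "Green salad"));
        System.out.println(index.search(parse("chowder")));     // [recipes/chowder.txt]
        System.out.println(index.search(parse("path:salad")));  // [recipes/salad.txt]
    }
}
```

The sketch handles single-term queries only and does no scoring; it is meant to show where each of the four steps sits, not how Lucene implements them.
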
Simple examples of code that does this are the IndexFiles and SearchFiles demo programs. To try them, run something like:
> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexFiles
  [ ... ]

> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.SearchFiles
Query: chowder
Searching for: chowder
34 total matching documents
  [ ... thirty-four documents contain the word "chowder", "spam-chowder" with the greatest density.]

Query: path:chowder
Searching for: path:chowder
31 total matching documents
  [ ... only thirty-one have "chowder" in the "path" field. ]

Query: path:"clam chowder"
Searching for: path:"clam chowder"
10 total matching documents
  [ ... only ten have "clam chowder" in the "path" field. ]

Query: path:"clam chowder" AND manhattan
Searching for: +path:"clam chowder" +manhattan
2 total matching documents
  [ ... only two also have "manhattan" in the contents. ]
    [ Note: "+" and "-" are canonical, but "AND", "OR" and "NOT" may be used. ]
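
The relationship between the keyword operators and the canonical "+" and "-" prefixes can be illustrated with a small rewriter in plain Java. This is a toy under stated assumptions, not Lucene's query parser: the class BooleanSyntaxDemo and its canonical() method are hypothetical, it handles only flat queries without parentheses, and it assumes that AND marks the clauses on both sides as required, NOT marks the next clause as prohibited, and OR leaves clauses optional.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy rewriter from AND/OR/NOT keywords to "+" and "-" prefixes; not Lucene code. */
class BooleanSyntaxDemo {
    // A token is an optional field prefix plus a quoted phrase, or a run of non-space characters.
    private static final Pattern TOKEN = Pattern.compile("(?:\\w+:)?\"[^\"]*\"|\\S+");

    static String canonical(String query) {
        List<String> prefixes = new ArrayList<String>();
        List<String> clauses = new ArrayList<String>();
        boolean and = false;
        boolean not = false;
        Matcher m = TOKEN.matcher(query);
        while (m.find()) {
            String token = m.group();
            if (token.equals("AND")) { and = true; continue; }
            if (token.equals("OR")) { continue; }   // OR leaves both clauses optional
            if (token.equals("NOT")) { not = true; continue; }
            if (and && !prefixes.isEmpty()) {
                // AND also makes the preceding clause required, unless it is prohibited.
                int last = prefixes.size() - 1;
                if (prefixes.get(last).isEmpty()) prefixes.set(last, "+");
            }
            prefixes.add(not ? "-" : (and ? "+" : ""));
            clauses.add(token);
            and = false;
            not = false;
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < clauses.size(); i++) {
            if (i > 0) out.append(' ');
            out.append(prefixes.get(i)).append(clauses.get(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: +path:"clam chowder" +manhattan
        System.out.println(canonical("path:\"clam chowder\" AND manhattan"));
    }
}
```

Under the same assumptions, canonical("chowder AND NOT spam") yields "+chowder -spam".
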

The IndexHTML demo is more sophisticated. It incrementally maintains an index of HTML files, adding new files as they appear, deleting old files as they disappear, and re-indexing files as they change.
> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexHTML -create java/jdk1.1.6/docs/relnotes
adding java/jdk1.1.6/docs/relnotes/SMICopyright.html
  [ ... create an index containing all the relnotes ]

> rm java/jdk1.1.6/docs/relnotes/smicopyright.html

> java -cp lucene.jar:lucene-demo.jar org.apache.lucene.demo.IndexHTML java/jdk1.1.6/docs/relnotes
deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html
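
The bookkeeping behind such an incremental update can be sketched in plain Java as a comparison between what the index recorded and what is currently on disk. This is a simplified illustration, not the IndexHTML source: the class IncrementalPlanDemo, the plan() method, and the idea of keeping a path-to-timestamp map are assumptions made for the sketch, and a changed file is modeled here as a delete followed by an add. The timestamp value in main() is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy planner for incremental index maintenance; not the IndexHTML implementation. */
class IncrementalPlanDemo {

    /**
     * Compares the last-modified stamps recorded at index time with those on disk
     * and returns the actions needed to bring the index up to date.
     */
    static List<String> plan(Map<String, Long> indexed, Map<String, Long> onDisk) {
        List<String> actions = new ArrayList<String>();
        // Files that were indexed but no longer exist are deleted from the index.
        for (String path : new TreeMap<String, Long>(indexed).keySet()) {
            if (!onDisk.containsKey(path)) actions.add("deleting " + path);
        }
        // New files are added; changed files are re-indexed as delete + add.
        for (Map.Entry<String, Long> file : new TreeMap<String, Long>(onDisk).entrySet()) {
            Long indexedStamp = indexed.get(file.getKey());
            if (indexedStamp == null) {
                actions.add("adding " + file.getKey());
            } else if (!indexedStamp.equals(file.getValue())) {
                actions.add("deleting " + file.getKey());
                actions.add("adding " + file.getKey());
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        Map<String, Long> indexed = new TreeMap<String, Long>();
        indexed.put("java/jdk1.1.6/docs/relnotes/SMICopyright.html", 1000L); // hypothetical stamp
        Map<String, Long> onDisk = new TreeMap<String, Long>();              // file was removed
        for (String action : plan(indexed, onDisk)) {
            System.out.println(action); // deleting java/jdk1.1.6/docs/relnotes/SMICopyright.html
        }
    }
}
```

Unchanged files produce no action, which is what keeps a repeated run over an unmodified directory cheap.
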

HTML indexes are searched using Sun's JavaWebServer (JWS) and Search.jhtml.

Note that indexes can be updated while searches are in progress. Search.jhtml re-opens the index when it is updated, so that the latest version is immediately available.

Copyright © 2000-2002 Apache Software Foundation. All Rights Reserved.