Package org.apache.lucene.analysis

API and code to convert text into indexable tokens.


Class Summary
Analyzer An Analyzer builds TokenStreams, which analyze text.
CharTokenizer An abstract base class for simple, character-oriented tokenizers.
LetterTokenizer A LetterTokenizer is a tokenizer that divides text at non-letters.
LowerCaseFilter Normalizes token text to lower case.
LowerCaseTokenizer LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together.
PorterStemFilter Transforms the token stream as per the Porter stemming algorithm.
SimpleAnalyzer An Analyzer that filters LetterTokenizer with LowerCaseFilter.
StopAnalyzer Filters LetterTokenizer with LowerCaseFilter and StopFilter.
StopFilter Removes stop words from a token stream.
Token A Token is an occurrence of a term from the text of a field.
TokenFilter A TokenFilter is a TokenStream whose input is another token stream.
Tokenizer A Tokenizer is a TokenStream whose input is a Reader.
TokenStream A TokenStream enumerates the sequence of tokens, either from fields of a document or from query text.
WhitespaceAnalyzer An Analyzer that uses WhitespaceTokenizer.
WhitespaceTokenizer A WhitespaceTokenizer is a tokenizer that divides text at whitespace.
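The classes above compose: a Tokenizer turns raw text into a token stream, and TokenFilters wrap another TokenStream to transform or drop tokens, which is how an Analyzer such as StopAnalyzer chains a tokenizer with LowerCaseFilter and StopFilter. The self-contained sketch below illustrates that composition pattern with simplified stand-in classes (plain String tokens and a next() method); the real Lucene classes operate on Token objects read from a Reader, so all names and signatures here are illustrative analogs, not the actual Lucene API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Simplified analog of TokenStream: an enumeration of tokens, null when exhausted.
interface TokenStream {
    String next();
}

// Analog of a Tokenizer (cf. WhitespaceTokenizer): produces tokens from raw text.
class SimpleWhitespaceTokenizer implements TokenStream {
    private final Iterator<String> it;
    SimpleWhitespaceTokenizer(String text) {
        List<String> parts = new ArrayList<>();
        for (String p : text.split("\\s+"))
            if (!p.isEmpty()) parts.add(p);
        it = parts.iterator();
    }
    public String next() { return it.hasNext() ? it.next() : null; }
}

// Analog of a TokenFilter (cf. LowerCaseFilter): a TokenStream whose input
// is another token stream.
class SimpleLowerCaseFilter implements TokenStream {
    private final TokenStream input;
    SimpleLowerCaseFilter(TokenStream input) { this.input = input; }
    public String next() {
        String t = input.next();
        return t == null ? null : t.toLowerCase();
    }
}

// Analog of StopFilter: removes stop words from the wrapped stream.
class SimpleStopFilter implements TokenStream {
    private final TokenStream input;
    private final Set<String> stopWords;
    SimpleStopFilter(TokenStream input, Set<String> stopWords) {
        this.input = input;
        this.stopWords = stopWords;
    }
    public String next() {
        for (String t = input.next(); t != null; t = input.next())
            if (!stopWords.contains(t)) return t;
        return null;
    }
}

public class AnalysisSketch {
    // Analog of an Analyzer: builds the filter chain and drains it into a list.
    static List<String> analyze(String text) {
        TokenStream ts = new SimpleStopFilter(
                new SimpleLowerCaseFilter(new SimpleWhitespaceTokenizer(text)),
                new HashSet<>(Arrays.asList("the", "a", "of")));
        List<String> out = new ArrayList<>();
        for (String t = ts.next(); t != null; t = ts.next()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Art of Computer Programming"));
        // prints [art, computer, programming]
    }
}
```

Because each filter only holds a reference to its input stream, tokens are pulled one at a time through the whole chain; this is the same pull-based design the package's TokenStream hierarchy uses.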



Copyright © 2000-2002 Apache Software Foundation. All Rights Reserved.