All Implemented Interfaces: Closeable, AutoCloseable
public final class ClassicAnalyzer extends StopwordAnalyzerBase
Filters ClassicTokenizer with ClassicFilter, LowerCaseFilter, and StopFilter, using a list of English stop words.
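The chain described above (tokenize, lowercase, remove stop words) can be illustrated with a JDK-only sketch. Everything in it is hypothetical: the class `AnalysisChainSketch`, its methods, and the four-word sample stop list are stand-ins, not Lucene code and not the contents of STOP_WORDS_SET. The tokenizer is a crude split on non-alphanumeric characters, and the ClassicFilter normalization step is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stand-in for the analysis chain; not Lucene code. */
final class AnalysisChainSketch {

    /** Small illustrative stop list (an assumption, not STOP_WORDS_SET). */
    static final Set<String> SAMPLE_STOP_WORDS = Collections.unmodifiableSet(
            new HashSet<String>(Arrays.asList("a", "and", "of", "the")));

    /** Stage 1: split on runs of non-alphanumeric characters (far cruder than ClassicTokenizer). */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    /** Stages 2 and 3: lowercase each token, then drop those found in the stop set. */
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> result = new ArrayList<String>();
        for (String token : tokenize(text)) {
            String lowered = token.toLowerCase(Locale.ROOT);
            if (!stopWords.contains(lowered)) {
                result.add(lowered);
            }
        }
        return result;
    }
}
```

Lowercasing runs before the stop-word check so that a capitalized "The" still matches a lowercase stop list, which mirrors the order in which the filters are named.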
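You must specify the required Version compatibility when creating ClassicAnalyzer.
ClassicAnalyzer was named StandardAnalyzer in Lucene versions prior to 3.1. As of 3.1, StandardAnalyzer implements Unicode text segmentation, as specified by UAX#29.
Nested classes inherited from class ReusableAnalyzerBase: ReusableAnalyzerBase.TokenStreamComponents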
Modifier and Type | Field | Description |
---|---|---|
static int | DEFAULT_MAX_TOKEN_LENGTH | Default maximum allowed token length. |
static Set<?> | STOP_WORDS_SET | An unmodifiable set containing some common English words that are usually not useful for searching. |
Fields inherited from class StopwordAnalyzerBase: matchVersion, stopwords
Constructor | Description |
---|---|
ClassicAnalyzer(Version matchVersion) | Builds an analyzer with the default stop words (STOP_WORDS_SET). |
ClassicAnalyzer(Version matchVersion, File stopwords) | Deprecated. Use ClassicAnalyzer(Version, Reader) instead. |
ClassicAnalyzer(Version matchVersion, Reader stopwords) | Builds an analyzer with the stop words from the given reader. |
ClassicAnalyzer(Version matchVersion, Set<?> stopWords) | Builds an analyzer with the given stop words. |
Modifier and Type | Method | Description |
---|---|---|
protected ReusableAnalyzerBase.TokenStreamComponents | createComponents(String fieldName, Reader reader) | Creates a new ReusableAnalyzerBase.TokenStreamComponents instance for this analyzer. |
int | getMaxTokenLength() | |
void | setMaxTokenLength(int length) | Set maximum allowed token length. |
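Methods inherited from class Analyzer: close, getOffsetGap, getPositionIncrementGap, getPreviousTokenStream, setPreviousTokenStream
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from class ReusableAnalyzerBase: initReader, reusableTokenStream, tokenStream
Methods inherited from class StopwordAnalyzerBase: getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet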
public static final int DEFAULT_MAX_TOKEN_LENGTH
public static final Set<?> STOP_WORDS_SET
public ClassicAnalyzer(Version matchVersion, Set<?> stopWords)
Builds an analyzer with the given stop words.
Parameters:
matchVersion - Lucene version to match (see the Version compatibility note above)
stopWords - stop words

public ClassicAnalyzer(Version matchVersion)
Builds an analyzer with the default stop words (STOP_WORDS_SET).
Parameters:
matchVersion - Lucene version to match (see the Version compatibility note above)

@Deprecated public ClassicAnalyzer(Version matchVersion, File stopwords) throws IOException
Deprecated. Use ClassicAnalyzer(Version, Reader) instead.
Parameters:
matchVersion - Lucene version to match (see the Version compatibility note above)
stopwords - File to read stop words from
Throws:
IOException
See Also:
WordlistLoader.getWordSet(Reader, Version)

public ClassicAnalyzer(Version matchVersion, Reader stopwords) throws IOException
Builds an analyzer with the stop words from the given reader.
Parameters:
matchVersion - Lucene version to match (see the Version compatibility note above)
stopwords - Reader to read stop words from
Throws:
IOException
See Also:
WordlistLoader.getWordSet(Reader, Version)
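To make the Reader-based constructor more concrete, here is a rough JDK-only sketch of how a stop word list might be read from a Reader. It assumes a one-word-per-line format; the class `StopWordFileSketch` and the method `readStopWords` are hypothetical names, and skipping blank lines is a choice made for this sketch rather than documented behavior of WordlistLoader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper; a JDK-only stand-in, not WordlistLoader itself. */
final class StopWordFileSketch {

    /**
     * Reads one stop word per line (assumed format), trims surrounding
     * whitespace, skips blank lines, and closes the reader when done.
     */
    static Set<String> readStopWords(Reader reader) throws IOException {
        Set<String> words = new HashSet<String>();
        try (BufferedReader buffered = new BufferedReader(reader)) {
            String line;
            while ((line = buffered.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        // Return a read-only view, in the spirit of STOP_WORDS_SET being unmodifiable.
        return Collections.unmodifiableSet(words);
    }
}
```

A set built this way corresponds to what the `Set<?> stopWords` constructor accepts, which is one reason the File overload can be deprecated in favor of a Reader: the caller controls how the source is opened and decoded.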
public void setMaxTokenLength(int length)
Set maximum allowed token length.

public int getMaxTokenLength()
See Also:
setMaxTokenLength(int)
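The effect of a maximum token length can be sketched as follows. This is a hypothetical JDK-only illustration, not Lucene code: the class `MaxTokenLengthSketch` is invented here, and it assumes that tokens longer than the limit are dropped rather than truncated, which the text above does not spell out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a token length cap; not Lucene code. */
final class MaxTokenLengthSketch {
    private int maxTokenLength;

    MaxTokenLengthSketch(int maxTokenLength) {
        this.maxTokenLength = maxTokenLength;
    }

    void setMaxTokenLength(int length) {
        this.maxTokenLength = length;
    }

    int getMaxTokenLength() {
        return maxTokenLength;
    }

    /** Keeps tokens whose length is at most the cap; longer ones are dropped (assumed behavior). */
    List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<String>();
        for (String token : tokens) {
            if (token.length() <= maxTokenLength) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

In this sketch a new limit applies to the next call to `filter`; tokens already returned are unaffected.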
protected ReusableAnalyzerBase.TokenStreamComponents createComponents(String fieldName, Reader reader)
Description copied from class: ReusableAnalyzerBase
Creates a new ReusableAnalyzerBase.TokenStreamComponents instance for this analyzer.
Specified by:
createComponents in class ReusableAnalyzerBase
Parameters:
fieldName - the name of the field's content passed to the ReusableAnalyzerBase.TokenStreamComponents sink as a reader
reader - the reader passed to the Tokenizer constructor
Returns:
the ReusableAnalyzerBase.TokenStreamComponents for this analyzer.

Copyright © 2000-2018 Apache Software Foundation. All Rights Reserved.