Package org.apache.lucene.analysis.en
Class PorterStemmer
- java.lang.Object
-
- org.apache.lucene.analysis.en.PorterStemmer
-
class PorterStemmer extends java.lang.Object
Stemmer, implementing the Porter Stemming Algorithm.
The Stemmer class transforms a word into its root form. The input word can be provided a character at a time (by calling add()), or all at once by calling one of the various stem(something) methods.
-
-
Constructor Summary
Constructor Description
PorterStemmer()
-
Method Summary
Modifier and Type, Method, Description
void add(char ch)
    Add a character to the word being stemmed.
private boolean cons(int i)
private boolean cvc(int i)
private boolean doublec(int j)
private boolean ends(java.lang.String s)
char[] getResultBuffer()
    Returns a reference to a character buffer containing the results of the stemming process.
int getResultLength()
    Returns the length of the word resulting from the stemming process.
private int m()
(package private) void r(java.lang.String s)
void reset()
    reset() resets the stemmer so it can stem another word.
(package private) void setto(java.lang.String s)
boolean stem()
    Stem the word placed into the Stemmer buffer through calls to add().
boolean stem(char[] word)
    Stem a word contained in a char[].
boolean stem(char[] word, int wordLen)
    Stem a word contained in a leading portion of a char[] array.
boolean stem(char[] wordBuffer, int offset, int wordLen)
    Stem a word contained in a portion of a char[] array.
boolean stem(int i0)
java.lang.String stem(java.lang.String s)
    Stem a word provided as a String.
private void step1()
private void step2()
private void step3()
private void step4()
private void step5()
private void step6()
java.lang.String toString()
    After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer() and getResultLength() (which is generally more efficient).
private boolean vowelinstem()
-
-
-
Field Detail
-
b
private char[] b
-
i
private int i
-
j
private int j
-
k
private int k
-
k0
private int k0
-
dirty
private boolean dirty
-
INITIAL_SIZE
private static final int INITIAL_SIZE
- See Also:
- Constant Field Values
-
-
Method Detail
-
reset
public void reset()
reset() resets the stemmer so it can stem another word. If you invoke the stemmer by calling add(char) and then stem(), you must call reset() before starting another word.
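The call order described above (reset, add each character, stem, read the result) can be sketched as follows. Because PorterStemmer is package-private, this sketch uses a hypothetical stand-in class, ToyStemmer, that only mirrors the method names; its single rule (dropping one trailing 's') is a placeholder and is not the Porter algorithm. The helper stemAll is likewise an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StemmerProtocolSketch {

    /** Hypothetical stand-in, NOT Lucene's PorterStemmer: it only strips one trailing 's'. */
    static final class ToyStemmer {
        private char[] b = new char[4]; // word buffer, grown on demand
        private int i = 0;              // number of valid characters in b

        void add(char ch) {
            if (i == b.length) {
                b = Arrays.copyOf(b, b.length * 2);
            }
            b[i++] = ch;
        }

        /** Toy rule only; returns true if the word changed. */
        boolean stem() {
            if (i > 1 && b[i - 1] == 's') {
                i--;
                return true;
            }
            return false;
        }

        void reset() {
            i = 0;
        }

        @Override
        public String toString() {
            return new String(b, 0, i);
        }
    }

    /** Stems each word with one shared instance, resetting before every word. */
    static List<String> stemAll(String... words) {
        ToyStemmer stemmer = new ToyStemmer();
        List<String> out = new ArrayList<>();
        for (String word : words) {
            stemmer.reset();                 // clear state left by the previous word
            for (char ch : word.toCharArray()) {
                stemmer.add(ch);             // feed one character at a time
            }
            stemmer.stem();                  // process the buffered word
            out.add(stemmer.toString());     // read the result
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(stemAll("cats", "dog")); // prints [cat, dog]
    }
}
```

In this toy version, skipping the reset() call would make the characters of the second word append to whatever the first word left in the buffer, which is the kind of stale state the reset() requirement guards against.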
-
add
public void add(char ch)
Add a character to the word being stemmed. When you are finished adding characters, you can call stem() to process the word.
-
toString
public java.lang.String toString()
After a word has been stemmed, it can be retrieved by toString(), or a reference to the internal buffer can be retrieved by getResultBuffer() and getResultLength() (which is generally more efficient).
Overrides: toString in class java.lang.Object
-
getResultLength
public int getResultLength()
Returns the length of the word resulting from the stemming process.
-
getResultBuffer
public char[] getResultBuffer()
Returns a reference to a character buffer containing the results of the stemming process. You also need to consult getResultLength() to determine the length of the result.
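A minimal sketch of why the length must be consulted: a result buffer can be longer than the result it holds, so only the first getResultLength() characters are meaningful. The buffer contents and length below are simulated values chosen for illustration, and resultToString is a hypothetical helper, not part of this class.

```java
public class ResultBufferSketch {

    /** Hypothetical helper: copies only the valid prefix of a result buffer. */
    static String resultToString(char[] buffer, int length) {
        return new String(buffer, 0, length);
    }

    public static void main(String[] args) {
        // Simulated (assumed) state: the buffer holds 7 characters,
        // but only the first 3 are reported as the result.
        char[] buffer = {'r', 'u', 'n', 'n', 'i', 'n', 'g'};
        int length = 3;

        System.out.println(resultToString(buffer, length)); // prints "run"
        System.out.println(new String(buffer));             // prints "running" (stale tail included)
    }
}
```

Reading the buffer directly avoids the extra String allocation when the caller only needs to copy or compare characters.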
-
cons
private final boolean cons(int i)
-
m
private final int m()
-
vowelinstem
private final boolean vowelinstem()
-
doublec
private final boolean doublec(int j)
-
cvc
private final boolean cvc(int i)
-
ends
private final boolean ends(java.lang.String s)
-
setto
void setto(java.lang.String s)
-
r
void r(java.lang.String s)
-
step1
private final void step1()
-
step2
private final void step2()
-
step3
private final void step3()
-
step4
private final void step4()
-
step5
private final void step5()
-
step6
private final void step6()
-
stem
public java.lang.String stem(java.lang.String s)
Stem a word provided as a String. Returns the result as a String.
-
stem
public boolean stem(char[] word)
Stem a word contained in a char[]. Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
-
stem
public boolean stem(char[] wordBuffer, int offset, int wordLen)
Stem a word contained in a portion of a char[] array. Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
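To make the (wordBuffer, offset, wordLen) convention concrete, the sketch below shows which characters such a triple selects from a shared array, namely the wordLen characters starting at index offset. The helper wordAt and the sample buffer are assumptions for illustration only; no stemming is performed here.

```java
public class BufferPortionSketch {

    /** Hypothetical helper: returns the word selected by (offset, wordLen). */
    static String wordAt(char[] wordBuffer, int offset, int wordLen) {
        return String.valueOf(wordBuffer, offset, wordLen);
    }

    public static void main(String[] args) {
        // One shared buffer holding two words separated by a space.
        char[] shared = "stemming words".toCharArray();

        System.out.println(wordAt(shared, 0, 8)); // prints "stemming"
        System.out.println(wordAt(shared, 9, 5)); // prints "words"
    }
}
```

The two-argument overload stem(char[] word, int wordLen) corresponds to the same selection with offset 0.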
-
stem
public boolean stem(char[] word, int wordLen)
Stem a word contained in a leading portion of a char[] array. Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
-
stem
public boolean stem()
Stem the word placed into the Stemmer buffer through calls to add(). Returns true if the stemming process resulted in a word different from the input. You can retrieve the result with getResultLength()/getResultBuffer() or toString().
-
stem
public boolean stem(int i0)
-
-