Package org.apache.lucene.analysis.id
Class IndonesianStemmer
java.lang.Object
    org.apache.lucene.analysis.id.IndonesianStemmer
public class IndonesianStemmer extends java.lang.Object
Stemmer for Indonesian.
Stems Indonesian words with the algorithm presented in: A Study of Stemming Effects on Information Retrieval in Bahasa Indonesia, Fadillah Z Tala. http://www.illc.uva.nl/Publications/ResearchReports/MoL-2003-02.text.pdf
-
-
Field Summary
Fields (Modifier and Type, Field)
private int flags
private int numSyllables
private static int REMOVED_BER
private static int REMOVED_DI
private static int REMOVED_KE
private static int REMOVED_MENG
private static int REMOVED_PE
private static int REMOVED_PENG
private static int REMOVED_TER
-
Constructor Summary
Constructors
IndonesianStemmer()
-
Method Summary
Methods (Modifier and Type, Method, Description)
private boolean isVowel(char ch)
private int removeFirstOrderPrefix(char[] text, int length)
private int removeParticle(char[] text, int length)
private int removePossessivePronoun(char[] text, int length)
private int removeSecondOrderPrefix(char[] text, int length)
private int removeSuffix(char[] text, int length)
int stem(char[] text, int length, boolean stemDerivational)
    Stem a term (returning its new length).
private int stemDerivational(char[] text, int length)
-
-
-
Field Detail
-
numSyllables
private int numSyllables
-
flags
private int flags
-
REMOVED_KE
private static final int REMOVED_KE
- See Also:
- Constant Field Values
-
REMOVED_PENG
private static final int REMOVED_PENG
- See Also:
- Constant Field Values
-
REMOVED_DI
private static final int REMOVED_DI
- See Also:
- Constant Field Values
-
REMOVED_MENG
private static final int REMOVED_MENG
- See Also:
- Constant Field Values
-
REMOVED_TER
private static final int REMOVED_TER
- See Also:
- Constant Field Values
-
REMOVED_BER
private static final int REMOVED_BER
- See Also:
- Constant Field Values
-
REMOVED_PE
private static final int REMOVED_PE
- See Also:
- Constant Field Values
-
-
Method Detail
-
stem
public int stem(char[] text, int length, boolean stemDerivational)
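Stem a term in place (returning its new length). Use stemDerivational to control whether full stemming (true) or only light inflectional stemming (false) is done. Only the first "length" characters of text are considered, and only the characters before the returned length are part of the stem.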
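The calling convention of stem (mutate a char[] buffer, return the new length) can be easy to misuse, so a minimal sketch may help. The class below is a hypothetical toy, not Lucene's implementation: ToyStemmerSketch, its rule set (three particles and the single prefix "ber"), and the stemToString helper are all assumptions made for illustration. It only mirrors the shape of the signature and the light-versus-full switch.

```java
/** Hypothetical toy stemmer for illustration only; it is NOT Lucene's IndonesianStemmer. */
final class ToyStemmerSketch {
  // Assumed toy rules: three particles (inflectional) and one prefix (derivational).
  private static final String[] PARTICLES = {"lah", "kah", "pun"};
  private static final String PREFIX = "ber";

  /** Same shape as stem(char[], int, boolean): edits text in place, returns the new length. */
  int stem(char[] text, int length, boolean stemDerivational) {
    length = removeParticle(text, length);
    if (stemDerivational) {
      length = removePrefix(text, length);
    }
    return length;
  }

  // Dropping a suffix needs no copying: the caller simply ignores the tail.
  private int removeParticle(char[] text, int length) {
    for (String p : PARTICLES) {
      int start = length - p.length();
      // Keep at least two characters of the word.
      if (start >= 2 && matches(text, start, p)) {
        return start;
      }
    }
    return length;
  }

  // Dropping a prefix shifts the remaining characters to the front of the buffer.
  private int removePrefix(char[] text, int length) {
    int rest = length - PREFIX.length();
    if (rest >= 2 && matches(text, 0, PREFIX)) {
      System.arraycopy(text, PREFIX.length(), text, 0, rest);
      return rest;
    }
    return length;
  }

  private static boolean matches(char[] text, int offset, String s) {
    for (int i = 0; i < s.length(); i++) {
      if (text[offset + i] != s.charAt(i)) {
        return false;
      }
    }
    return true;
  }

  /** Hypothetical helper: copies the word into a buffer and keeps only the stemmed part. */
  static String stemToString(String word, boolean stemDerivational) {
    char[] buffer = word.toCharArray();
    int newLength = new ToyStemmerSketch().stem(buffer, buffer.length, stemDerivational);
    // Characters at or beyond newLength are stale and must be ignored.
    return new String(buffer, 0, newLength);
  }

  public static void main(String[] args) {
    System.out.println(stemToString("bermainlah", false)); // prints "bermain"
    System.out.println(stemToString("bermainlah", true));  // prints "main"
  }
}
```

The point to carry over to the real class is the last step of stemToString: the buffer is not truncated, so the returned length is the only reliable indication of where the stem ends.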
-
stemDerivational
private int stemDerivational(char[] text, int length)
-
isVowel
private boolean isVowel(char ch)
-
removeParticle
private int removeParticle(char[] text, int length)
-
removePossessivePronoun
private int removePossessivePronoun(char[] text, int length)
-
removeFirstOrderPrefix
private int removeFirstOrderPrefix(char[] text, int length)
-
removeSecondOrderPrefix
private int removeSecondOrderPrefix(char[] text, int length)
-
removeSuffix
private int removeSuffix(char[] text, int length)
-
-