Package org.apache.lucene.analysis.lv
Class LatvianStemmer
- java.lang.Object
  - org.apache.lucene.analysis.lv.LatvianStemmer
public class LatvianStemmer extends java.lang.Object
Light stemmer for Latvian. This is a light version of the algorithm in Karlis Kreslin's PhD thesis "A stemming algorithm for Latvian", with the following modifications:
- Only explicitly stems noun and adjective morphology
- Stricter length/vowel checks for the resulting stems (suffix stripping for verbs and other parts of speech is removed)
- Removes only the primary inflectional suffixes: case and number for nouns; case, number, gender, and definiteness for adjectives.
- Palatalization is only handled when a declension II, V, or VI noun suffix is removed.
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
- (package private) static class LatvianStemmer.Affix
-
Field Summary
Fields (Modifier and Type, Field, Description):
- (package private) static LatvianStemmer.Affix[] affixes
-
Constructor Summary
Constructors (Constructor, Description):
- LatvianStemmer()
-
Method Summary
Methods (Modifier and Type, Method, Description):
- private int numVowels(char[] s, int len): Counts the vowels in the string. At least one vowel must remain in the stem for it to be accepted.
- int stem(char[] s, int len): Stems a Latvian word.
- private int unpalatalize(char[] s, int len): Most cases are handled except for the ambiguous ones: s -> š, t -> š, d -> ž, z -> ž.
-
Field Detail
-
affixes
static final LatvianStemmer.Affix[] affixes
-
Method Detail
-
stem
public int stem(char[] s, int len)
Stems a Latvian word and returns the new, adjusted length.
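The signature suggests the usual buffer-plus-length contract: the caller passes a char[] and the number of valid characters, and reads the result back using the returned length. The following is a minimal sketch of that calling pattern only. It assumes the stem occupies the first returned-length characters of the same buffer. CharStemmer, TOY, and apply are hypothetical names introduced for illustration, and the toy rule is not the Latvian algorithm.

```java
public class StemContractDemo {
    /** Hypothetical stand-in with the same shape as stem(char[] s, int len). */
    interface CharStemmer {
        int stem(char[] s, int len);
    }

    /** Toy rule for illustration only (NOT the Latvian algorithm):
     *  drop one trailing 's' when the word has more than 3 characters. */
    static final CharStemmer TOY =
        (s, len) -> (len > 3 && s[len - 1] == 's') ? len - 1 : len;

    /** Copies the word into a buffer, stems it, and rebuilds a String
     *  from the first newLen characters of that buffer. */
    static String apply(CharStemmer stemmer, String word) {
        char[] buf = word.toCharArray();
        int newLen = stemmer.stem(buf, buf.length);
        return new String(buf, 0, newLen);
    }

    public static void main(String[] args) {
        System.out.println(apply(TOY, "koks")); // trailing 's' dropped
        System.out.println(apply(TOY, "es"));   // too short, left unchanged
    }
}
```

With a real LatvianStemmer instance, the same pattern would apply by calling its stem(char[] s, int len) method in place of the toy rule.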
-
unpalatalize
private int unpalatalize(char[] s, int len)
Most cases are handled except for the ambiguous ones:
- s -> š
- t -> š
- d -> ž
- z -> ž
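To see why these four cases are ambiguous, it can help to invert the mappings. The sketch below assumes each arrow reads "base consonant -> palatalized consonant"; under that reading, š and ž each have two possible sources, so the original consonant cannot be recovered from the palatalized letter alone. The class and method names are hypothetical and not part of LatvianStemmer.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class PalatalizationAmbiguity {
    // The four listed pairs as {base, palatalized}.
    // \u0161 is the letter s with caron, \u017e is the letter z with caron.
    static final char[][] FORWARD = {
        {'s', '\u0161'}, {'t', '\u0161'}, {'d', '\u017e'}, {'z', '\u017e'}
    };

    /** Groups the base consonants by the palatalized letter they produce. */
    static Map<Character, Set<Character>> invert() {
        Map<Character, Set<Character>> inverse = new TreeMap<>();
        for (char[] pair : FORWARD) {
            inverse.computeIfAbsent(pair[1], k -> new TreeSet<>()).add(pair[0]);
        }
        return inverse;
    }

    public static void main(String[] args) {
        // Each palatalized letter maps back to two candidates.
        invert().forEach((palatalized, sources) ->
            System.out.println(palatalized + " <- " + sources));
    }
}
```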
-
numVowels
private int numVowels(char[] s, int len)
Counts the vowels in the string. At least one vowel must remain in the stem for it to be accepted.
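A check of this kind might look like the following minimal sketch. It assumes the vowel set is a, e, i, o, u plus the long vowels ā, ē, ī, ū. The class, the countVowels and acceptable helpers, and the exact vowel set are illustrative assumptions, not the library's code.

```java
public class VowelCountSketch {
    // Assumed vowel set: a, e, i, o, u and the long vowels
    // a-macron, e-macron, i-macron, u-macron (as Unicode escapes).
    private static final String VOWELS = "aeiou\u0101\u0113\u012b\u016b";

    /** Counts vowels among the first len characters of s. */
    static int countVowels(char[] s, int len) {
        int n = 0;
        for (int i = 0; i < len; i++) {
            if (VOWELS.indexOf(s[i]) >= 0) {
                n++;
            }
        }
        return n;
    }

    /** A candidate stem of length stemLen is acceptable only if it keeps a vowel. */
    static boolean acceptable(char[] s, int stemLen) {
        return countVowels(s, stemLen) >= 1;
    }

    public static void main(String[] args) {
        char[] word = "koks".toCharArray();
        System.out.println(countVowels(word, word.length)); // one vowel: 'o'
        System.out.println(acceptable(word, 1));            // "k" has no vowel
    }
}
```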
-