Class FuzzyLikeThisQuery
java.lang.Object
  org.apache.lucene.search.Query
    org.apache.lucene.sandbox.queries.FuzzyLikeThisQuery
public class FuzzyLikeThisQuery extends Query
Fuzzifies ALL terms provided as strings and then picks the best n differentiating terms. In effect this mixes the behaviour of FuzzyQuery and MoreLikeThis but with special consideration of fuzzy scoring factors. This generally produces good results for queries where users may provide details in a number of fields, have no knowledge of boolean query syntax, and want both a degree of fuzzy matching and a fast query.
For each source term the fuzzy variants are held in a BooleanQuery with no coord factor (because we are not looking for matches on multiple variants in any one doc). Additionally, a specialized TermQuery is used for variants and does not use that variant term's IDF, because this would favour rarer terms, e.g. misspellings. Instead, all variants use the same IDF ranking (the one for the source query term) and this is factored into the variant's boost. If the source query term does not exist in the index, the average IDF of the variants is used.
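The IDF rule described above can be illustrated with a small standalone sketch. It is not the class's implementation: the class name SharedIdfSketch, its method names, and the classic TF-IDF formula used for idf are assumptions made for illustration, and the fallback follows the wording of the description literally (an average over the variants' IDF values).

```java
import java.util.List;

/** Illustrative sketch only; names and formula are assumptions, not Lucene code. */
final class SharedIdfSketch {

    /** Classic TF-IDF style inverse document frequency (assumed formula). */
    static double idf(long docFreq, long docCount) {
        return 1.0 + Math.log((docCount + 1.0) / (docFreq + 1.0));
    }

    /**
     * Returns the single IDF value applied to every fuzzy variant of one source term.
     *
     * @param sourceDocFreq   document frequency of the source term (0 if not indexed)
     * @param variantDocFreqs document frequency of each fuzzy variant
     * @param docCount        number of documents in the index
     */
    static double sharedIdf(long sourceDocFreq, List<Long> variantDocFreqs, long docCount) {
        if (sourceDocFreq > 0) {
            // The source term exists in the index: every variant reuses its IDF,
            // so a rare misspelling is not ranked above the intended term.
            return idf(sourceDocFreq, docCount);
        }
        // The source term is absent: fall back to the average IDF of the variants.
        return variantDocFreqs.stream()
                .mapToDouble(df -> idf(df, docCount))
                .average()
                .orElse(0.0); // no variants at all: nothing to score
    }
}
```

Because every variant of a source term receives the same value, differences between variants come only from their boost, which matches the intent stated in the description.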
-
-
Nested Class Summary
Nested Classes
(package private) static class FuzzyLikeThisQuery.FieldVals
private static class FuzzyLikeThisQuery.ScoreTerm
private static class FuzzyLikeThisQuery.ScoreTermQueue
-
Field Summary
Fields
(package private) Analyzer analyzer
(package private) java.util.ArrayList<FuzzyLikeThisQuery.FieldVals> fieldVals
(package private) boolean ignoreTF
(package private) int MAX_VARIANTS_PER_TERM
private int maxNumTerms
(package private) static TFIDFSimilarity sim
-
Constructor Summary
Constructors
FuzzyLikeThisQuery(int maxNumTerms, Analyzer analyzer)
-
Method Summary
Methods
void addTerms(java.lang.String queryString, java.lang.String fieldName, float minSimilarity, int prefixLength)
    Adds user input for "fuzzification".
private void addTerms(IndexReader reader, FuzzyLikeThisQuery.FieldVals f, FuzzyLikeThisQuery.ScoreTermQueue q)
boolean equals(java.lang.Object other)
    Override and implement query instance equivalence properly in a subclass.
private boolean equalsTo(FuzzyLikeThisQuery other)
int hashCode()
    Override and implement query hash code properly in a subclass.
boolean isIgnoreTF()
private Query newTermQuery(IndexReader reader, Term term)
Query rewrite(IndexReader reader)
    Expert: called to re-write queries into primitive queries.
void setIgnoreTF(boolean ignoreTF)
java.lang.String toString(java.lang.String field)
    Prints a query to a string, with field assumed to be the default field and omitted.
void visit(QueryVisitor visitor)
    Recurse through the query tree, visiting any child queries.
-
Methods inherited from class org.apache.lucene.search.Query
classHash, createWeight, sameClassAs, toString
-
Field Detail
-
sim
static TFIDFSimilarity sim
-
fieldVals
java.util.ArrayList<FuzzyLikeThisQuery.FieldVals> fieldVals
-
analyzer
Analyzer analyzer
-
MAX_VARIANTS_PER_TERM
int MAX_VARIANTS_PER_TERM
-
ignoreTF
boolean ignoreTF
-
maxNumTerms
private int maxNumTerms
-
-
Constructor Detail
-
FuzzyLikeThisQuery
public FuzzyLikeThisQuery(int maxNumTerms, Analyzer analyzer)
- Parameters:
maxNumTerms - The total number of term clauses that will appear once rewritten as a BooleanQuery
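How a cap such as maxNumTerms can bound the number of clauses is sketched below with a bounded priority queue. This is a minimal illustration only: TopTermsSketch and its ScoredTerm type are hypothetical stand-ins, not the class's private ScoreTerm or ScoreTermQueue, whose internals are not documented here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative sketch only; all names here are hypothetical. */
final class TopTermsSketch {

    /** Hypothetical stand-in for a candidate term and its score. */
    static final class ScoredTerm {
        final String text;
        final float score;

        ScoredTerm(String text, float score) {
            this.text = text;
            this.score = score;
        }
    }

    /** Orders terms from lowest to highest score. */
    private static final Comparator<ScoredTerm> BY_SCORE =
            Comparator.comparingDouble(t -> t.score);

    /** Keeps at most maxNumTerms candidates, returned best first. */
    static List<ScoredTerm> topN(List<ScoredTerm> candidates, int maxNumTerms) {
        // Min-heap: the head is always the weakest term currently kept.
        PriorityQueue<ScoredTerm> queue = new PriorityQueue<>(BY_SCORE);
        for (ScoredTerm candidate : candidates) {
            queue.offer(candidate);
            if (queue.size() > maxNumTerms) {
                queue.poll(); // evict the weakest term once the cap is exceeded
            }
        }
        List<ScoredTerm> best = new ArrayList<>(queue);
        best.sort(BY_SCORE.reversed());
        return best;
    }
}
```

Each surviving term would then become one clause of the rewritten query, so the clause count never exceeds the cap regardless of how many variants were generated.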
-
-
Method Detail
-
hashCode
public int hashCode()
Description copied from class: Query
Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
- Specified by:
hashCode in class Query
- See Also:
Query.equals(Object)
-
equals
public boolean equals(java.lang.Object other)
Description copied from class: Query
Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly.
Typically a query will be equal to another only if it's an instance of the same class and its document-filtering properties are identical to those of that other instance. Utility methods are provided for certain repetitive code.
- Specified by:
equals in class Query
- See Also:
Query.sameClassAs(Object), Query.classHash()
-
equalsTo
private boolean equalsTo(FuzzyLikeThisQuery other)
-
addTerms
public void addTerms(java.lang.String queryString, java.lang.String fieldName, float minSimilarity, int prefixLength)
Adds user input for "fuzzification".
- Parameters:
queryString - The string which will be parsed by the analyzer and for which fuzzy variants will be generated
minSimilarity - The minimum similarity of the term variants; must be 0, 1 or 2 (see FuzzyTermsEnum)
prefixLength - Length of required common prefix on variant terms (see FuzzyTermsEnum)
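The effect of the two numeric parameters can be illustrated with a naive filter over a list of candidate terms. This sketch makes several assumptions: it reads the 0, 1 or 2 value as a maximum edit count, it uses plain Levenshtein distance, and it scans an in-memory list. FuzzyTermsEnum works differently internally; FuzzyVariantSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not how FuzzyTermsEnum enumerates terms. */
final class FuzzyVariantSketch {

    /** Plain Levenshtein distance (insertions, deletions, substitutions). */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the candidate terms that share the first prefixLength characters
     * of the source term and lie within maxEdits edits of it.
     */
    static List<String> variants(String source, List<String> candidates,
                                 int maxEdits, int prefixLength) {
        String prefix = source.substring(0, Math.min(prefixLength, source.length()));
        List<String> matches = new ArrayList<>();
        for (String candidate : candidates) {
            if (candidate.startsWith(prefix) && editDistance(source, candidate) <= maxEdits) {
                matches.add(candidate);
            }
        }
        return matches;
    }
}
```

A longer required prefix narrows the set of candidates that need a distance check, which is why prefixLength is commonly used to trade recall for speed.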
-
addTerms
private void addTerms(IndexReader reader, FuzzyLikeThisQuery.FieldVals f, FuzzyLikeThisQuery.ScoreTermQueue q) throws java.io.IOException
- Throws:
java.io.IOException
-
newTermQuery
private Query newTermQuery(IndexReader reader, Term term) throws java.io.IOException
- Throws:
java.io.IOException
-
visit
public void visit(QueryVisitor visitor)
Description copied from class: Query
Recurse through the query tree, visiting any child queries.
-
rewrite
public Query rewrite(IndexReader reader) throws java.io.IOException
Description copied from class: Query
Expert: called to re-write queries into primitive queries. For example, a PrefixQuery will be rewritten into a BooleanQuery that consists of TermQuerys.
Callers are expected to call rewrite multiple times if necessary, until the rewritten query is the same as the original query.
- Overrides:
rewrite in class Query
- Throws:
java.io.IOException
- See Also:
IndexSearcher.rewrite(Query)
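The "call rewrite until nothing changes" contract amounts to a fixed-point loop. The sketch below shows the shape of such a loop using a hypothetical Rewritable interface in place of Query, so that it compiles without Lucene on the classpath; the Primitive and Wrapper types exist only for the demonstration.

```java
/** Illustrative sketch only; Rewritable is a hypothetical stand-in for Query. */
final class RewriteLoopSketch {

    /** Minimal stand-in: rewrite returns a simpler query, or this when already primitive. */
    interface Rewritable {
        Rewritable rewrite();
    }

    /** Rewrites repeatedly until a rewrite returns the very same instance. */
    static Rewritable rewriteFully(Rewritable query) {
        Rewritable current = query;
        Rewritable next = current.rewrite();
        while (next != current) { // identity check: no change means we are done
            current = next;
            next = current.rewrite();
        }
        return current;
    }

    /** A query that is already primitive and rewrites to itself. */
    static final class Primitive implements Rewritable {
        @Override
        public Rewritable rewrite() {
            return this;
        }
    }

    /** A query that needs one rewrite step to expose the query it wraps. */
    static final class Wrapper implements Rewritable {
        private final Rewritable inner;

        Wrapper(Rewritable inner) {
            this.inner = inner;
        }

        @Override
        public Rewritable rewrite() {
            return inner;
        }
    }
}
```

In application code this loop normally does not need to be written by hand, since IndexSearcher.rewrite(Query), referenced above, is the entry point for rewriting a query.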
-
toString
public java.lang.String toString(java.lang.String field)
Description copied from class: Query
Prints a query to a string, with field assumed to be the default field and omitted.
-
isIgnoreTF
public boolean isIgnoreTF()
-
setIgnoreTF
public void setIgnoreTF(boolean ignoreTF)
-
-