Class LuceneLevenshteinDistance

  • All Implemented Interfaces:
    StringDistance

    public final class LuceneLevenshteinDistance
    extends java.lang.Object
    implements StringDistance
    Damerau-Levenshtein distance (optimal string alignment), implemented consistently with Lucene's FuzzyTermsEnum when the transpositions option is enabled.

    Notes:

    • This metric treats full Unicode code points as characters.
    • This metric scales raw edit distances into a floating-point score based on the shorter of the two terms.
    • Transpositions of two adjacent codepoints are treated as primitive edits.
    • Edits are applied in parallel: for example, "ab" and "bca" have distance 3.
    NOTE: this class is not particularly efficient. It is only intended for merging results from multiple DirectSpellCheckers.
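    The notes above can be illustrated with a minimal, self-contained sketch of the raw optimal string alignment edit count over code points. It is not the class's source: OsaSketch and rawDistance are hypothetical names, and the sketch only mirrors the behavior the notes describe (code points as characters, adjacent transpositions as one edit, and no substring edited more than once, which is why "ab" and "bca" come out at 3 rather than 2).

```java
// Illustrative sketch only: OsaSketch and rawDistance are hypothetical names,
// not part of Lucene.
final class OsaSketch {

    /** Raw optimal-string-alignment edit count over Unicode code points. */
    static int rawDistance(String a, String b) {
        // One array entry per code point, not per UTF-16 code unit.
        int[] s = a.codePoints().toArray();
        int[] t = b.codePoints().toArray();
        int[][] d = new int[s.length + 1][t.length + 1];

        for (int i = 0; i <= s.length; i++) d[i][0] = i; // delete every code point of s
        for (int j = 0; j <= t.length; j++) d[0][j] = j; // insert every code point of t

        for (int i = 1; i <= s.length; i++) {
            for (int j = 1; j <= t.length; j++) {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1,      // deletion
                                 d[i][j - 1] + 1),     // insertion
                        d[i - 1][j - 1] + cost);       // match or substitution
                // Swapping two adjacent code points counts as a single edit.
                if (i > 1 && j > 1 && s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1]) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[s.length][t.length];
    }

    public static void main(String[] args) {
        System.out.println(rawDistance("ab", "ba"));  // 1
        System.out.println(rawDistance("ab", "bca")); // 3
    }
}
```

    The full m-by-n table keeps the sketch easy to follow; it is quadratic in space, which is acceptable for short terms.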
    • Constructor Summary

      Constructors 
      Constructor Description
      LuceneLevenshteinDistance()
      Creates a new comparator, mimicking the behavior of Lucene's internal edit distance.
    • Method Summary

      Modifier and Type Method Description
      boolean equals(java.lang.Object obj)
      float getDistance(java.lang.String target, java.lang.String other)
      Returns a float between 0 and 1 based on how similar the specified strings are to one another.
      int hashCode()  
      private static IntsRef toIntsRef(java.lang.String s)
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • LuceneLevenshteinDistance

        public LuceneLevenshteinDistance()
        Creates a new comparator, mimicking the behavior of Lucene's internal edit distance.
    • Method Detail

      • getDistance

        public float getDistance(java.lang.String target,
                                 java.lang.String other)
        Description copied from interface: StringDistance
        Returns a float between 0 and 1 based on how similar the specified strings are to one another. A value of 1 means the specified strings are identical, and 0 means the strings are maximally different.
        Specified by:
        getDistance in interface StringDistance
        Parameters:
        target - The first string.
        other - The second string.
        Returns:
        a float between 0 and 1 based on how similar the specified strings are to one another.
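        As a rough illustration of how a raw edit count might map onto this similarity score, the sketch below divides the edit count by the length of the shorter term and subtracts the result from 1. The formula is an assumption drawn from the class notes, not a statement of the actual implementation; ScoreSketch and toScore are hypothetical names, and the handling of empty terms is chosen only for this sketch.

```java
// Illustrative sketch only: ScoreSketch and toScore are hypothetical names,
// and the scaling formula is an assumption based on the class notes.
final class ScoreSketch {

    /** Scales a raw edit count by the shorter term's length, measured in code points. */
    static float toScore(int rawEdits, int lengthA, int lengthB) {
        int shorter = Math.min(lengthA, lengthB);
        if (shorter == 0) {
            // Assumption for this sketch: two empty terms are identical,
            // an empty and a non-empty term are maximally different.
            return (lengthA == lengthB) ? 1.0f : 0.0f;
        }
        return 1.0f - ((float) rawEdits / shorter);
    }

    public static void main(String[] args) {
        System.out.println(toScore(0, 5, 5)); // 1.0
        System.out.println(toScore(1, 2, 2)); // 0.5
    }
}
```

        In this sketch the result falls below 0 whenever the edit count exceeds the shorter term's length; this page does not say whether the class clamps such values.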
      • toIntsRef

        private static IntsRef toIntsRef(java.lang.String s)
      • equals

        public boolean equals(java.lang.Object obj)
        Overrides:
        equals in class java.lang.Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object