Class CosineSimilarity


  • public class CosineSimilarity
    extends java.lang.Object
    Measures the cosine similarity of two vectors of an inner product space, that is, the cosine of the angle between them.

    For further explanation about the Cosine Similarity, refer to http://en.wikipedia.org/wiki/Cosine_similarity.

    Since:
    1.0
    • Method Summary

      Modifier and Type Method Description
      java.lang.Double cosineSimilarity(java.util.Map<java.lang.CharSequence,java.lang.Integer> leftVector, java.util.Map<java.lang.CharSequence,java.lang.Integer> rightVector)
      Calculates the cosine similarity for two given vectors.
      private double dot(java.util.Map<java.lang.CharSequence,java.lang.Integer> leftVector, java.util.Map<java.lang.CharSequence,java.lang.Integer> rightVector, java.util.Set<java.lang.CharSequence> intersection)
      Computes the dot product of two vectors.
      private java.util.Set<java.lang.CharSequence> getIntersection(java.util.Map<java.lang.CharSequence,java.lang.Integer> leftVector, java.util.Map<java.lang.CharSequence,java.lang.Integer> rightVector)
      Returns the set of keys common to the two given maps.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • CosineSimilarity

        public CosineSimilarity()
    • Method Detail

      • cosineSimilarity

        public java.lang.Double cosineSimilarity(java.util.Map<java.lang.CharSequence,java.lang.Integer> leftVector,
                                                 java.util.Map<java.lang.CharSequence,java.lang.Integer> rightVector)
        Calculates the cosine similarity for two given vectors.
        Parameters:
        leftVector - left vector
        rightVector - right vector
        Returns:
        cosine similarity between the two vectors
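        The calculation can be sketched in plain Java as the dot product over the shared keys divided by the product of the two vector lengths. This is an illustration only, not the library's source: the class CosineSimilaritySketch, the helpers cosine and norm, and the choice to return 0 for a zero-length vector are all assumptions made for the example.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch class; not part of the documented API.
final class CosineSimilaritySketch {

    // cosine = dot(left, right) / (|left| * |right|)
    static double cosine(Map<CharSequence, Integer> left, Map<CharSequence, Integer> right) {
        // Keys present in both vectors; only these contribute to the dot product.
        Set<CharSequence> common = new HashSet<>(left.keySet());
        common.retainAll(right.keySet());

        double dot = 0.0;
        for (CharSequence key : common) {
            int l = left.get(key);
            int r = right.get(key);
            dot += (double) l * r;
        }

        double normLeft = norm(left);
        double normRight = norm(right);
        // Assumption for this sketch: a zero-length vector yields similarity 0.
        if (normLeft == 0.0 || normRight == 0.0) {
            return 0.0;
        }
        return dot / (normLeft * normRight);
    }

    // Euclidean length of a vector stored as key -> count.
    static double norm(Map<CharSequence, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int value : vector.values()) {
            sumOfSquares += (double) value * value;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> left = Map.of("apple", 1, "banana", 1);
        Map<CharSequence, Integer> right = Map.of("banana", 1, "cherry", 1);
        System.out.println(cosine(left, right));
    }
}
```

        In the sample vectors only "banana" is shared, so the dot product is 1 and each vector has length sqrt(2), giving a similarity of approximately 0.5.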
      • dot

        private double dot(java.util.Map<java.lang.CharSequence,java.lang.Integer> leftVector,
                           java.util.Map<java.lang.CharSequence,java.lang.Integer> rightVector,
                           java.util.Set<java.lang.CharSequence> intersection)
        Computes the dot product of two vectors, using only the keys in intersection. Elements whose keys are not shared by both vectors are ignored, so if one vector has more entries than the other, only its shared subset contributes to the dot product.
        Parameters:
        leftVector - left vector
        rightVector - right vector
        intersection - common elements
        Returns:
        the dot product
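        A minimal sketch of a dot product restricted to the common keys follows. Because dot is private, this is a guess at the behavior described above rather than the actual implementation; the class DotSketch and its accumulation in a long are assumptions.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch class; not part of the documented API.
final class DotSketch {

    // Sums left[key] * right[key] over the keys in intersection only.
    // Keys outside the intersection never contribute.
    static double dot(Map<CharSequence, Integer> left,
                      Map<CharSequence, Integer> right,
                      Set<CharSequence> intersection) {
        long sum = 0;
        for (CharSequence key : intersection) {
            int l = left.get(key);
            int r = right.get(key);
            sum += (long) l * r; // widen first to avoid int overflow
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> left = Map.of("a", 2, "b", 3, "c", 5);
        Map<CharSequence, Integer> right = Map.of("b", 4, "c", 1, "d", 7);
        Set<CharSequence> intersection = Set.of("b", "c");
        // "a" and "d" are ignored: 3*4 + 5*1 = 17
        System.out.println(dot(left, right, intersection));
    }
}
```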
      • getIntersection

        private java.util.Set<java.lang.CharSequence> getIntersection(java.util.Map<java.lang.CharSequence,java.lang.Integer> leftVector,
                                                                      java.util.Map<java.lang.CharSequence,java.lang.Integer> rightVector)
        Returns the set of keys common to the two given maps.
        Parameters:
        leftVector - left vector map
        rightVector - right vector map
        Returns:
        the keys present in both maps
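        One way to compute such an intersection is to copy one key set and retain only the keys found in the other. The sketch below assumes that approach; the class IntersectionSketch and the method commonKeys are hypothetical names, and the private getIntersection may be written differently.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch class; not part of the documented API.
final class IntersectionSketch {

    // Returns the keys that appear in both maps.
    static Set<CharSequence> commonKeys(Map<CharSequence, Integer> left,
                                        Map<CharSequence, Integer> right) {
        // Copy first: keySet() is a view, and retainAll on it would modify the map.
        Set<CharSequence> common = new HashSet<>(left.keySet());
        common.retainAll(right.keySet());
        return common;
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> left = Map.of("a", 1, "b", 2);
        Map<CharSequence, Integer> right = Map.of("b", 3, "c", 4);
        System.out.println(commonKeys(left, right)); // prints [b]
    }
}
```

        Key matching here relies on equals and hashCode, so a String and a StringBuilder holding the same characters are treated as different keys; using String keys throughout avoids that surprise.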