Class ICUNormalizer2CharFilter

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, java.lang.Readable

    public final class ICUNormalizer2CharFilter
    extends BaseCharFilter
    Normalizes the text of the wrapped Reader with ICU's Normalizer2 before it reaches the tokenizer.
    • Field Detail

      • normalizer

        private final com.ibm.icu.text.Normalizer2 normalizer
      • inputBuffer

        private final java.lang.StringBuilder inputBuffer
      • resultBuffer

        private final java.lang.StringBuilder resultBuffer
      • inputFinished

        private boolean inputFinished
      • afterQuickCheckYes

        private boolean afterQuickCheckYes
      • checkedInputBoundary

        private int checkedInputBoundary
      • charCount

        private int charCount
    • Constructor Detail

      • ICUNormalizer2CharFilter

        public ICUNormalizer2CharFilter(java.io.Reader in)
        Creates a new ICUNormalizer2CharFilter that applies NFKC normalization, case folding, and removal of default ignorable code points (NFKC_Casefold).
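
To give a feel for what NFKC_Casefold does to text, here is a minimal JDK-only sketch. It is an approximation, not ICU's implementation: `java.text.Normalizer` stands in for ICU's `Normalizer2`, lowercasing stands in for real case folding (the two differ for some characters), and only one sample default ignorable code point, the soft hyphen U+00AD, is removed. The class and method names are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

public class NfkcCasefoldSketch {

    // Rough stand-in for NFKC_Casefold using only the JDK (assumption:
    // ICU's real mapping covers many more cases than this sketch).
    public static String approximateNfkcCasefold(String text) {
        // NFKC: compatibility decomposition followed by canonical composition.
        String nfkc = Normalizer.normalize(text, Normalizer.Form.NFKC);
        // Lowercasing only approximates case folding.
        String lowered = nfkc.toLowerCase(Locale.ROOT);
        // Remove a single sample default ignorable: the soft hyphen U+00AD.
        return lowered.replace("\u00AD", "");
    }

    public static void main(String[] args) {
        // Fullwidth Latin letters map to ASCII under NFKC, then get lowercased.
        System.out.println(approximateNfkcCasefold("\uFF28\uFF25\uFF2C\uFF2C\uFF2F")); // hello
        // The "fi" ligature U+FB01 decomposes to the two letters "fi".
        System.out.println(approximateNfkcCasefold("\uFB01le")); // file
        // The soft hyphen is dropped.
        System.out.println(approximateNfkcCasefold("co\u00ADop")); // coop
    }
}
```

With Lucene's ICU analysis module and ICU4J on the classpath, the filter itself would perform the full mapping on the character stream, so a helper like this is only useful for illustration.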
      • ICUNormalizer2CharFilter

        public ICUNormalizer2CharFilter(java.io.Reader in,
                                        com.ibm.icu.text.Normalizer2 normalizer)
        Creates a new ICUNormalizer2CharFilter that uses the specified Normalizer2.
        Parameters:
        in - the Reader supplying the text to normalize
        normalizer - the Normalizer2 instance to apply
      • ICUNormalizer2CharFilter

        ICUNormalizer2CharFilter(java.io.Reader in,
                                 com.ibm.icu.text.Normalizer2 normalizer,
                                 int bufferSize)
    • Method Detail

      • read

        public int read(char[] cbuf,
                        int off,
                        int len)
                 throws java.io.IOException
        Specified by:
        read in class java.io.Reader
        Throws:
        java.io.IOException
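
Because the filter is a `java.io.Reader`, callers consume it with the usual `read(char[], int, int)` loop, which returns the number of characters read or -1 at end of stream. The sketch below shows that loop with a plain `StringReader` standing in for the filter so that it runs on the JDK alone. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadLoopSketch {

    // Reads every character from the given Reader into a String.
    public static String drain(Reader reader) throws IOException {
        StringBuilder out = new StringBuilder();
        // Deliberately small buffer so the loop runs more than once.
        char[] cbuf = new char[8];
        int n;
        // read(...) may return fewer characters than requested; -1 means end of stream.
        while ((n = reader.read(cbuf, 0, cbuf.length)) != -1) {
            out.append(cbuf, 0, n);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: with Lucene's ICU analysis module and ICU4J on the
        // classpath, the reader below would instead be
        //   new ICUNormalizer2CharFilter(new StringReader(text))
        // and drain(...) would return the normalized text.
        try (Reader reader = new StringReader("some text longer than the buffer")) {
            System.out.println(drain(reader));
        }
    }
}
```

In an analysis chain the tokenizer performs this read loop itself, so application code normally only constructs the filter and passes it on.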
      • readInputToBuffer

        private void readInputToBuffer()
                                throws java.io.IOException
        Throws:
        java.io.IOException
      • readAndNormalizeFromInput

        private int readAndNormalizeFromInput()
      • readFromInputWhileSpanQuickCheckYes

        private int readFromInputWhileSpanQuickCheckYes()
      • readFromIoNormalizeUptoBoundary

        private int readFromIoNormalizeUptoBoundary()
      • normalizeInputUpto

        private int normalizeInputUpto(int length)
      • recordOffsetDiff

        private void recordOffsetDiff(int inputLength,
                                      int outputLength)
      • outputFromResultBuffer

        private int outputFromResultBuffer(char[] cbuf,
                                           int begin,
                                           int len)