Class BinaryDictionary

    • Field Detail

      • DICT_FILENAME_SUFFIX

        public static final java.lang.String DICT_FILENAME_SUFFIX
        See Also:
        Constant Field Values
      • TARGETMAP_FILENAME_SUFFIX

        public static final java.lang.String TARGETMAP_FILENAME_SUFFIX
        See Also:
        Constant Field Values
      • POSDICT_FILENAME_SUFFIX

        public static final java.lang.String POSDICT_FILENAME_SUFFIX
        See Also:
        Constant Field Values
      • TARGETMAP_HEADER

        public static final java.lang.String TARGETMAP_HEADER
        See Also:
        Constant Field Values
      • buffer

        private final java.nio.ByteBuffer buffer
      • targetMapOffsets

        private final int[] targetMapOffsets
      • targetMap

        private final int[] targetMap
      • posDict

        private final java.lang.String[] posDict
      • inflTypeDict

        private final java.lang.String[] inflTypeDict
      • inflFormDict

        private final java.lang.String[] inflFormDict
      • HAS_BASEFORM

        public static final int HAS_BASEFORM
        Flag indicating that the entry has base form data. Otherwise the entry is not inflected, and its base form is the same as the surface form.
        See Also:
        Constant Field Values
      • HAS_READING

        public static final int HAS_READING
        Flag indicating that the entry has reading data. Otherwise the reading is the surface form converted to katakana.
        See Also:
        Constant Field Values
      • HAS_PRONUNCIATION

        public static final int HAS_PRONUNCIATION
        Flag indicating that the entry has pronunciation data. Otherwise the pronunciation is the same as the reading.
        See Also:
        Constant Field Values
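
        The fallback rules described by the three flags above can be sketched as follows. This is only an illustration, not the class's actual implementation: the bit values 1, 2 and 4 are assumptions made for the example (the real values are listed under Constant Field Values), and FlagFallbackSketch, toKatakana, baseForm, reading and pronunciation are hypothetical names. The hiragana-to-katakana step relies on the Unicode layout, where katakana code points sit 0x60 above the corresponding hiragana.

```java
class FlagFallbackSketch {
    // Assumed bit values, for illustration only.
    static final int HAS_BASEFORM = 1;
    static final int HAS_READING = 2;
    static final int HAS_PRONUNCIATION = 4;

    /** Maps hiragana (U+3041..U+3096) to katakana; other characters pass through. */
    static String toKatakana(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(c >= 0x3041 && c <= 0x3096 ? (char) (c + 0x60) : c);
        }
        return sb.toString();
    }

    /** Returns the stored base form if flagged, else null (not inflected). */
    static String baseForm(int flags, String stored) {
        return (flags & HAS_BASEFORM) != 0 ? stored : null;
    }

    /** Returns the stored reading if flagged, else the surface form in katakana. */
    static String reading(int flags, String surface, String stored) {
        return (flags & HAS_READING) != 0 ? stored : toKatakana(surface);
    }

    /** Returns the stored pronunciation if flagged, else falls back to the reading. */
    static String pronunciation(int flags, String surface,
                                String storedReading, String storedPronunciation) {
        return (flags & HAS_PRONUNCIATION) != 0
                ? storedPronunciation
                : reading(flags, surface, storedReading);
    }

    public static void main(String[] args) {
        int flags = 0; // no stored base form, reading, or pronunciation
        System.out.println(baseForm(flags, null));
        System.out.println(reading(flags, "すし", null));
        System.out.println(pronunciation(flags, "すし", null, null));
    }
}
```

        Keeping the three properties as independent bits means an entry only needs to store the data that differs from what can be derived from the surface form.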
    • Constructor Detail

      • BinaryDictionary

        protected BinaryDictionary(IOSupplier<java.io.InputStream> targetMapResource,
                                   IOSupplier<java.io.InputStream> posResource,
                                   IOSupplier<java.io.InputStream> dictResource)
                            throws java.io.IOException
        Throws:
        java.io.IOException
    • Method Detail

      • populateTargetMap

        private static void populateTargetMap(DataInput in,
                                              int[] targetMap,
                                              int[] targetMapOffsets)
                                       throws java.io.IOException
        Throws:
        java.io.IOException
      • populatePosDict

        private static void populatePosDict(DataInput in,
                                            int posSize,
                                            java.lang.String[] posDict,
                                            java.lang.String[] inflTypeDict,
                                            java.lang.String[] inflFormDict)
                                     throws java.io.IOException
        Throws:
        java.io.IOException
      • getResource

        @Deprecated(forRemoval=true,
                    since="9.1")
        public static final java.io.InputStream getResource(BinaryDictionary.ResourceScheme scheme,
                                                            java.lang.String path)
                                                     throws java.io.IOException
        Deprecated, for removal: This API element is subject to removal in a future version.
        Throws:
        java.io.IOException
      • lookupWordIds

        public void lookupWordIds(int sourceId,
                                  IntsRef ref)
      • getLeftId

        public int getLeftId(int wordId)
        Description copied from interface: Dictionary
        Get the left ID of the specified word
        Specified by:
        getLeftId in interface Dictionary
        Returns:
        left id
      • getRightId

        public int getRightId(int wordId)
        Description copied from interface: Dictionary
        Get the right ID of the specified word
        Specified by:
        getRightId in interface Dictionary
        Returns:
        right id
      • getWordCost

        public int getWordCost(int wordId)
        Description copied from interface: Dictionary
        Get the word cost of the specified word
        Specified by:
        getWordCost in interface Dictionary
        Returns:
        word's cost
      • getBaseForm

        public java.lang.String getBaseForm(int wordId,
                                            char[] surfaceForm,
                                            int off,
                                            int len)
        Description copied from interface: Dictionary
        Get the base form of the word
        Specified by:
        getBaseForm in interface Dictionary
        Parameters:
        wordId - word ID of token
        Returns:
        Base form (only different for inflected words, otherwise null)
      • getReading

        public java.lang.String getReading(int wordId,
                                           char[] surface,
                                           int off,
                                           int len)
        Description copied from interface: Dictionary
        Get the reading of the token
        Specified by:
        getReading in interface Dictionary
        Parameters:
        wordId - word ID of token
        Returns:
        Reading of the token
      • getPartOfSpeech

        public java.lang.String getPartOfSpeech(int wordId)
        Description copied from interface: Dictionary
        Get the part of speech of the token
        Specified by:
        getPartOfSpeech in interface Dictionary
        Parameters:
        wordId - word ID of token
        Returns:
        Part-Of-Speech of the token
      • getPronunciation

        public java.lang.String getPronunciation(int wordId,
                                                 char[] surface,
                                                 int off,
                                                 int len)
        Description copied from interface: Dictionary
        Get the pronunciation of the token
        Specified by:
        getPronunciation in interface Dictionary
        Parameters:
        wordId - word ID of token
        Returns:
        Pronunciation of the token
      • getInflectionType

        public java.lang.String getInflectionType(int wordId)
        Description copied from interface: Dictionary
        Get the inflection type of the token
        Specified by:
        getInflectionType in interface Dictionary
        Parameters:
        wordId - word ID of token
        Returns:
        inflection type, or null
      • getInflectionForm

        public java.lang.String getInflectionForm(int wordId)
        Description copied from interface: Dictionary
        Get the inflection form of the token
        Specified by:
        getInflectionForm in interface Dictionary
        Parameters:
        wordId - word ID of token
        Returns:
        inflection form, or null
      • baseFormOffset

        private static int baseFormOffset(int wordId)
      • readingOffset

        private int readingOffset(int wordId)
      • pronunciationOffset

        private int pronunciationOffset(int wordId)
      • hasBaseFormData

        private boolean hasBaseFormData(int wordId)
      • hasReadingData

        private boolean hasReadingData(int wordId)
      • hasPronunciationData

        private boolean hasPronunciationData(int wordId)
      • readString

        private java.lang.String readString(int offset,
                                            int length,
                                            boolean kana)
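
        Several of the public accessors above are documented as possibly returning null: getBaseForm (for words that are not inflected), getInflectionType and getInflectionForm. A caller might guard against that along these lines. This is a minimal sketch, not library code: MorphInfo is a hypothetical stand-in interface that copies only the three signatures from this page, and NullableAccessorSketch, baseFormOrSurface and describeInflection are invented for the example.

```java
class NullableAccessorSketch {
    /** Hypothetical stand-in mirroring the three nullable accessors. */
    interface MorphInfo {
        String getBaseForm(int wordId, char[] surfaceForm, int off, int len);
        String getInflectionType(int wordId);
        String getInflectionForm(int wordId);
    }

    /** Uses the base form when present, else the surface slice itself. */
    static String baseFormOrSurface(MorphInfo dict, int wordId,
                                    char[] surfaceForm, int off, int len) {
        String base = dict.getBaseForm(wordId, surfaceForm, off, len);
        return base != null ? base : new String(surfaceForm, off, len);
    }

    /** Formats inflection data, substituting "-" for missing values. */
    static String describeInflection(MorphInfo dict, int wordId) {
        String type = dict.getInflectionType(wordId);
        String form = dict.getInflectionForm(wordId);
        return (type != null ? type : "-") + "/" + (form != null ? form : "-");
    }

    public static void main(String[] args) {
        // Stub that behaves like an entry without inflection data.
        MorphInfo uninflected = new MorphInfo() {
            public String getBaseForm(int wordId, char[] s, int off, int len) { return null; }
            public String getInflectionType(int wordId) { return null; }
            public String getInflectionForm(int wordId) { return null; }
        };
        char[] text = "お寿司です".toCharArray();
        System.out.println(baseFormOrSurface(uninflected, 0, text, 1, 2));
        System.out.println(describeInflection(uninflected, 0));
    }
}
```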