Class ByteLookupCharset

java.lang.Object
java.nio.charset.Charset
net.freeutils.charset.ByteLookupCharset
All Implemented Interfaces:
Comparable<Charset>
Direct Known Subclasses:
HPRoman8Charset, ISO88596Charset, ISO88598Charset, KOI8UCharset, MIKCharset

public abstract class ByteLookupCharset extends Charset
The ByteLookupCharset class handles the encoding and decoding of single-byte charsets where the byte-to-char conversion is performed using a simple lookup table.
Since:
2005-06-30
  • Constructor Details

    • ByteLookupCharset

      protected ByteLookupCharset(String canonicalName, String[] aliases, int[] byteToChar, int[][] charToByte)
      Initializes a new charset with the given canonical name and alias set, and byte-to-char/char-to-byte lookup tables.

      Parameters:
      canonicalName - The canonical name of this charset
      aliases - An array of this charset's aliases, or null if it has no aliases
      byteToChar - a byte-to-char conversion table for this charset
      charToByte - a char-to-byte conversion table for this charset. It can be generated on-the-fly by calling createInverseLookupTable(byteToChar).
      Throws:
      IllegalCharsetNameException - If the canonical name or any of the aliases are illegal
  • Method Details

    • mutate

      protected static int[] mutate(int[] src, int[] ind, int[] val)
      Returns a copy of the given array in which several items are modified.
      Parameters:
      src - the array to mutate
      ind - the array indices in which the values will be modified
      val - the respective values to place in these indices
      Returns:
      the mutated array
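
      The copy-and-modify behavior described above can be sketched as follows. This is a standalone illustration based only on the description, not the library's source; the class name MutateSketch and the ISO-8859-15 style remapping are assumptions chosen for the example.

```java
public class MutateSketch {

    // Hypothetical re-implementation based only on the documented contract:
    // copy src, then overwrite copy[ind[i]] with val[i] for each i.
    public static int[] mutate(int[] src, int[] ind, int[] val) {
        int[] copy = src.clone();
        for (int i = 0; i < ind.length; i++) {
            copy[ind[i]] = val[i];
        }
        return copy;
    }

    public static void main(String[] args) {
        // Identity byte-to-char table (each byte maps to the same code point).
        int[] base = new int[256];
        for (int i = 0; i < base.length; i++) {
            base[i] = i;
        }
        // Derive a variant in which byte 0xA4 maps to the euro sign (U+20AC),
        // as ISO-8859-15 does relative to ISO-8859-1.
        int[] variant = mutate(base, new int[] {0xA4}, new int[] {0x20AC});
        System.out.println(Integer.toHexString(base[0xA4]));    // a4 (source array untouched)
        System.out.println(Integer.toHexString(variant[0xA4])); // 20ac
    }
}
```

      A helper of this kind lets a charset that differs from another in only a few positions reuse the other's table instead of repeating all 256 entries.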
    • createInverseLookupTable

      public static int[][] createInverseLookupTable(int[] chars)
      Creates an inverse (char-to-byte) lookup table for the given byte-to-char lookup table. The returned top-level table has 256 entries, one per possible high-order byte of a character; entries for unused high-order bytes are null. Each non-null entry is a second-level table indexed by the character's low-order byte, which yields the converted byte value. A null entry in the top-level table, or a -1 in a second-level table, indicates that there is no legal mapping for the given character.
      Parameters:
      chars - a lookup table which holds the character value that each byte value (0-255) is converted to.
      Returns:
      the created inverse lookup (char-to-byte) table.
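
      The two-level structure described above can be sketched as follows. This is a standalone illustration, not the library's implementation; the class name InverseLookupSketch, the lookup helper, and the convention that a negative entry in the input marks an unmapped byte are assumptions made for the example.

```java
import java.util.Arrays;

public class InverseLookupSketch {

    // Builds a 256-entry top-level table indexed by the character's
    // high-order byte; each non-null entry is a 256-entry table indexed
    // by the low-order byte and holding the byte value, or -1.
    public static int[][] createInverse(int[] chars) {
        int[][] tables = new int[256][];
        for (int b = 0; b < chars.length; b++) {
            int c = chars[b];
            // Assumption: skip unmapped bytes and anything outside the BMP.
            if (c < 0 || c > 0xFFFF) {
                continue;
            }
            int hi = c >>> 8;
            int lo = c & 0xFF;
            if (tables[hi] == null) {
                tables[hi] = new int[256];
                Arrays.fill(tables[hi], -1);
            }
            tables[hi][lo] = b;
        }
        return tables;
    }

    // Returns the byte value for c, or -1 if c has no legal mapping.
    public static int lookup(int[][] tables, char c) {
        int[] table = tables[c >>> 8];
        return table == null ? -1 : table[c & 0xFF];
    }

    public static void main(String[] args) {
        // Toy table: ASCII for 0x00-0x7F, byte 0x80 mapped to U+0410, rest unmapped.
        int[] chars = new int[256];
        Arrays.fill(chars, -1);
        for (int i = 0; i < 128; i++) {
            chars[i] = i;
        }
        chars[0x80] = 0x0410;

        int[][] inverse = createInverse(chars);
        System.out.println(lookup(inverse, 'A'));      // 65
        System.out.println(lookup(inverse, '\u0410')); // 128
        System.out.println(lookup(inverse, '\u0411')); // -1 (second-level table exists, entry unset)
        System.out.println(lookup(inverse, '\u20AC')); // -1 (top-level entry is null)
    }
}
```

      Splitting on the high-order byte keeps the table small for single-byte charsets, whose characters usually fall into only a handful of 256-character blocks.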
    • updateInverseLookupTable

      public static int[][] updateInverseLookupTable(int[][] tables, int c, int b)
      Updates an inverse lookup table with an additional mapping, replacing a previous mapping of the same value if it exists.
      Parameters:
      tables - the inverse lookup table to update (see createInverseLookupTable(int[]))
      c - the character to map
      b - the byte value to which c is mapped, or -1 to mark an illegal mapping
      Returns:
      the updated inverse lookup (char-to-byte) table
    • updateInverseLookupTable

      public static int[][] updateInverseLookupTable(int[][] tables, int[] chars, int[] bytes)
      Updates an inverse lookup table with additional mappings, replacing previous mappings of the same values if they exist.
      Parameters:
      tables - the inverse lookup table to update (see createInverseLookupTable(int[]))
      chars - the characters to map
      bytes - the respective byte values to which the chars are mapped, or -1 to mark an illegal mapping
      Returns:
      the updated inverse lookup (char-to-byte) table
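
      One way the single-mapping and array variants could behave is sketched below. This is a standalone illustration under stated assumptions, not the library's code: the class name UpdateInverseSketch is hypothetical, and the sketch assumes the table is modified in place and the same array is returned, which the description above does not confirm.

```java
import java.util.Arrays;

public class UpdateInverseSketch {

    // Maps character c to byte value b, overwriting any existing entry for c.
    // Passing b == -1 marks c as having no legal mapping.
    // Assumption: the table is updated in place and returned.
    public static int[][] update(int[][] tables, int c, int b) {
        int hi = (c >>> 8) & 0xFF;
        int lo = c & 0xFF;
        if (tables[hi] == null) {
            if (b == -1) {
                return tables; // already unmapped, nothing to allocate
            }
            tables[hi] = new int[256];
            Arrays.fill(tables[hi], -1);
        }
        tables[hi][lo] = b;
        return tables;
    }

    // Array variant: applies update() to each (chars[i], bytes[i]) pair.
    public static int[][] update(int[][] tables, int[] chars, int[] bytes) {
        for (int i = 0; i < chars.length; i++) {
            update(tables, chars[i], bytes[i]);
        }
        return tables;
    }

    public static void main(String[] args) {
        int[][] tables = new int[256][]; // empty inverse table: nothing mapped yet
        update(tables, 0x20AC, 0xA4);
        System.out.println(tables[0x20][0xAC]); // 164
        update(tables, 0x20AC, 0x80);           // replaces the previous mapping
        System.out.println(tables[0x20][0xAC]); // 128
        update(tables, new int[] {0x0410, 0x20AC}, new int[] {0x80, -1});
        System.out.println(tables[0x04][0x10]); // 128
        System.out.println(tables[0x20][0xAC]); // -1
    }
}
```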
    • createInverseLookupTableDefinition

      public static String createInverseLookupTableDefinition(int[] chars)
      Returns a string containing Java definitions of the inverse lookup table returned by createInverseLookupTable(int[]) for the given byte-to-char lookup table. This is a convenience utility for building lookup-table-based charsets at design time, as an alternative to creating the inverse lookup tables on-the-fly.
      Parameters:
      chars - a lookup table which holds the character value that each byte value (0-255) is converted to.
      Returns:
      the Java definitions of the created inverse lookup (char-to-byte) table.
    • contains

      public boolean contains(Charset cs)
      Tells whether or not this charset contains the given charset.

      A charset C is said to contain a charset D if, and only if, every character representable in D is also representable in C. If this relationship holds then it is guaranteed that every string that can be encoded in D can also be encoded in C without performing any replacements.

      That C contains D does not imply that each character representable in C by a particular byte sequence is represented in D by the same byte sequence, although sometimes this is the case.

      Every charset contains itself.

      This method computes an approximation of the containment relation: If it returns true then the given charset is known to be contained by this charset; if it returns false, however, then it is not necessarily the case that the given charset is not contained in this charset.

      Specified by:
      contains in class Charset
      Returns:
      true if, and only if, the given charset is contained in this charset
    • newDecoder

      public CharsetDecoder newDecoder()
      Constructs a new decoder for this charset.

      Specified by:
      newDecoder in class Charset
      Returns:
      A new decoder for this charset
    • newEncoder

      public CharsetEncoder newEncoder()
      Constructs a new encoder for this charset.

      Specified by:
      newEncoder in class Charset
      Returns:
      A new encoder for this charset
      Throws:
      UnsupportedOperationException - If this charset does not support encoding
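
      The decoder and encoder returned by newDecoder() and newEncoder() are used like those of any other java.nio.charset.Charset. The sketch below shows a strict round trip; it uses the JDK's built-in ISO-8859-1 as a stand-in single-byte charset, since a concrete ByteLookupCharset subclass would require the library on the classpath. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CoderRoundTripSketch {

    // Encodes text, failing instead of substituting when a character is unmappable.
    public static byte[] encodeStrict(Charset cs, String text) throws CharacterCodingException {
        ByteBuffer out = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    // Decodes bytes, failing instead of substituting on invalid input.
    public static String decodeStrict(Charset cs, byte[] bytes) throws CharacterCodingException {
        return cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        Charset cs = StandardCharsets.ISO_8859_1; // stand-in for a single-byte charset
        byte[] bytes = encodeStrict(cs, "caf\u00e9");
        System.out.println(bytes.length);            // 4 (one byte per character)
        System.out.println(decodeStrict(cs, bytes)); // café
        try {
            encodeStrict(cs, "\u20AC"); // the euro sign is not in ISO-8859-1
        } catch (CharacterCodingException e) {
            System.out.println("unmappable: " + e.getClass().getSimpleName());
        }
    }
}
```

      Setting both error actions to REPORT makes an unmapped character or byte surface as an exception rather than being silently replaced.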