Class _codecs


  • public class _codecs
    extends java.lang.Object
    This class corresponds to the Python _codecs module, which in turn lends its functions to the codecs module (in Lib/codecs.py). It exposes the implementing functions of several codec families called out in the Python codecs library Lib/encodings/*.py, where it is usually claimed that they are bound "as C functions". In this context, "C" is best read as "compiled" rather than as a dependence on a particular implementation language. The actual transcoding methods often come from the related codecs class.
    • Constructor Detail

      • _codecs

        public _codecs()
    • Method Detail

      • register

        public static void register​(PyObject search_function)
      • register_error

        public static void register_error​(java.lang.String name,
                                          PyObject errorHandler)
      • decode

        public static PyObject decode​(PyString bytes)
        Decode bytes using the system default encoding (see codecs.getDefaultEncoding()). Decoding errors raise a ValueError.
        Parameters:
        bytes - to be decoded
        Returns:
        Unicode string decoded from bytes
      • decode

        public static PyObject decode​(PyString bytes,
                                      PyString encoding)
        Decode bytes using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). Decoding errors raise a ValueError.
        Parameters:
        bytes - to be decoded
        encoding - name of encoding (to look up in codec registry)
        Returns:
        Unicode string decoded from bytes
      • decode

        public static PyObject decode​(PyString bytes,
                                      PyString encoding,
                                      PyString errors)
        Decode bytes using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict', meaning that decoding errors raise a ValueError.
        Parameters:
        bytes - to be decoded
        encoding - name of encoding (to look up in codec registry)
        errors - error policy name (e.g. "ignore")
        Returns:
        Unicode string decoded from bytes
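The effect of the errors argument can be pictured with the standard java.nio.charset API, whose CodingErrorAction values behave much like the Python policies 'strict', 'ignore' and 'replace'. What follows is a minimal sketch using only the JDK, not the implementation behind decode(PyString, PyString, PyString). The names ErrorPolicySketch and decodeUtf8 are hypothetical, the sketch handles UTF-8 only, and it uses IllegalArgumentException as a stand-in for ValueError.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only: a JDK analogue of "strict", "ignore" and "replace". */
final class ErrorPolicySketch {

    /** Decode UTF-8 bytes, treating malformed input according to the named policy. */
    static String decodeUtf8(byte[] bytes, String errors) {
        CodingErrorAction action;
        if ("ignore".equals(errors)) {
            action = CodingErrorAction.IGNORE;      // drop the offending bytes
        } else if ("replace".equals(errors)) {
            action = CodingErrorAction.REPLACE;     // substitute U+FFFD
        } else {
            action = CodingErrorAction.REPORT;      // anything else is "strict" in this sketch
        }
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(action)
                    .onUnmappableCharacter(action)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            // Assumption: stands in for the ValueError raised at the Python level.
            throw new IllegalArgumentException("cannot decode input", e);
        }
    }

    public static void main(String[] args) {
        byte[] bad = {'a', (byte) 0xFF, 'b'};       // 0xFF never occurs in valid UTF-8
        System.out.println(decodeUtf8(bad, "ignore"));
        System.out.println(decodeUtf8(bad, "replace"));
    }
}
```

Policies registered through register_error(String, PyObject) have no direct counterpart in this sketch, which maps any unrecognised name to strict behaviour.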
      • encode

        public static PyString encode​(PyUnicode unicode)
        Encode unicode using the system default encoding (see codecs.getDefaultEncoding()). Encoding errors raise a ValueError.
        Parameters:
        unicode - string to be encoded
        Returns:
        bytes object encoding unicode
      • encode

        public static PyString encode​(PyUnicode unicode,
                                      PyString encoding)
        Encode unicode using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). Encoding errors raise a ValueError.
        Parameters:
        unicode - string to be encoded
        encoding - name of encoding (to look up in codec registry)
        Returns:
        bytes object encoding unicode
      • encode

        public static PyString encode​(PyUnicode unicode,
                                      PyString encoding,
                                      PyString errors)
        Encode unicode using the codec registered for the encoding. The encoding defaults to the system default encoding (see codecs.getDefaultEncoding()). The string errors may name a different error handling policy (built-in or registered with register_error(String, PyObject)). The default error policy is 'strict', meaning that encoding errors raise a ValueError.
        Parameters:
        unicode - string to be encoded
        encoding - name of encoding (to look up in codec registry)
        errors - error policy name (e.g. "ignore")
        Returns:
        bytes object encoding unicode
      • utf_8_decode

        public static PyTuple utf_8_decode​(java.lang.String str)
      • utf_8_decode

        public static PyTuple utf_8_decode​(java.lang.String str,
                                           java.lang.String errors)
      • utf_8_decode

        public static PyTuple utf_8_decode​(java.lang.String str,
                                           java.lang.String errors,
                                           PyObject final_)
      • utf_8_decode

        public static PyTuple utf_8_decode​(java.lang.String str,
                                           java.lang.String errors,
                                           boolean final_)
      • utf_8_encode

        public static PyTuple utf_8_encode​(java.lang.String str)
      • utf_8_encode

        public static PyTuple utf_8_encode​(java.lang.String str,
                                           java.lang.String errors)
      • utf_7_decode

        public static PyTuple utf_7_decode​(java.lang.String bytes)
      • utf_7_decode

        public static PyTuple utf_7_decode​(java.lang.String bytes,
                                           java.lang.String errors)
      • utf_7_decode

        public static PyTuple utf_7_decode​(java.lang.String bytes,
                                           java.lang.String errors,
                                           boolean finalFlag)
      • utf_7_encode

        public static PyTuple utf_7_encode​(java.lang.String str)
      • utf_7_encode

        public static PyTuple utf_7_encode​(java.lang.String str,
                                           java.lang.String errors)
      • escape_decode

        public static PyTuple escape_decode​(java.lang.String str)
      • escape_decode

        public static PyTuple escape_decode​(java.lang.String str,
                                            java.lang.String errors)
      • escape_encode

        public static PyTuple escape_encode​(java.lang.String str)
      • escape_encode

        public static PyTuple escape_encode​(java.lang.String str,
                                            java.lang.String errors)
      • charmap_decode

        public static PyTuple charmap_decode​(java.lang.String bytes)
        Equivalent to charmap_decode(bytes, null, null). This method is here so the error and mapping arguments can be optional at the Python level.
        Parameters:
        bytes - sequence of bytes to decode
        Returns:
        decoded string and number of bytes consumed
      • charmap_decode

        public static PyTuple charmap_decode​(java.lang.String bytes,
                                             java.lang.String errors)
        Equivalent to charmap_decode(bytes, errors, null). This method is here so the mapping argument can be optional at the Python level.
        Parameters:
        bytes - sequence of bytes to decode
        errors - error policy
        Returns:
        decoded string and number of bytes consumed
      • charmap_decode

        public static PyTuple charmap_decode​(java.lang.String bytes,
                                             java.lang.String errors,
                                             PyObject mapping)
        Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers). If the mapping is null or None, decode with latin-1 (essentially treating bytes as character codes directly).
        Parameters:
        bytes - sequence of bytes to decode
        errors - error policy
        mapping - to convert bytes to characters
        Returns:
        decoded string and number of bytes consumed
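The table-driven decoding described above can be sketched in plain Java. This is an illustration under stated assumptions, not the Jython implementation: the mapping is simplified to a char[] indexed by byte value (the real argument is a PyObject), no error policy is applied (an unmapped byte simply fails), and the names CharmapDecodeSketch, Result and decode are hypothetical.

```java
/** Hypothetical illustration of a charmap-style decode over a byte-valued String. */
final class CharmapDecodeSketch {

    /** The (decoded string, bytes consumed) pair. */
    static final class Result {
        final String text;
        final int consumed;

        Result(String text, int consumed) {
            this.text = text;
            this.consumed = consumed;
        }
    }

    /**
     * Decode bytes (one byte per char, values 0..255) through the mapping.
     * A null mapping means latin-1: each byte value is used as the character code.
     */
    static Result decode(String bytes, char[] mapping) {
        StringBuilder out = new StringBuilder(bytes.length());
        for (int i = 0; i < bytes.length(); i++) {
            int b = bytes.charAt(i) & 0xFF;         // byte value as an unsigned integer
            if (mapping == null) {
                out.append((char) b);
            } else if (b < mapping.length) {
                out.append(mapping[b]);
            } else {
                // Assumption: behave like a strict policy for unmapped bytes.
                throw new IllegalArgumentException(
                        "byte 0x" + Integer.toHexString(b) + " at index " + i + " is unmapped");
            }
        }
        return new Result(out.toString(), bytes.length());
    }

    public static void main(String[] args) {
        char[] mapping = {'x', 'y', 'z'};
        Result r = decode("\u0000\u0002\u0001", mapping);
        System.out.println(r.text + " " + r.consumed);
    }
}
```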
      • charmap_decode

        public static PyTuple charmap_decode​(java.lang.String bytes,
                                             java.lang.String errors,
                                             PyObject mapping,
                                             boolean ignoreUnmapped)
        Decode a sequence of bytes into Unicode characters via a mapping supplied as a container to be indexed by the byte values (as unsigned integers).
        Parameters:
        bytes - sequence of bytes to decode
        errors - error policy
        mapping - to convert bytes to characters
        ignoreUnmapped - if true, pass unmapped byte values as character codes [0..256)
        Returns:
        decoded string and number of bytes consumed
      • charmap_encode

        public static PyTuple charmap_encode​(java.lang.String str)
        Equivalent to charmap_encode(str, null, null). This method is here so the error and mapping arguments can be optional at the Python level.
        Parameters:
        str - to be encoded
        Returns:
        (encoded data, size(str)) as a pair
      • charmap_encode

        public static PyTuple charmap_encode​(java.lang.String str,
                                             java.lang.String errors)
        Equivalent to charmap_encode(str, errors, null). This method is here so the mapping can be optional at the Python level.
        Parameters:
        str - to be encoded
        errors - error policy name (e.g. "ignore")
        Returns:
        (encoded data, size(str)) as a pair
      • charmap_encode

        public static PyTuple charmap_encode​(java.lang.String str,
                                             java.lang.String errors,
                                             PyObject mapping)
        Encoder based on an optional character mapping. This mapping is either an EncodingMap of 256 entries, or an arbitrary container indexable with integers using __finditem__ and yielding byte strings. If the mapping is null, latin-1 (effectively a mapping of character code to the numerically-equal byte) is used.
        Parameters:
        str - to be encoded
        errors - error policy name (e.g. "ignore")
        mapping - from character code to output byte (or string)
        Returns:
        (encoded data, size(str)) as a pair
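The encoding direction can be sketched the same way. The following assumes a java.util.Map from character code to a single output byte, which is a simplification of the EncodingMap or general container described above, and it fails on any unmapped character instead of applying an error policy. CharmapEncodeSketch and encode are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of a charmap-style encode producing a byte-valued String. */
final class CharmapEncodeSketch {

    /**
     * Encode str through the mapping (character code to byte value).
     * A null mapping means latin-1: codes below 256 map to the numerically equal byte.
     */
    static String encode(String str, Map<Integer, Integer> mapping) {
        StringBuilder out = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            Integer b;
            if (mapping == null) {
                b = (cp < 256) ? Integer.valueOf(cp) : null;
            } else {
                b = mapping.get(cp);
            }
            if (b == null) {
                // Assumption: behave like a strict policy for unmapped characters.
                throw new IllegalArgumentException(
                        "character U+" + Integer.toHexString(cp) + " is unmapped");
            }
            out.append((char) (b & 0xFF));          // one output byte per char
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> mapping = new HashMap<>();
        mapping.put((int) 'a', 0x01);
        mapping.put((int) 'b', 0x02);
        String encoded = encode("ab", mapping);
        System.out.println(encoded.length());       // one byte per input character here
    }
}
```

In the pair returned by charmap_encode, the second element is the size of the input string; the sketch leaves that to the caller as str.length().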
      • ascii_decode

        public static PyTuple ascii_decode​(java.lang.String str)
      • ascii_decode

        public static PyTuple ascii_decode​(java.lang.String str,
                                           java.lang.String errors)
      • ascii_encode

        public static PyTuple ascii_encode​(java.lang.String str)
      • ascii_encode

        public static PyTuple ascii_encode​(java.lang.String str,
                                           java.lang.String errors)
      • latin_1_decode

        public static PyTuple latin_1_decode​(java.lang.String str)
      • latin_1_decode

        public static PyTuple latin_1_decode​(java.lang.String str,
                                             java.lang.String errors)
      • latin_1_encode

        public static PyTuple latin_1_encode​(java.lang.String str)
      • latin_1_encode

        public static PyTuple latin_1_encode​(java.lang.String str,
                                             java.lang.String errors)
      • utf_16_encode

        public static PyTuple utf_16_encode​(java.lang.String str)
      • utf_16_encode

        public static PyTuple utf_16_encode​(java.lang.String str,
                                            java.lang.String errors)
      • utf_16_encode

        public static PyTuple utf_16_encode​(java.lang.String str,
                                            java.lang.String errors,
                                            int byteorder)
      • utf_16_le_encode

        public static PyTuple utf_16_le_encode​(java.lang.String str)
      • utf_16_le_encode

        public static PyTuple utf_16_le_encode​(java.lang.String str,
                                               java.lang.String errors)
      • utf_16_be_encode

        public static PyTuple utf_16_be_encode​(java.lang.String str)
      • utf_16_be_encode

        public static PyTuple utf_16_be_encode​(java.lang.String str,
                                               java.lang.String errors)
      • encode_UTF16

        public static java.lang.String encode_UTF16​(java.lang.String str,
                                                    java.lang.String errors,
                                                    int byteorder)
      • utf_16_decode

        public static PyTuple utf_16_decode​(java.lang.String str)
      • utf_16_decode

        public static PyTuple utf_16_decode​(java.lang.String str,
                                            java.lang.String errors)
      • utf_16_decode

        public static PyTuple utf_16_decode​(java.lang.String str,
                                            java.lang.String errors,
                                            boolean final_)
      • utf_16_le_decode

        public static PyTuple utf_16_le_decode​(java.lang.String str)
      • utf_16_le_decode

        public static PyTuple utf_16_le_decode​(java.lang.String str,
                                               java.lang.String errors)
      • utf_16_le_decode

        public static PyTuple utf_16_le_decode​(java.lang.String str,
                                               java.lang.String errors,
                                               boolean final_)
      • utf_16_be_decode

        public static PyTuple utf_16_be_decode​(java.lang.String str)
      • utf_16_be_decode

        public static PyTuple utf_16_be_decode​(java.lang.String str,
                                               java.lang.String errors)
      • utf_16_be_decode

        public static PyTuple utf_16_be_decode​(java.lang.String str,
                                               java.lang.String errors,
                                               boolean final_)
      • utf_16_ex_decode

        public static PyTuple utf_16_ex_decode​(java.lang.String str)
      • utf_16_ex_decode

        public static PyTuple utf_16_ex_decode​(java.lang.String str,
                                               java.lang.String errors)
      • utf_16_ex_decode

        public static PyTuple utf_16_ex_decode​(java.lang.String str,
                                               java.lang.String errors,
                                               int byteorder)
      • utf_16_ex_decode

        public static PyTuple utf_16_ex_decode​(java.lang.String str,
                                               java.lang.String errors,
                                               int byteorder,
                                               boolean final_)
      • utf_32_encode

        public static PyTuple utf_32_encode​(java.lang.String unicode)
        Encode a Unicode Java String as UTF-32 with byte order mark. (Encoding is in platform byte order, which is big-endian for Java.)
        Parameters:
        unicode - to be encoded
        Returns:
        tuple (encoded_bytes, unicode_consumed)
      • utf_32_encode

        public static PyTuple utf_32_encode​(java.lang.String unicode,
                                            java.lang.String errors)
        Encode a Unicode Java String as UTF-32 with byte order mark. (Encoding is in platform byte order, which is big-endian for Java.)
        Parameters:
        unicode - to be encoded
        errors - error policy name or null meaning "strict"
        Returns:
        tuple (encoded_bytes, unicode_consumed)
      • utf_32_encode

        public static PyTuple utf_32_encode​(java.lang.String unicode,
                                            java.lang.String errors,
                                            int byteorder)
        Encode a Unicode Java String as UTF-32 in specified byte order with byte order mark.
        Parameters:
        unicode - to be encoded
        errors - error policy name or null meaning "strict"
        byteorder - encoding "endianness" specified (in the Python -1, 0, +1 convention)
        Returns:
        tuple (encoded_bytes, unicode_consumed)
      • utf_32_le_encode

        public static PyTuple utf_32_le_encode​(java.lang.String unicode)
        Encode a Unicode Java String as UTF-32 with little-endian byte order. No byte-order mark is generated.
        Parameters:
        unicode - to be encoded
        Returns:
        tuple (encoded_bytes, unicode_consumed)
      • utf_32_le_encode

        public static PyTuple utf_32_le_encode​(java.lang.String unicode,
                                               java.lang.String errors)
        Encode a Unicode Java String as UTF-32 with little-endian byte order. No byte-order mark is generated.
        Parameters:
        unicode - to be encoded
        errors - error policy name or null meaning "strict"
        Returns:
        tuple (encoded_bytes, unicode_consumed)
      • utf_32_be_encode

        public static PyTuple utf_32_be_encode​(java.lang.String unicode)
        Encode a Unicode Java String as UTF-32 with big-endian byte order. No byte-order mark is generated.
        Parameters:
        unicode - to be encoded
        Returns:
        tuple (encoded_bytes, unicode_consumed)
      • utf_32_be_encode

        public static PyTuple utf_32_be_encode​(java.lang.String unicode,
                                               java.lang.String errors)
        Encode a Unicode Java String as UTF-32 with big-endian byte order. No byte-order mark is generated.
        Parameters:
        unicode - to be encoded
        errors - error policy name or null meaning "strict"
        Returns:
        tuple (encoded_bytes, unicode_consumed)
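To make the byte layouts of the UTF-32 encoders concrete, here is a small sketch in plain Java. It is not the Jython implementation: the boolean parameters bigEndian and withBom are illustrative stand-ins for the choice between utf_32_encode, utf_32_le_encode and utf_32_be_encode, the names Utf32EncodeSketch and encode are hypothetical, and unpaired surrogates are passed through without any error policy.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration of UTF-32 output in either byte order, with or without a BOM. */
final class Utf32EncodeSketch {

    /** Encode each code point of unicode as four bytes, optionally preceded by U+FEFF. */
    static byte[] encode(String unicode, boolean bigEndian, boolean withBom) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (withBom) {
            write(out, 0xFEFF, bigEndian);          // byte-order mark in the chosen order
        }
        for (int i = 0; i < unicode.length(); ) {
            int cp = unicode.codePointAt(i);        // combines a surrogate pair into one code point
            write(out, cp, bigEndian);
            i += Character.charCount(cp);
        }
        return out.toByteArray();
    }

    /** Write one 32-bit unit, most significant byte first when bigEndian is true. */
    private static void write(ByteArrayOutputStream out, int cp, boolean bigEndian) {
        for (int k = 0; k < 4; k++) {
            int shift = bigEndian ? 24 - 8 * k : 8 * k;
            out.write((cp >>> shift) & 0xFF);
        }
    }

    public static void main(String[] args) {
        for (byte b : encode("A", true, true)) {
            System.out.printf("%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```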
      • utf_32_decode

        public static PyTuple utf_32_decode​(java.lang.String bytes)
        Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. The endianness used will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        Returns:
        tuple (unicode_result, bytes_consumed)
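The "Jython PyString convention" mentioned in the parameter descriptions means that bytes travel in a java.lang.String, one byte per char, each with a value from 0 to 255. As a rough sketch using only the JDK, ISO-8859-1 gives exactly this one-to-one correspondence between byte values and chars. The helper names ByteStringSketch, toByteString and toBytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helpers for moving between byte[] and a byte-valued String. */
final class ByteStringSketch {

    /** Each byte becomes one char with the same unsigned value. */
    static String toByteString(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    /** Inverse of toByteString; rejects chars that cannot represent a byte. */
    static byte[] toBytes(String byteString) {
        for (int i = 0; i < byteString.length(); i++) {
            if (byteString.charAt(i) > 0xFF) {
                throw new IllegalArgumentException("not a byte value at index " + i);
            }
        }
        return byteString.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bom = {0, 0, (byte) 0xFE, (byte) 0xFF};   // UTF-32 big-endian byte-order mark
        String s = toByteString(bom);
        System.out.println(s.length() + " " + Arrays.equals(bom, toBytes(s)));
    }
}
```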
      • utf_32_decode

        public static PyTuple utf_32_decode​(java.lang.String bytes,
                                            java.lang.String errors)
        Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. The endianness used will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        errors - error policy name (e.g. "ignore", "replace")
        Returns:
        tuple (unicode_result, bytes_consumed)
      • utf_32_decode

        public static PyTuple utf_32_decode​(java.lang.String bytes,
                                            java.lang.String errors,
                                            boolean isFinal)
        Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. The endianness used will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        errors - error policy name (e.g. "ignore", "replace")
        isFinal - if a "final" call, meaning the input must all be consumed
        Returns:
        tuple (unicode_result, bytes_consumed)
      • utf_32_le_decode

        public static PyTuple utf_32_le_decode​(java.lang.String bytes)
        Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        Returns:
        tuple (unicode_result, bytes_consumed)
      • utf_32_le_decode

        public static PyTuple utf_32_le_decode​(java.lang.String bytes,
                                               java.lang.String errors)
        Decode a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        errors - error policy name (e.g. "ignore", "replace")
        Returns:
        tuple (unicode_result, bytes_consumed)
      • utf_32_le_decode

        public static PyTuple utf_32_le_decode​(java.lang.String bytes,
                                               java.lang.String errors,
                                               boolean isFinal)
        Decode (perhaps partially) a sequence of bytes representing the UTF-32 little-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        errors - error policy name (e.g. "ignore", "replace")
        isFinal - if a "final" call, meaning the input must all be consumed
        Returns:
        tuple (unicode_result, bytes_consumed)
      • utf_32_be_decode

        public static PyTuple utf_32_be_decode​(java.lang.String bytes)
        Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        Returns:
        tuple (unicode_result, bytes_consumed)
      • utf_32_be_decode

        public static PyTuple utf_32_be_decode​(java.lang.String bytes,
                                               java.lang.String errors)
        Decode a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode). It is an error for the input bytes not to form a whole number of valid UTF-32 codes.
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        errors - error policy name (e.g. "ignore", "replace")
        Returns:
        tuple (unicode_result, bytes_consumed)
      • utf_32_be_decode

        public static PyTuple utf_32_be_decode​(java.lang.String bytes,
                                               java.lang.String errors,
                                               boolean isFinal)
        Decode (perhaps partially) a sequence of bytes representing the UTF-32 big-endian encoded form of a Unicode string and return as a tuple the unicode text, and the amount of input consumed. A (correctly-oriented) byte-order mark will pass as a zero-width non-breaking space. The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        errors - error policy name (e.g. "ignore", "replace")
        isFinal - if a "final" call, meaning the input must all be consumed
        Returns:
        tuple (unicode_result, bytes_consumed)
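The role of isFinal in the decoders above can be illustrated with a plain-Java sketch of incremental big-endian decoding. This is only an outline of the idea, assuming strict error handling and a byte[] input in place of the byte-valued String. The names Utf32PartialDecodeSketch, Result and decodeBE are hypothetical.

```java
/** Hypothetical illustration of partial versus final UTF-32BE decoding. */
final class Utf32PartialDecodeSketch {

    /** The (unicode_result, bytes_consumed) pair. */
    static final class Result {
        final String text;
        final int consumed;

        Result(String text, int consumed) {
            this.text = text;
            this.consumed = consumed;
        }
    }

    /**
     * Decode whole 4-byte units. When isFinal is false, up to three trailing bytes
     * are left unconsumed for a later call; when true, they are an error.
     */
    static Result decodeBE(byte[] bytes, boolean isFinal) {
        int whole = bytes.length - bytes.length % 4;
        if (isFinal && whole != bytes.length) {
            throw new IllegalArgumentException("truncated UTF-32 data");
        }
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < whole; i += 4) {
            int cp = ((bytes[i] & 0xFF) << 24) | ((bytes[i + 1] & 0xFF) << 16)
                    | ((bytes[i + 2] & 0xFF) << 8) | (bytes[i + 3] & 0xFF);
            if (!Character.isValidCodePoint(cp)) {
                throw new IllegalArgumentException("code point out of range at byte " + i);
            }
            text.appendCodePoint(cp);               // stored as UTF-16 in the Java String
        }
        return new Result(text.toString(), whole);
    }

    public static void main(String[] args) {
        byte[] chunk = {0, 0, 0, 0x41, 0, 0};       // "A" followed by half of another unit
        Result r = decodeBE(chunk, false);
        System.out.println(r.text + " " + r.consumed);
    }
}
```

A caller decoding a stream would keep the unconsumed tail and prepend it to the next chunk.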
      • utf_32_ex_decode

        public static PyTuple utf_32_ex_decode​(java.lang.String bytes,
                                               java.lang.String errors,
                                               int byteorder)
        Decode a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention). The endianness, if unspecified (=0), will be deduced from a byte-order mark and returned. (This codec entrypoint is used in that way in the utf_32.py codec, but only until the byte order is known.) When not defined by a BOM, processing assumes big-endian coding (Java platform default), but returns "unspecified". (The utf_32.py codec treats this as an error, once more than 4 bytes have been processed.) The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        errors - error policy name (e.g. "ignore", "replace")
        byteorder - decoding "endianness" specified (in the Python -1, 0, +1 convention)
        Returns:
        tuple (unicode_result, bytes_consumed, endianness)
      • utf_32_ex_decode

        public static PyTuple utf_32_ex_decode​(java.lang.String bytes,
                                               java.lang.String errors,
                                               int byteorder,
                                               boolean isFinal)
        Decode (perhaps partially) a sequence of bytes representing the UTF-32 encoded form of a Unicode string and return as a tuple the unicode text, the amount of input consumed, and the decoding "endianness" used (in the Python -1, 0, +1 convention). The endianness will be that specified, will have been deduced from a byte-order mark, if present, or will be big-endian (Java platform default). Or it may still be undefined if fewer than 4 bytes are presented. (This codec entrypoint is used in the utf-32 codec only until the byte order is known.) The unicode text is presented as a Java String (the UTF-16 representation used by PyUnicode).
        Parameters:
        bytes - to be decoded (Jython PyString convention)
        errors - error policy name (e.g. "ignore", "replace")
        byteorder - decoding "endianness" specified (in the Python -1, 0, +1 convention)
        isFinal - if a "final" call, meaning the input must all be consumed
        Returns:
        tuple (unicode_result, bytes_consumed, endianness)
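The -1, 0, +1 convention for byteorder follows Python: -1 is little-endian, +1 is big-endian, and 0 means not specified. Deducing it from a byte-order mark might look like the sketch below, which covers only the detection step and is not the Jython code. The names Utf32BomSketch and detectByteOrder are hypothetical.

```java
/** Hypothetical illustration of deducing UTF-32 byte order from a leading BOM. */
final class Utf32BomSketch {

    /**
     * Returns +1 for a big-endian BOM (00 00 FE FF), -1 for a little-endian BOM
     * (FF FE 00 00), and 0 when there is no BOM or fewer than 4 bytes are available.
     */
    static int detectByteOrder(byte[] bytes) {
        if (bytes.length < 4) {
            return 0;                               // still undefined: not enough input yet
        }
        int b0 = bytes[0] & 0xFF;
        int b1 = bytes[1] & 0xFF;
        int b2 = bytes[2] & 0xFF;
        int b3 = bytes[3] & 0xFF;
        if (b0 == 0x00 && b1 == 0x00 && b2 == 0xFE && b3 == 0xFF) {
            return +1;
        }
        if (b0 == 0xFF && b1 == 0xFE && b2 == 0x00 && b3 == 0x00) {
            return -1;
        }
        return 0;                                   // no BOM: the caller falls back to a default
    }

    public static void main(String[] args) {
        byte[] le = {(byte) 0xFF, (byte) 0xFE, 0, 0, 0x41, 0, 0, 0};
        System.out.println(detectByteOrder(le));
    }
}
```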
      • raw_unicode_escape_encode

        public static PyTuple raw_unicode_escape_encode​(java.lang.String str)
      • raw_unicode_escape_encode

        public static PyTuple raw_unicode_escape_encode​(java.lang.String str,
                                                        java.lang.String errors)
      • raw_unicode_escape_decode

        public static PyTuple raw_unicode_escape_decode​(java.lang.String str)
      • raw_unicode_escape_decode

        public static PyTuple raw_unicode_escape_decode​(java.lang.String str,
                                                        java.lang.String errors)
      • unicode_escape_encode

        public static PyTuple unicode_escape_encode​(java.lang.String str)
      • unicode_escape_encode

        public static PyTuple unicode_escape_encode​(java.lang.String str,
                                                    java.lang.String errors)
      • unicode_escape_decode

        public static PyTuple unicode_escape_decode​(java.lang.String str)
      • unicode_escape_decode

        public static PyTuple unicode_escape_decode​(java.lang.String str,
                                                    java.lang.String errors)
      • unicode_internal_encode

        @Deprecated
        public static PyTuple unicode_internal_encode​(java.lang.String unicode)
        Deprecated.
        Legacy method to encode given unicode in CPython wide-build internal format (equivalent UTF-32BE).
      • unicode_internal_encode

        @Deprecated
        public static PyTuple unicode_internal_encode​(java.lang.String unicode,
                                                      java.lang.String errors)
        Deprecated.
        Legacy method to encode given unicode in CPython wide-build internal format (equivalent UTF-32BE).
      • unicode_internal_decode

        @Deprecated
        public static PyTuple unicode_internal_decode​(java.lang.String bytes)
        Deprecated.
        Legacy method to decode given bytes as if CPython wide-build internal format (equivalent UTF-32BE). There must be a multiple of 4 bytes.
      • unicode_internal_decode

        @Deprecated
        public static PyTuple unicode_internal_decode​(java.lang.String bytes,
                                                      java.lang.String errors)
        Deprecated.
        Legacy method to decode given bytes as if CPython wide-build internal format (equivalent UTF-32BE). There must be a multiple of 4 bytes.