commons-issues mailing list archives

From "Julius Davies (JIRA)" <>
Subject [jira] Commented: (CODEC-58) Character set used by Base64 not documented
Date Thu, 16 Jul 2009 18:56:14 GMT


Julius Davies commented on CODEC-58:

org.apache.commons.codec.binary.Base64 is hard-coded to emit bytes in the 7-bit US-ASCII range (0-127), which are encoded identically in UTF-8:

    private static final byte[] STANDARD_ENCODE_TABLE = {
            'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
            'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
            'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
            '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/'
    };

I'll create a patch to put this in the javadoc.
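
In the meantime, here is a minimal sketch of how the bytes can be converted to and from Strings, assuming US-ASCII as the charset (which follows from the table above). To keep it runnable without extra jars it uses the JDK's java.util.Base64 (Java 8+) as a stand-in; with Commons Codec the corresponding static calls would be Base64.encodeBase64(byte[]) and Base64.decodeBase64(byte[]). The class and method names (Base64AsciiDemo, encodeToString, decodeFromString) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not part of Commons Codec.
final class Base64AsciiDemo {

    // Encode raw bytes, then read the Base64 output bytes as US-ASCII text.
    static String encodeToString(byte[] raw) {
        // Stand-in for commons-codec's Base64.encodeBase64(raw).
        byte[] encoded = Base64.getEncoder().encode(raw);
        return new String(encoded, StandardCharsets.US_ASCII);
    }

    // Turn Base64 text back into bytes via US-ASCII, then decode.
    static byte[] decodeFromString(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.US_ASCII);
        // Stand-in for commons-codec's Base64.decodeBase64(encoded).
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        byte[] raw = "hello".getBytes(StandardCharsets.US_ASCII);
        String text = encodeToString(raw);
        System.out.println(text);

        // Every Base64 output byte is below 128, so decoding the bytes
        // as UTF-8 yields the same String as decoding them as US-ASCII.
        byte[] encoded = Base64.getEncoder().encode(raw);
        String asUtf8 = new String(encoded, StandardCharsets.UTF_8);
        System.out.println(asUtf8.equals(text));

        // Round trip back to the original bytes.
        System.out.println(Arrays.equals(raw, decodeFromString(text)));
    }
}
```

Naming the charset explicitly avoids depending on the platform default, which is what new String(byte[]) and String.getBytes() would otherwise use.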

> Character set used by Base64 not documented
> -------------------------------------------
>                 Key: CODEC-58
>                 URL:
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.1, 1.2, 1.3, 1.4
>            Reporter: Pepijn Schmitz
>            Priority: Minor
>             Fix For: 1.x
> The Javadoc for the Base64 class does not document which character set is returned by
> encode() and expected by decode(). The RFC specifies "characters", not "bytes" as the result
> of the encoding, and yet Base64 returns bytes. It should provide complete information as to
> how to convert these bytes to and from Strings. I assume the character set used is ASCII,
> but that should be made explicit in the Javadoc.
