commons-issues mailing list archives

From "Joerg Schaible (JIRA)" <>
Subject [jira] Commented: (CODEC-58) Character set used by Base64 not documented
Date Fri, 16 Nov 2007 13:07:44 GMT


Joerg Schaible commented on CODEC-58:

RFC 4648 says: "The encoding process represents 24-bit groups of input bits as output
strings of 4 encoded characters". This implies that each of these "characters" occupies 8
bits. Obviously this does not match Java's representation of a "char", which is 16 bits wide.
Therefore the method returns bytes.
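
To make the group arithmetic concrete, here is a minimal sketch. It uses the JDK's own java.util.Base64 (Java 8 and later) as a stand-in, not the Commons Codec class this issue is about, and the class name GroupDemo is made up for illustration. It encodes one 24-bit group and checks that every output byte falls in the 7-bit ASCII range.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class GroupDemo {
    // Assumption: the JDK encoder stands in for Commons Codec's Base64 here.
    static byte[] encode(byte[] input) {
        return Base64.getEncoder().encode(input);
    }

    // True if every byte is in the 7-bit ASCII range (top bit clear).
    static boolean allAscii(byte[] data) {
        for (byte b : data) {
            if (b < 0) {
                return false; // a negative byte means a value of 128 or above
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] in = "Man".getBytes(StandardCharsets.US_ASCII); // 3 bytes = 24 bits
        byte[] out = encode(in);
        System.out.println(in.length + " input bytes -> " + out.length + " output bytes");
        System.out.println("all output bytes are ASCII: " + allAscii(out));
    }
}
```

Each output byte holds one character of the Base64 alphabet, so one byte per character is enough and a 16-bit Java char is not needed to carry it.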

The documentation mentions that the character set is the one used in MIME encoding (RFC 2045, p25);
other character sets are not supported. Nevertheless, the character set consists of characters
that can be represented within 8 bits. So the documentation could state that the result
of a Base64 encoding is a byte stream that represents a String in ASCII encoding.
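
As a rough illustration of what such documentation would let callers do, the sketch below converts encoded bytes to and from a String with an explicit US-ASCII charset. It again assumes the JDK's java.util.Base64 in place of Commons Codec, and the class and method names (AsciiRoundTrip, encodeToString, decodeFromString) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class AsciiRoundTrip {
    // Encode raw bytes, then interpret the encoded bytes as US-ASCII text.
    static String encodeToString(byte[] raw) {
        byte[] encoded = Base64.getEncoder().encode(raw);
        return new String(encoded, StandardCharsets.US_ASCII);
    }

    // Turn the text back into US-ASCII bytes, then decode them.
    static byte[] decodeFromString(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.US_ASCII);
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        byte[] raw = {0x00, (byte) 0xFF, 0x10, 0x20};
        String text = encodeToString(raw);
        byte[] back = decodeFromString(text);
        System.out.println("encoded text: " + text);
        System.out.println("round trip ok: " + Arrays.equals(raw, back));
    }
}
```

Naming the charset explicitly avoids depending on the platform default encoding, which is the ambiguity the issue asks the Javadoc to remove.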

> Character set used by Base64 not documented
> -------------------------------------------
>                 Key: CODEC-58
>                 URL:
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.1, 1.2, 1.3, 1.4
>            Reporter: Pepijn Schmitz
>            Priority: Minor
> The Javadoc for the Base64 class does not document which character set is returned by
> encode() and expected by decode(). The RFC specifies "characters", not "bytes" as the result
> of the encoding, and yet Base64 returns bytes. It should provide complete information as to
> how to convert these bytes to and from Strings. I assume the character set used is ASCII,
> but that should be made explicit in the Javadoc.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
