commons-issues mailing list archives

From "Keegan Witt (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (CODEC-89) new Base64().encode() appends a CRLF, and chunks results into 76 character lines
Date Mon, 08 Mar 2010 13:55:27 GMT

    [ https://issues.apache.org/jira/browse/CODEC-89?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842656#action_12842656
] 

Keegan Witt edited comment on CODEC-89 at 3/8/10 1:55 PM:
----------------------------------------------------------

This also happens with the non-empty constructors, which the patch doesn't seem to address.
I think this is what CODEC-94 was trying to say.  If you construct a Base64 specifying lineSeparator
and/or lineLength, those settings are ignored when you do the encoding.  For example,

import java.util.Arrays;
import org.apache.commons.codec.binary.Base64;

// Arrays.toString prints the byte values; concatenating a byte[] directly prints only its identity
System.out.println("\\n: " + Arrays.toString("\n".getBytes()));
System.out.println("\\r\\n: " + Arrays.toString("\r\n".getBytes()));
String unencodedString = "aaaaaaaaaaaaaaaa";
Base64 encoder = new Base64(4, "\n".getBytes());
String encodedString = new String(encoder.encodeBase64(unencodedString.getBytes(), true));
System.out.println(encodedString);
System.out.println(Arrays.toString(encodedString.getBytes()));

You'll note the byte sequence ends with [13, 10] instead of [10], and the line wasn't
split at 4 as specified in the constructor, but at 76.  You can double-check this by
encoding a larger string.
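
For comparison, here is a minimal sketch of the output that a line length of 4 and a bare LF separator ask for. It deliberately uses the JDK's own java.util.Base64 MIME encoder (Java 8+) instead of Commons Codec, so it only illustrates the expected chunking and says nothing about the Codec API. The class name MimeChunkDemo and the helper encodeChunked are made up for this example. Also note that the JDK encoder never appends a trailing separator, which may not match what Codec intends.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class MimeChunkDemo {
    // Hypothetical helper: encode with an explicit line length and line separator.
    static byte[] encodeChunked(byte[] data, int lineLength, byte[] separator) {
        return Base64.getMimeEncoder(lineLength, separator).encode(data);
    }

    public static void main(String[] args) {
        byte[] input = "aaaaaaaaaaaaaaaa".getBytes(StandardCharsets.US_ASCII);
        byte[] encoded = encodeChunked(input, 4, new byte[] {'\n'});
        // Print the encoded text, then the raw byte values.
        System.out.println(new String(encoded, StandardCharsets.US_ASCII));
        System.out.println(Arrays.toString(encoded));
    }
}
```

Each output line is four characters long, lines are separated by a single byte 10, and byte 13 does not appear anywhere in the array.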

I believe at least some of the blame lies with line 817 of Base64.java:
Base64 b64 = isChunked ? new Base64(urlSafe) : new Base64(0, CHUNK_SEPARATOR, urlSafe);
If I request the encoding to be chunked, it will call the constructor public Base64(boolean
urlSafe), which does
this(CHUNK_SIZE, CHUNK_SEPARATOR, urlSafe); (line 244).  As you can see, this overrides any
parameters passed into the constructor.
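
The pattern described above can be sketched with a small stand-in class. This is not the Commons Codec source: Chunker, chunk, and chunkWithDefaults are hypothetical names, and the splitting logic is simplified to plain strings. It only shows that a static helper which builds its own default instance cannot see the settings of whatever instance it is called through.

```java
public class StaticHelperDemo {
    /** Hypothetical stand-in for an encoder with per-instance chunk settings. */
    static final class Chunker {
        static final int DEFAULT_LINE_LENGTH = 76;
        private final int lineLength;

        Chunker(int lineLength) { this.lineLength = lineLength; }
        Chunker() { this(DEFAULT_LINE_LENGTH); }

        /** Instance method: splits the text using this instance's lineLength. */
        String chunk(String text) {
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < text.length(); i += lineLength) {
                out.append(text, i, Math.min(text.length(), i + lineLength)).append('\n');
            }
            return out.toString();
        }

        /** Static helper: builds a fresh default instance, so no caller's settings apply. */
        static String chunkWithDefaults(String text) {
            return new Chunker().chunk(text);
        }
    }

    public static void main(String[] args) {
        Chunker configured = new Chunker(4);
        String text = "aaaaaaaaaaaaaaaa";
        // Honours the configured line length of 4.
        System.out.print(configured.chunk(text));
        // Java allows this call through an instance reference, but it is bound
        // to the class, so the configured line length is not used.
        System.out.print(configured.chunkWithDefaults(text));
    }
}
```

In this sketch the first call prints four lines of four characters, while the second prints the whole string on one line because the default length of 76 applies.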

Or am I misunderstanding how this is supposed to work?

  
> new Base64().encode() appends a CRLF, and chunks results into 76 character lines
> --------------------------------------------------------------------------------
>
>                 Key: CODEC-89
>                 URL: https://issues.apache.org/jira/browse/CODEC-89
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.4
>            Reporter: Julius Davies
>         Attachments: Base64.patch, codec-89.patch
>
>
> The instance encode() method (e.g. new Base64().encode()) appends a CRLF.  Actually it's
> fully chunking the output into 76 character lines.  Commons-Codec-1.3 did not do this.  The
> static Base64.encodeBase64() method behaves the same in both 1.3 and 1.4, so this problem
> only affects the instance encode() method.
> {code}
> import org.apache.commons.codec.binary.*;
> public class B64 {
>   public static void main(String[] args) throws Exception {
>     Base64 b64 = new Base64();
>     String s1 = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
>     String s2 = "aaaaaaaaaa";
>     String s3 = "a";
>     
>     byte[] b1 = s1.getBytes("UTF-8");
>     byte[] b2 = s2.getBytes("UTF-8");
>     byte[] b3 = s3.getBytes("UTF-8");
>     byte[] result;
>     result = Base64.encodeBase64(b1);
>     System.out.println("[" + new String(result, "UTF-8") + "]");
>     result = b64.encode(b1);
>     System.out.println("[" + new String(result, "UTF-8") + "]");
>     result = Base64.encodeBase64(b2);
>     System.out.println("[" + new String(result, "UTF-8") + "]");
>     result = b64.encode(b2);
>     System.out.println("[" + new String(result, "UTF-8") + "]");
>     result = Base64.encodeBase64(b3);
>     System.out.println("[" + new String(result, "UTF-8") + "]");
>     result = b64.encode(b3);
>     System.out.println("[" + new String(result, "UTF-8") + "]");
>   }
> }
> {code}
> Here's my output:
> {noformat}
> $ java -cp commons-codec-1.3.jar:. B64
> [YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYQ==]
> [YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYQ==]
> [YWFhYWFhYWFhYQ==]
> [YWFhYWFhYWFhYQ==]
> [YQ==]
> [YQ==]
> $ java -cp commons-codec-1.4.jar:. B64
> [YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYQ==]
> [YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFh
> YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYQ==
> ]
> [YWFhYWFhYWFhYQ==]
> [YWFhYWFhYWFhYQ==
> ]
> [YQ==]
> [YQ==
> ]
> {noformat}


