commons-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 9931] - Base64 decoder chokes on a whitespace: FASTER?
Date Mon, 27 Jan 2003 09:55:21 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=9931>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=9931

Base64 decoder chokes on a whitespace: FASTER?





------- Additional Comments From m.redington@ucl.ac.uk  2003-01-27 09:55 -------

I've got some timing info for the small base64 test case, and for a largish
(864k) data file.

The inline method was consistently faster in both cases. Given this, I'm not
sure that a large test case is worth slowing down the tests or bulking up the
tree, although it might be nice to keep around somehow ... I'm still working out
how to run single junit testcases conveniently from the command line (if you do
want large tests, I do have some code for this).

Typical figures for the large sample on my machine are: 125/147/517 (ms.,
min/mean/max) for the inline method versus 124/237/601 for the old method
(average of 1000 calls to encode). These are the *least* favourable figures
(fastest old/slowest new) from four runs of each. Typically inlining seemed to
almost halve the time taken. With the small sample, .015 vs .019 was average
(total ms. for 1000 calls to encode).
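
For reference, a minimal sketch of how min/mean/max figures like these could be
collected. This is not the harness used for the numbers above: the class name
EncodeTiming is hypothetical, the random buffer is only a stand-in for the 864k
data file, and java.util.Base64 (Java 8 and later) stands in for the commons
Base64 class under discussion.

```java
import java.util.Base64;
import java.util.Random;

/** Hypothetical timing harness; an illustration, not the original test code. */
final class EncodeTiming {

    /** Runs the task the given number of times; returns {min, mean, max} in ms. */
    static double[] time(Runnable task, int calls) {
        if (calls <= 0) {
            throw new IllegalArgumentException("calls must be positive");
        }
        long min = Long.MAX_VALUE;
        long max = 0L;
        long total = 0L;
        for (int i = 0; i < calls; i++) {
            long start = System.nanoTime();
            task.run();
            long elapsed = System.nanoTime() - start;
            min = Math.min(min, elapsed);
            max = Math.max(max, elapsed);
            total += elapsed;
        }
        double nanosPerMilli = 1_000_000.0;
        return new double[] {
            min / nanosPerMilli,
            (total / (double) calls) / nanosPerMilli,
            max / nanosPerMilli
        };
    }

    public static void main(String[] args) {
        // Assumption: random bytes of roughly the same size as the large sample.
        byte[] sample = new byte[864 * 1024];
        new Random(42).nextBytes(sample);

        // Assumption: the JDK MIME encoder stands in for the encoder being measured.
        double[] r = time(() -> Base64.getMimeEncoder().encode(sample), 1000);
        System.out.printf("%.0f/%.0f/%.0f (ms, min/mean/max)%n", r[0], r[1], r[2]);
    }
}
```

Timing each call separately, rather than dividing one total by the call count,
is what makes the min and max visible; the first few calls will usually be the
slowest because of JIT warm-up.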

A few other points:

1) my patch doesn't terminate the output with a newline, unless it ends on a
chunk boundary by chance. The Perl base64 encoder seems to add a trailing
newline, and the RFC implies one, so that is probably the way to go. It should be
easy, but make sure you don't add two newlines by mistake. Let me know if you'd
like a patch for this.
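
One way to avoid the double newline, as a rough sketch only: terminate every
line, including the last, inside the same loop, so there is no separate "append
a final newline" step that could fire twice. The class and method names here are
hypothetical, and this operates on already-encoded text rather than being the
patch itself.

```java
/** Hypothetical helper; an illustration, not the patch discussed here. */
final class EncodeChunker {

    /**
     * Splits already-encoded text into lines of at most lineLength characters.
     * Every line, including the last one, is terminated by exactly one newline.
     */
    static String chunk(String encoded, int lineLength) {
        if (lineLength <= 0) {
            throw new IllegalArgumentException("lineLength must be positive");
        }
        StringBuilder out = new StringBuilder(encoded.length() + encoded.length() / lineLength + 1);
        for (int i = 0; i < encoded.length(); i += lineLength) {
            int end = Math.min(i + lineLength, encoded.length());
            out.append(encoded, i, end);
            // The only place a newline is written, so it cannot be doubled.
            out.append('\n');
        }
        return out.toString();
    }
}
```

With this shape, input that ends exactly on a chunk boundary and input that
does not are handled by the same code path.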

BTW, Base64Test.assertEquals(byte[] a, byte[] b) seems to be broken - it only
compares the arrays up to a.length, and doesn't check that they have the same size.
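
A sketch of what a stricter comparison might look like, assuming the intent is
"same length and same bytes". The class name is hypothetical and it throws
AssertionError directly so that it stands alone without junit; in a real test it
would presumably call junit's fail() instead.

```java
/** Hypothetical replacement for the test helper; not the code in Base64Test. */
final class ByteArrayAssert {

    /** Fails unless both arrays have the same length and identical contents. */
    static void assertEquals(byte[] expected, byte[] actual) {
        // Check the size first, so a longer or shorter array cannot slip through.
        if (expected.length != actual.length) {
            throw new AssertionError("length mismatch: expected " + expected.length
                    + " but was " + actual.length);
        }
        for (int i = 0; i < expected.length; i++) {
            if (expected[i] != actual[i]) {
                throw new AssertionError("byte " + i + " differs: expected "
                        + expected[i] + " but was " + actual[i]);
            }
        }
    }
}
```

java.util.Arrays.equals(byte[], byte[]) performs the same length-plus-contents
check, if a plain boolean is enough.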

2) decode is *slow*. In my timings, decoding the large file took three or four
times as long as encoding, probably due to the Vectors. It should be a lot
faster to brute force copy all non-whitespace chars into a new array, and then
resize this at the end, if inlining is too tricky.
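
The brute-force approach could look roughly like this. It is a sketch under
assumptions: the class and method names are hypothetical, and "whitespace" is
taken to mean space, tab, CR and LF only.

```java
import java.util.Arrays;

/** Hypothetical helper; an illustration, not the existing decoder code. */
final class WhitespaceStripper {

    /** Copies all non-whitespace bytes into a new array, then trims it to size. */
    static byte[] stripWhitespace(byte[] data) {
        // Worst case: no whitespace at all, so the input length is always enough.
        byte[] out = new byte[data.length];
        int n = 0;
        for (byte b : data) {
            if (b != ' ' && b != '\t' && b != '\r' && b != '\n') {
                out[n++] = b;
            }
        }
        // Single resize at the end instead of growing a Vector element by element.
        return Arrays.copyOf(out, n);
    }
}
```

The result can then be handed to a decoder that expects no embedded whitespace.
This costs one pass over the input and at most two array allocations.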

I nearly jumped at this one, but then I searched for Base64 on search.apache.org.

There seem to be dozens of instances of this code out there, a few of which may
have inlined whitespace stripping (e.g. in tomcat, maybe), but most of which
seem to have come from the same original codebase. Isn't this exactly what
commons is for?

I guess that leaves the question: should xmlrpc patch its own copy and pass the
changes up to commons, or should xmlrpc start using the commons classes?

--
To unsubscribe, e-mail:   <mailto:commons-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:commons-dev-help@jakarta.apache.org>

