commons-dev mailing list archives

From "A. Rothman" <>
Subject [FileUpload] various issues - encoding, RFC compliance, and tweaks
Date Thu, 18 May 2006 11:22:48 GMT


I'm considering moving to FileUpload for uploaded files handling.
I've gone over the code, and found various issues (RFC compliance or 
just little implementation tweaks) I figured I'd mention here before 
opening bugs:

1. according to the RFC (1522, since obsoleted by RFC 2047), non-ASCII 
header text uses encoded-word syntax (=?...?=). I didn't find this 
implemented in FileUpload (MultipartStream ?)
2. FileUploadBase.parseHeaders() does not handle header folding (also in 
3. FileUploadBase.parseHeaders() calls header.indexOf(':') 3 times; it 
could call it once and save the value (each call scans the string 
characters again).
4. where does the 1024 byte max header size limit come from (RFCs or 
just reasonable value)?
5. content encoding is not respected as defined in RFC - if a request 
encoding (charset) is specified, it should be used in parsing all form 
values. Currently each FileItem value must be retrieved with the 
explicit encoding (which is taken from the request). I've seen this 
reported also within other apache projects 

- the comment stands out).

6. further, the charset does seem to be used in parsing the headers - 
isn't this non-RFC behavior? from what I understand, anything that's 
non-ASCII within the headers themselves should be word-encoded (see 
issue #1), and the content-type charset should be used on the content, 
not the headers...

7. MultipartStream.readHeaders() - uses a one-byte array instead of 
single byte, for no apparent reason.
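
To make issues 2 and 3 concrete, here is a minimal sketch of a header parser that unfolds continuation lines and looks up the colon once per line. It is not the FileUpload code: the class name HeaderParseSketch and the method shape are made up for illustration, and it assumes the header block arrives as a single CRLF-delimited String.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration; not the actual FileUploadBase implementation.
final class HeaderParseSketch {

    /** Parses a CRLF-delimited header block, unfolding continuation lines. */
    static Map<String, String> parseHeaders(String headerPart) {
        Map<String, String> headers = new LinkedHashMap<>();
        // Unfold: a CRLF followed by spaces or tabs continues the previous line.
        String unfolded = headerPart.replaceAll("\r\n[ \t]+", " ");
        for (String line : unfolded.split("\r\n")) {
            if (line.isEmpty()) {
                continue; // blank line that terminates the header block
            }
            int colon = line.indexOf(':'); // computed once, then reused
            if (colon <= 0) {
                continue; // no header name before the colon; skip the line
            }
            String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            headers.put(name, value);
        }
        return headers;
    }
}
```

With this sketch, a folded Content-Disposition header split across two lines comes back as one value. Repeated header names simply overwrite each other here, which a real implementation would probably want to handle differently.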

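For issue 1, the encoded-word decoding I have in mind looks roughly like the sketch below. Again this is only an illustration under assumptions: the class EncodedWordSketch is hypothetical, it assumes Java 8+ for java.util.Base64, and it skips some details of the RFC (for example, dropping whitespace between adjacent encoded words, and the length limits).

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of encoded-word decoding; not part of FileUpload.
final class EncodedWordSketch {

    // General shape: =?charset?encoding?encoded-text?=
    private static final Pattern ENCODED_WORD =
        Pattern.compile("=\\?([^?\\s]+)\\?([bBqQ])\\?([^?\\s]*)\\?=");

    /** Replaces every encoded word in the value with its decoded text. */
    static String decode(String value) throws UnsupportedEncodingException {
        Matcher m = ENCODED_WORD.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String charset = m.group(1);
            byte[] raw = m.group(2).equalsIgnoreCase("B")
                ? Base64.getDecoder().decode(m.group(3)) // "B" is base64
                : decodeQ(m.group(3));                   // "Q" is quoted-printable-like
            m.appendReplacement(out,
                Matcher.quoteReplacement(new String(raw, charset)));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Decodes "Q" text: '_' is a space, "=XX" is a hex-encoded byte. */
    private static byte[] decodeQ(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '_') {
                bytes.write(' ');
            } else if (c == '=' && i + 2 < text.length()) {
                bytes.write(Integer.parseInt(text.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                bytes.write(c); // remaining characters are plain ASCII
            }
        }
        return bytes.toByteArray();
    }
}
```

Text outside encoded words passes through unchanged, so plain ASCII header values are not affected. Malformed base64 or hex input surfaces as an unchecked exception in this sketch rather than being handled gracefully.
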
Please let me know which of these should have bugs opened, and/or point 
out what I've misunderstood :-)


