hc-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HTTPCLIENT-1372) Content-Disposition header in form data does not adhere to RFC6266
Date Mon, 17 Jun 2013 08:14:20 GMT

    [ https://issues.apache.org/jira/browse/HTTPCLIENT-1372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685084#comment-13685084 ]

Karl Wright commented on HTTPCLIENT-1372:
-----------------------------------------

The class HttpMultipart.java has the following logic for multipart form parts:

{code}
// Content-Disposition is always written for the part.
MinimalField cd = part.getHeader().getField(MIME.CONTENT_DISPOSITION);
writeField(cd, this.charset, out);
String filename = part.getBody().getFilename();
// Content-Type is written only when the part body has a filename.
if (filename != null) {
    MinimalField ct = part.getHeader().getField(MIME.CONTENT_TYPE);
    writeField(ct, this.charset, out);
}
{code}

This means that a Content-Type header is written *only* for form parts that have a filename
set.  Unfortunately, while COMPATIBLE mode lets me get the filename itself decoded correctly,
the remaining form fields are sent without a Content-Type, so I lose the ability to get the
rest of the form decoded correctly.  I'm not convinced this is intentional behavior, either.
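
To make the effect concrete, here is a minimal, self-contained sketch that models the conditional
above.  It does not use the HttpMime classes: PartHeaderSketch and headersFor are hypothetical names
invented for this illustration, and the header strings are simplified assumptions rather than the
library's exact output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the header-writing step; not part of HttpMime.
final class PartHeaderSketch {

    // Returns the header lines that would be emitted for one form part,
    // following the same rule as the quoted code: Content-Disposition always,
    // Content-Type only when a filename is present.
    static List<String> headersFor(String name, String filename, String contentType) {
        List<String> headers = new ArrayList<>();
        StringBuilder cd = new StringBuilder("Content-Disposition: form-data; name=\"")
                .append(name).append('"');
        if (filename != null) {
            cd.append("; filename=\"").append(filename).append('"');
        }
        headers.add(cd.toString());
        if (filename != null) {
            // Only file parts get a Content-Type (and therefore a charset).
            headers.add("Content-Type: " + contentType);
        }
        return headers;
    }

    public static void main(String[] args) {
        // A plain text field: its Content-Type is dropped.
        System.out.println(headersFor("title", null, "text/plain; charset=UTF-8"));
        // A file part: both headers are kept.
        System.out.println(headersFor("file", "report.txt", "text/plain; charset=UTF-8"));
    }
}
```

In this model the "title" field ends up with a single header line, so a receiver has no declared
charset to decode its value with.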

Since this code is in HttpMultipart, I cannot see any way of overriding this behavior in 4.2.x
other than by overriding all the public methods of this class in a new ModifiedMultiPartEntity
class that basically does everything for form processing.  Oleg, do you see any simpler way?

                
> Content-Disposition header in form data does not adhere to RFC6266
> ------------------------------------------------------------------
>
>                 Key: HTTPCLIENT-1372
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1372
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpMime
>    Affects Versions: 4.2.5
>            Reporter: Karl Wright
>             Fix For: 4.3 Beta3
>
>
> The Content-Disposition header, as it appears for an item of form data, does not allow
> for UTF-8 encoding as specified in RFC6266, as described here:
> http://tools.ietf.org/html/rfc6266
> This is causing ManifoldCF severe problems working in Japan with Solr, since Solr content
> extraction relies on accurate filenames in order to determine the likely document encoding.
> A fix for the 4.2.x branch will be needed, I am afraid.
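
For reference, RFC 6266 expresses non-ASCII filenames through the "filename*" parameter, whose
value uses the RFC 5987 extended notation (charset, two single quotes, then percent-encoded UTF-8
bytes).  The following is a rough sketch of that encoding only.  Rfc5987Sketch and its methods are
hypothetical names, this is not the fix applied in HttpClient, and whether a given server accepts
"filename*" in form data is an assumption that would need checking.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating RFC 5987 ext-value encoding; not HttpMime code.
final class Rfc5987Sketch {

    // Non-alphanumeric characters that RFC 5987 allows unencoded (attr-char).
    private static final String ATTR_CHARS = "!#$&+-.^_`|~";

    // Encodes a value as UTF-8''<pct-encoded>, leaving attr-char bytes as-is.
    static String encodeExtValue(String value) {
        StringBuilder sb = new StringBuilder("UTF-8''");
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum || (c < 0x80 && ATTR_CHARS.indexOf(c) >= 0)) {
                sb.append((char) c);
            } else {
                sb.append('%').append(String.format("%02X", c));
            }
        }
        return sb.toString();
    }

    // Builds a form-data Content-Disposition line carrying filename*.
    static String contentDisposition(String fieldName, String filename) {
        return "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename*=" + encodeExtValue(filename);
    }

    public static void main(String[] args) {
        // A Japanese filename, written with Unicode escapes to keep the source ASCII-only.
        System.out.println(contentDisposition("file", "\u65E5\u672C\u8A9E.txt"));
    }
}
```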
