jackrabbit-oak-issues mailing list archives

From "Matt Ryan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-8013) [Direct Binary Access] DataRecordDownloadOptions creates invalid Content-Disposition headers
Date Fri, 01 Feb 2019 03:32:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-8013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757930#comment-16757930 ]

Matt Ryan commented on OAK-8013:

It appears that this will unfortunately be much more involved than it looks at first glance.

We have a [test|https://github.com/apache/jackrabbit-oak/blob/7cceae21eb22f52fc09cc622402b396953268998/oak-blob-plugins/src/test/java/org/apache/jackrabbit/oak/plugins/blob/datastore/directaccess/AbstractDataRecordAccessProviderTest.java#L144]
we can extend to verify this by simply modifying the [validation|https://github.com/apache/jackrabbit-oak/blob/7cceae21eb22f52fc09cc622402b396953268998/oak-blob-plugins/src/test/java/org/apache/jackrabbit/oak/plugins/blob/datastore/directaccess/AbstractDataRecordAccessProviderTest.java#L171]
of the Content-Disposition header in that test.

So that's where I started, and fixing this in S3DataStore was pretty easy. We are already
using a filename in the test of "album cover.png", so if that was encoded properly in the
header it should appear like this:
{noformat}
inline; filename="album cover.png"; filename*=UTF-8''album%20cover.png
{noformat}

With that change made, the signed URI test for S3 works as expected and the enhanced validation
shows that the encoded filename is included in the Content-Disposition header in the response
as expected.
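
As a rough illustration of the encoding described above (not the actual Oak change), the filename* value can be percent-encoded along the lines of RFC 5987. The class and method names here are hypothetical, and the sketch assumes the plain filename part contains no quotes, backslashes, or non-ASCII characters.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Oak: builds a Content-Disposition value
// with an RFC 5987 style percent-encoded filename* parameter.
class ContentDispositionSketch {
    // Non-alphanumeric characters allowed unencoded (attr-char in RFC 5987).
    private static final String ATTR_CHAR_SYMBOLS = "!#$&+-.^_`|~";

    // Percent-encode every UTF-8 byte that is not an attr-char.
    static String encodeExtValue(String value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum || ATTR_CHAR_SYMBOLS.indexOf(c) >= 0) {
                sb.append((char) c);
            } else {
                sb.append(String.format("%%%02X", c));
            }
        }
        return sb.toString();
    }

    // Assumption: fileName is safe to place inside a quoted-string as-is.
    static String contentDisposition(String type, String fileName) {
        return type + "; filename=\"" + fileName + "\"; filename*=UTF-8''"
                + encodeExtValue(fileName);
    }
}
```

Calling contentDisposition("inline", "album cover.png") with this sketch produces the header value shown above.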

For Azure, it's not so simple.

Azure appears to automatically encode the entire URI string, and it appears to do this AFTER
the signing happens but BEFORE returning the signed URI.  The behavior I saw was this:

* If I leave the code as it was before, without encoding the filename* part of the requested
content disposition, all the tests work as before (we expected this).
* If I add the encoding for the filename* part of the requested content disposition, the test
now fails with a 403 error.  The response from Azure reports that the signature does not
match.

If you look at the signed URI, you can see that they have encoded the filename* part again,
so now it looks like:

If you manually change that portion of the URI back to "album%20cover.png" the request succeeds
with a 200 - but with the value in the Content-Disposition header formatted incorrectly (with
a space in the filename* part instead of being encoded).

We aren't in the code path once the signed URI is returned, so we can't fix the Content-Disposition
header any other way.  So far I haven't found a way to tell Azure's API to not encode that
value twice.  This appears to be a bug in Azure's API.

What *does* work is a bit of a hack.  If you encode the filename* portion TWICE, then sign
the URI, then find that portion in the signed URI and DECODE the filename* portion, this seems
to work.  Pretty ugly.
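
To make the string handling in that hack concrete, here is a minimal sketch of only the encode-twice and decode-once steps. It does not call the Azure SDK; the class, the method names, and the example URI in the usage note are all hypothetical, and the signing step itself is left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the string transformations only.
class DoubleEncodingSketch {
    // Percent-encode everything except URI unreserved characters.
    static String percentEncode(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                sb.append((char) c);
            } else {
                sb.append(String.format("%%%02X", c));
            }
        }
        return sb.toString();
    }

    // Undo exactly one level of percent-encoding. Assumes ASCII input
    // with well-formed %XX escapes.
    static String percentDecodeOnce(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == '%' && i + 2 < s.length()) {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(ch);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Find the given encoded portion in a URI and decode it once in place.
    static String decodePortionInUri(String uri, String encodedPortion) {
        int at = uri.indexOf(encodedPortion);
        if (at < 0) {
            return uri; // portion not present, leave the URI untouched
        }
        return uri.substring(0, at) + percentDecodeOnce(encodedPortion)
                + uri.substring(at + encodedPortion.length());
    }
}
```

With this sketch, percentEncode(percentEncode("album cover.png")) yields "album%2520cover.png", and decoding that portion once inside a made-up URI such as "https://example.invalid/blob?cd=album%2520cover.png&sig=abc" restores "album%20cover.png" there.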

Another way to address this would be to simply not include the filename* portion of the Content-Disposition
header for Azure.  This would mean that filenames with special encodings would not be supported

> [Direct Binary Access] DataRecordDownloadOptions creates invalid Content-Disposition
> --------------------------------------------------------------------------------------------
>                 Key: OAK-8013
>                 URL: https://issues.apache.org/jira/browse/OAK-8013
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: blob-plugins
>            Reporter: Alexander Klimetschek
>            Assignee: Matt Ryan
>            Priority: Major
> DataRecordDownloadOptions always adds the extended parameter filename* to the header, [without any escaping|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/datastore/directaccess/DataRecordDownloadOptions.java#L130].
> Such extended parameters may contain only a small predefined set of basic ASCII characters, which does not include the space; anything else has to be percent-encoded. The RFC is https://tools.ietf.org/html/rfc5987; note the definition of value-chars in the grammar.
> Because of this, if a filename includes a space or another character that must be percent-encoded, this currently creates an invalid header that other clients fail to parse.
> See also https://github.com/jshttp/content-disposition/issues/24
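
For reference, the value-chars rule mentioned in the issue description can be checked with a few lines of code. This is a minimal sketch based on the RFC 5987 grammar (value-chars is a sequence of pct-encoded triplets and attr-char characters); the class and method names are hypothetical and not part of Oak.

```java
// Hypothetical validator for the value-chars production of RFC 5987.
class ExtValueCheck {
    // Non-alphanumeric characters permitted by attr-char.
    private static final String ATTR_CHAR_SYMBOLS = "!#$&+-.^_`|~";

    static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    static boolean isAttrChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || ATTR_CHAR_SYMBOLS.indexOf(c) >= 0;
    }

    // True if s consists only of attr-chars and well-formed %XX triplets.
    static boolean isValidValueChars(String s) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%') {
                if (i + 2 >= s.length() || !isHex(s.charAt(i + 1)) || !isHex(s.charAt(i + 2))) {
                    return false; // truncated or malformed escape
                }
                i += 3;
            } else if (isAttrChar(c)) {
                i++;
            } else {
                return false; // e.g. a raw space
            }
        }
        return true;
    }
}
```

Under this check, "album cover.png" is rejected because of the raw space, while "album%20cover.png" is accepted.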

This message was sent by Atlassian JIRA