couchdb-dev mailing list archives

From "Filipe Manana (JIRA)" <>
Subject [jira] Commented: (COUCHDB-583) adding ?compression=(gzip|deflate) optional parameter to the attachment download API
Date Tue, 22 Dec 2009 09:52:29 GMT


Filipe Manana commented on COUCHDB-583:

Hi Adam,

I see. I am just thinking about how to do it without causing backward incompatibility.

Currently couch_file:append_term[_md5]/2 calls term_to_binary and prefixes the result with a 32-bit
header and an optional md5 digest. The 32-bit header + optional term md5 + term_to_binary(Term)
is then appended to the end of the DB file.

The high-order bit of this header indicates whether an md5 hash follows the header (preceding
the serialized term).
Without having looked deeply into the code, I am thinking of adding an extra bit to the header
that indicates whether the term is compressed.
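
To make the layout concrete, here is a minimal Python sketch of the record framing described above, with the proposed compression flag added. It is not CouchDB's code: the payload is plain bytes standing in for term_to_binary(Term), zlib stands in for whatever compressor would actually be used, and three details are assumptions made for illustration: the new flag sits in the second-highest bit, the remaining 30 bits hold the payload length, and the md5 is computed over the stored (possibly compressed) bytes.

```python
import hashlib
import struct
import zlib

MD5_FLAG = 0x80000000         # high-order bit: an md5 digest follows the header
COMPRESSED_FLAG = 0x40000000  # hypothetical new bit: the payload is compressed
LENGTH_MASK = 0x3FFFFFFF      # assumption: the other 30 bits hold the payload length


def encode_record(payload: bytes, with_md5: bool = False, compress: bool = False) -> bytes:
    """Frame a payload as: 32-bit header + optional md5 + payload."""
    flags = 0
    if compress:
        payload = zlib.compress(payload)
        flags |= COMPRESSED_FLAG
    if len(payload) > LENGTH_MASK:
        raise ValueError("payload too large for the length field")
    digest = b""
    if with_md5:
        # Assumption: the digest covers the bytes as stored on disk.
        digest = hashlib.md5(payload).digest()
        flags |= MD5_FLAG
    return struct.pack(">I", flags | len(payload)) + digest + payload


def decode_record(record: bytes) -> bytes:
    """Reverse encode_record, checking the md5 when one is present."""
    (header,) = struct.unpack(">I", record[:4])
    length = header & LENGTH_MASK
    offset = 4
    digest = None
    if header & MD5_FLAG:
        digest = record[offset:offset + 16]
        offset += 16
    payload = record[offset:offset + length]
    if digest is not None and hashlib.md5(payload).digest() != digest:
        raise ValueError("md5 mismatch")
    if header & COMPRESSED_FLAG:
        payload = zlib.decompress(payload)
    return payload
```

A reader that predates the new flag would treat that bit as part of the length, which is why a format change like this needs a new DB header version.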

Of course this implies adding a new DB header version value, etc. It is not as straightforward as
attachment compression.

An (ugly) alternative I see is to not add a new header bit and instead, when reading a serialized
term from the DB file, always try to gunzip it and catch the exception:

3> catch(zlib:gunzip(<<"hello world">>)).

The issue is that a data_error exception does not necessarily mean that the data was never gzip
compressed (corrupted compressed data could presumably raise it too).
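
The ambiguity can be illustrated with a short Python sketch of the same try-and-catch approach. This is only an analogy to the Erlang call above: `read_maybe_gzipped` is a hypothetical helper, and Python's gzip module raises different exception types than Erlang's zlib does.

```python
import gzip
import zlib


def read_maybe_gzipped(blob: bytes) -> bytes:
    """Try to gunzip; on any decompression failure, assume the data was stored uncompressed."""
    try:
        return gzip.decompress(blob)
    except (OSError, EOFError, zlib.error):
        # Reached for plain data, but also for damaged gzip data.
        return blob


plain = b"hello world"
compressed = gzip.compress(b"x" * 100)
truncated = compressed[:-4]  # simulate on-disk corruption of a compressed record

result_plain = read_maybe_gzipped(plain)
result_compressed = read_maybe_gzipped(compressed)
result_truncated = read_maybe_gzipped(truncated)
```

Here `result_truncated` is the damaged compressed bytes handed back as if they were a valid uncompressed term, so the corruption goes unnoticed at this layer. An explicit header bit avoids this guesswork.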

If using an extra header bit, then why not add a few more bits reserved for future
features? It is a bit like what most protocol RFCs do: they reserve a few bits in a header for
future use :)
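
As a rough sketch of the reserved-bits idea, a reader could require reserved bits to be zero and refuse records that set them, so that files written by a newer format fail loudly instead of being misread. The widths here are assumptions: four flag bits (md5, a hypothetical compressed bit, two reserved) and a 28-bit length.

```python
import struct

MD5_FLAG = 1 << 31                     # existing flag: md5 digest follows the header
COMPRESSED_FLAG = 1 << 30              # hypothetical: term is compressed
RESERVED_MASK = (1 << 29) | (1 << 28)  # hypothetical reserved bits, must be zero
LENGTH_MASK = (1 << 28) - 1            # assumption: low 28 bits hold the length


def parse_header(raw: bytes) -> dict:
    """Decode a 32-bit big-endian header, rejecting unknown reserved bits."""
    (header,) = struct.unpack(">I", raw[:4])
    if header & RESERVED_MASK:
        raise ValueError("reserved header bits are set; written by a newer format?")
    return {
        "md5": bool(header & MD5_FLAG),
        "compressed": bool(header & COMPRESSED_FLAG),
        "length": header & LENGTH_MASK,
    }
```

The cost is a smaller maximum term size, since every reserved bit is taken from the length field.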

What's your opinion?


> adding ?compression=(gzip|deflate) optional parameter to the attachment download API
> ------------------------------------------------------------------------------------
>                 Key: COUCHDB-583
>                 URL:
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: HTTP Interface
>         Environment: CouchDB trunk revision 885240
>            Reporter: Filipe Manana
>         Attachments: couchdb-583-trunk-3rd-try.patch, couchdb-583-trunk-4th-try-trunk.patch,
> couchdb-583-trunk-5th-try.patch, couchdb-583-trunk-6th-try.patch, jira-couchdb-583-1st-try-trunk.patch,
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> The following new feature is added by the patch that follows the creation of this ticket.
> A new optional HTTP query parameter "compression" is added to the attachments API.
> This parameter can have one of the values "gzip" or "deflate".
> When asking for an attachment (HTTP GET request), if the query parameter "compression"
> is found, CouchDB will send the attachment compressed to the client (and set the Content-Encoding
> header to gzip or deflate).
> Further, it adds a new config option "treshold_for_chunking_comp_responses" (httpd section)
> that specifies an attachment length threshold. If an attachment's length is >= this
> threshold, the HTTP response will be chunked (as well as compressed).
> Note that using non-chunked compressed body responses requires storing all the compressed
> blocks in memory and then sending each one to the client. This is a necessary "evil", as we
> only know the length of the compressed body after compressing the whole body, and we need to
> set the "Content-Length" header for non-chunked responses. By sending chunked responses, we
> can send each compressed block immediately, without accumulating all of them in memory.
> Examples:
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=gzip
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=deflate
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt   # attachment will not be compressed
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=rar   # will give a 500 error code
> Etap test case included.
> Feedback would be very welcome.
> cheers
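
The trade-off between buffered and chunked compressed responses in the quoted description can be sketched in Python as follows. This is not the patch's Erlang code: the function names and the dict-based headers are hypothetical, and "deflate" is taken to mean the zlib container, as HTTP defines it.

```python
import zlib
from typing import Dict, Iterable, Iterator, Tuple


def _compressor(encoding: str):
    # wbits=31 selects the gzip container, wbits=15 the zlib container.
    if encoding == "gzip":
        return zlib.compressobj(wbits=31)
    if encoding == "deflate":
        return zlib.compressobj(wbits=15)
    raise ValueError(f"unsupported compression: {encoding!r}")


def compress_chunks(blocks: Iterable[bytes], encoding: str) -> Iterator[bytes]:
    """Yield compressed output as soon as the compressor produces any."""
    comp = _compressor(encoding)
    for block in blocks:
        out = comp.compress(block)
        if out:
            yield out
    tail = comp.flush()
    if tail:
        yield tail


def buffered_response(blocks: Iterable[bytes], encoding: str) -> Tuple[Dict[str, str], bytes]:
    """Non-chunked: the whole compressed body is held in memory to learn its length."""
    body = b"".join(compress_chunks(blocks, encoding))
    return {"Content-Encoding": encoding, "Content-Length": str(len(body))}, body


def chunked_response(blocks: Iterable[bytes], encoding: str) -> Tuple[Dict[str, str], Iterator[bytes]]:
    """Chunked: no Content-Length, so each compressed block can be sent immediately."""
    headers = {"Content-Encoding": encoding, "Transfer-Encoding": "chunked"}
    return headers, compress_chunks(blocks, encoding)
```

A caller would pick `chunked_response` once the attachment length reaches the configured threshold and `buffered_response` below it.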

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
