Return-Path:
Delivered-To: apmail-couchdb-dev-archive@www.apache.org
Received: (qmail 72445 invoked from network); 22 Dec 2009 10:08:23 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 22 Dec 2009 10:08:23 -0000
Received: (qmail 68594 invoked by uid 500); 22 Dec 2009 10:06:43 -0000
Delivered-To: apmail-couchdb-dev-archive@couchdb.apache.org
Received: (qmail 67153 invoked by uid 500); 22 Dec 2009 10:06:38 -0000
Mailing-List: contact dev-help@couchdb.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@couchdb.apache.org
Delivered-To: mailing list dev@couchdb.apache.org
Received: (qmail 54984 invoked by uid 99); 22 Dec 2009 09:52:53 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Dec 2009 09:52:53 +0000
X-ASF-Spam-Status: No, hits=-1998.5 required=10.0 tests=ALL_TRUSTED,WEIRD_PORT
X-Spam-Check-By: apache.org
Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Dec 2009 09:52:51 +0000
Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id 69D95234C045 for ; Tue, 22 Dec 2009 01:52:29 -0800 (PST)
Message-ID: <938930242.1261475549419.JavaMail.jira@brutus>
Date: Tue, 22 Dec 2009 09:52:29 +0000 (UTC)
From: "Filipe Manana (JIRA)"
To: dev@couchdb.apache.org
Subject: [jira] Commented: (COUCHDB-583) adding ?compression=(gzip|deflate) optional parameter to the attachment download API
In-Reply-To: <844257352.1259509400633.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

[
https://issues.apache.org/jira/browse/COUCHDB-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793569#action_12793569 ]

Filipe Manana commented on COUCHDB-583:
---------------------------------------

Hi Adam,

I see. I am just thinking about how to do it without breaking backward compatibility.

Currently couch_file:append_term[_md5]/2 calls term_to_binary and prefixes the result with a 32-bit header and an optional md5 digest. The 32-bit header + optional term md5 + term_to_binary(Term) is then appended to the end of the DB file. The high-order bit of this header indicates whether an md5 hash follows the header (and precedes the serialized term).

Without looking deeply into the code, I am thinking about adding an extra bit to the header that indicates whether the term is compressed. Of course this implies adding a new DB header version value, etc. It is not as straightforward as attachment compression.

An (ugly) alternative I see is to not add a new header bit and, when reading a serialized term from the DB file, always gunzip it and catch the exception:

    3> catch(zlib:gunzip(<<"hello world">>)).
    {'EXIT',{data_error,[{zlib,call,3},
                         {zlib,inflate,2},
                         {zlib,gunzip,1},
                         {erl_eval,do_apply,5},
                         {erl_eval,expr,5},
                         {shell,exprs,6},
                         {shell,eval_exprs,6},
                         {shell,eval_loop,3}]}}

The issue is that a data_error exception does not necessarily mean that the data is not gzip compressed.

If we use an extra header bit, then why not add a few more bits reserved for future features? A little bit like most protocol RFCs do: they reserve a few bits in a header for future use :)

What's your opinion?

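To make the header-bit idea concrete, here is a minimal sketch of what such a layout could look like. Everything in it is an assumption for illustration: the module name term_header_sketch, the bit positions, the number of reserved bits and the 26-bit length field are made up and are not what couch_file does today. It only uses standard calls (term_to_binary/1, binary_to_term/1, erlang:md5/1, zlib:gzip/1, zlib:gunzip/1).

```erlang
-module(term_header_sketch).
-export([encode/2, decode/1]).

%% Hypothetical 32-bit header layout (an assumption, not CouchDB's format):
%%   1 bit   md5 digest follows the header
%%   1 bit   payload is gzip compressed (the proposed new flag)
%%   4 bits  reserved for future features, always written as zero
%%   26 bits payload length in bytes (larger payloads would not fit)

%% Opts is a proplist; the atoms 'md5' and 'compressed' enable each feature.
encode(Term, Opts) ->
    Compress = proplists:get_bool(compressed, Opts),
    WithMd5 = proplists:get_bool(md5, Opts),
    Raw = term_to_binary(Term),
    Payload = case Compress of
        true -> zlib:gzip(Raw);
        false -> Raw
    end,
    Digest = case WithMd5 of
        true -> erlang:md5(Payload);
        false -> <<>>
    end,
    Md5Bit = flag(WithMd5),
    CompBit = flag(Compress),
    Len = byte_size(Payload),
    <<Md5Bit:1, CompBit:1, 0:4, Len:26, Digest/binary, Payload/binary>>.

%% The reader trusts the flag bits instead of guessing from a gunzip failure.
decode(<<Md5Bit:1, CompBit:1, _Reserved:4, Len:26, Rest/binary>>) ->
    {Digest, Rest1} = split_digest(Md5Bit, Rest),
    <<Payload:Len/binary, _/binary>> = Rest1,
    ok = check_md5(Digest, Payload),
    Raw = case CompBit of
        1 -> zlib:gunzip(Payload);
        0 -> Payload
    end,
    binary_to_term(Raw).

flag(true) -> 1;
flag(false) -> 0.

%% An md5 digest is always 16 bytes.
split_digest(1, <<Digest:16/binary, Rest/binary>>) -> {Digest, Rest};
split_digest(0, Rest) -> {none, Rest}.

check_md5(none, _Payload) -> ok;
check_md5(Digest, Payload) ->
    case erlang:md5(Payload) of
        Digest -> ok;
        _ -> erlang:error(md5_mismatch)
    end.
```

With this, term_header_sketch:decode(term_header_sketch:encode(Term, [compressed, md5])) should give back Term, and a data_error from zlib:gunzip/1 can then only mean a damaged payload, since the reader never tries to gunzip data whose flag bit is not set.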
cheers

> adding ?compression=(gzip|deflate) optional parameter to the attachment download API
> ------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-583
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-583
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: HTTP Interface
>         Environment: CouchDB trunk revision 885240
>            Reporter: Filipe Manana
>         Attachments: couchdb-583-trunk-3rd-try.patch, couchdb-583-trunk-4th-try-trunk.patch, couchdb-583-trunk-5th-try.patch, couchdb-583-trunk-6th-try.patch, jira-couchdb-583-1st-try-trunk.patch, jira-couchdb-583-2nd-try-trunk.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The following new feature is added in the patch attached after this ticket's creation.
> A new optional HTTP query parameter "compression" is added to the attachments API.
> This parameter can have one of the values "gzip" or "deflate".
> When an attachment is requested (HTTP GET request) and the query parameter "compression" is present, CouchDB sends the attachment compressed to the client and sets the Content-Encoding header to gzip or deflate.
> Further, it adds a new config option "treshold_for_chunking_comp_responses" (httpd section) that specifies an attachment length threshold. If an attachment's length is greater than or equal to this threshold, the HTTP response will be chunked (as well as compressed).
> Note that using non-chunked compressed body responses requires storing all the compressed blocks in memory and then sending each one to the client. This is a necessary "evil", as we only know the length of the compressed body after compressing the whole body, and we need to set the "Content-Length" header for non-chunked responses. By sending chunked responses, we can send each compressed block immediately, without accumulating all of them in memory.
> Examples:
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=gzip
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=deflate
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt   # attachment will not be compressed
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=rar   # will give a 500 error code
> Etap test case included.
> Feedback would be very welcome.
> cheers

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.