couchdb-dev mailing list archives

From Filipe David Manana <fdman...@gmail.com>
Subject Re: [jira] Commented: (COUCHDB-583) storing attachments in compressed form and serving them in compressed form if accepted by the client
Date Tue, 22 Dec 2009 14:05:52 GMT
:)

Be aware that this patch is a lot bigger than, and quite different from, the
one you checked before :)

Honestly, I think storing compressed attachments is a plus. It saves disk
space and makes disk reads faster, which is especially noticeable for text
attachments. E.g. a 10 MB text file can compress to less than 100 KB, so it
needs much less disk space and is faster to read.
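
To illustrate the chunk-by-chunk scheme described in the quoted JIRA comment
further down, here is a rough sketch in Python. It is not the Erlang code from
the patch; the function names and the default compression level are made up
for the example. It only shows how a zlib stream can produce and consume gzip
data incrementally, without buffering the whole attachment.

```python
import zlib

# Adding 16 to the window size tells zlib to use the gzip container format.
GZIP_WBITS = 16 + zlib.MAX_WBITS


def gzip_chunks(chunks, level=6):
    """Yield gzip-compressed pieces as each input chunk arrives.

    `level` is a hypothetical default; the patch reads it from the config.
    """
    comp = zlib.compressobj(level, zlib.DEFLATED, GZIP_WBITS)
    for chunk in chunks:
        out = comp.compress(chunk)
        if out:  # zlib may hold small inputs internally and emit nothing yet
            yield out
    # flush() emits whatever is still buffered plus the gzip trailer.
    yield comp.flush()


def gunzip_chunks(blocks):
    """Yield decompressed pieces as each stored block is read back."""
    decomp = zlib.decompressobj(GZIP_WBITS)
    for block in blocks:
        out = decomp.decompress(block)
        if out:
            yield out
    tail = decomp.flush()
    if tail:
        yield tail
```

Each piece yielded by gzip_chunks could be appended to disk as soon as it is
produced, and gunzip_chunks can be fed blocks of any size on the way back.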

For web browser based apps (in Firefox at least), the XMLHttpRequest sent by
the browser always has the Accept-Encoding header set to "gzip, deflate". It
doesn't allow the programmer to override the header's value; at least I
couldn't do it with Firefox 3.5. Therefore, couch doesn't need to decompress
the attachment while sending its chunks to such a client. Decompression is
done by Firefox and is transparent to the JS code :) A really nice speedup
for couch.
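
The serving decision can be sketched roughly as follows, again in Python
rather than Erlang. The names accepts_gzip and serve_attachment are
hypothetical, and the header parsing is deliberately simplified: it looks only
for an explicit "gzip" entry and its optional q-value, and ignores wildcards.

```python
import zlib

# Adding 16 to the window size tells zlib to expect the gzip container format.
GZIP_WBITS = 16 + zlib.MAX_WBITS


def accepts_gzip(accept_encoding):
    """Return True if the header value lists gzip with a non-zero q-value."""
    for item in accept_encoding.split(","):
        parts = item.strip().split(";")
        coding = parts[0].strip().lower()
        if coding != "gzip":
            continue
        q = 1.0  # a coding without an explicit q-value is fully acceptable
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q > 0
    return False


def serve_attachment(stored_gzip, accept_encoding):
    """Return (headers, body) for an attachment stored in gzip form."""
    if accepts_gzip(accept_encoding):
        # Send the stored bytes untouched; the client decompresses them.
        return {"Content-Encoding": "gzip"}, stored_gzip
    # Otherwise the server has to decompress before sending.
    return {}, zlib.decompress(stored_gzip, GZIP_WBITS)
```

With the header value Firefox sends ("gzip, deflate"), the first branch is
taken and the server does no decompression work at all.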

cheers

On Tue, Dec 22, 2009 at 2:46 PM, Paul Joseph Davis <
paul.joseph.davis@gmail.com> wrote:

> Too lazy to log into jira on my phone.
>
> I just responded to the JIRA email I got this morning. I must've gotten
> confused.  I'll try and review your patch tonight or tomorrow and get back
> to you.
>
>
>
>
> On Dec 22, 2009, at 8:26 AM, "Filipe Manana (JIRA)" <jira@apache.org>
> wrote:
>
>
>>   [
>> https://issues.apache.org/jira/browse/COUCHDB-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793615#action_12793615
>>  ]
>>
>> Filipe Manana commented on COUCHDB-583:
>> ---------------------------------------
>>
>> Paul,
>>
>> your comments refer to the first 3 patches' implementation. The 4th and
>> latest follow Damien's idea (comment from the 30th November).
>>
>> Check the last patch:      couchdb-583-trunk-6th-try.patch
>>
>> The approach is completely different. There's no use of the query
>> parameter ?compression=(gzip|deflate) and no longer that block buffering
>> thing for compression / decompression :) With the latest ones the attachments
>> are compressed and stored in compressed form (if their mime type matches one
>> of those in the config file).
>>
>> As soon as a data chunk is received from the client, it is compressed with
>> a zlib stream and written to disk. Decompression follows the same idea: 1
>> block is read from the disk, decompressed, and sent to the client as a
>> chunk. No need to buffer things. I figured out how to use zlib for
>> incremental gzip compression/decompression.
>>
>> The "reshold_for_chunking_comp_responses" is completely gone also. HTTP
>> content-encoding is now negotiated.
>>
>> After analyzing the patch, let me know if the implementation is ok and how
>> to simplify it further.
>>
>> cheers
>>
>>
>>  storing attachments in compressed form and serving them in compressed
>>> form if accepted by the client
>>>
>>> ----------------------------------------------------------------------------------------------------
>>>
>>>               Key: COUCHDB-583
>>>               URL: https://issues.apache.org/jira/browse/COUCHDB-583
>>>           Project: CouchDB
>>>        Issue Type: New Feature
>>>        Components: Database Core, HTTP Interface
>>>       Environment: CouchDB trunk
>>>          Reporter: Filipe Manana
>>>       Attachments: couchdb-583-trunk-3rd-try.patch,
>>> couchdb-583-trunk-4th-try-trunk.patch, couchdb-583-trunk-5th-try.patch,
>>> couchdb-583-trunk-6th-try.patch, jira-couchdb-583-1st-try-trunk.patch,
>>> jira-couchdb-583-2nd-try-trunk.patch
>>>
>>>
>>> This feature allows Couch to gzip compress attachments as they are being
>>> received and store them in compressed form.
>>> When a client asks to download an attachment (e.g. GET
>>> somedb/somedoc/attachment.txt), the attachment is sent in compressed form if
>>> the client's http request has gzip specified as an accepted content encoding
>>> for the response (using the http header "Accept-Encoding"). Otherwise couch
>>> decompresses the attachment before sending it back to the client.
>>> Attachments are compressed only if their MIME type matches one of those
>>> listed in a separate config file. Compression level is also configurable in
>>> the default.ini file.
>>> This follows Damien's suggestion from 30 November:
>>> "Perhaps we need a separate user editable ini file to specify
>>> compressable or non-compressable files (would probably be too big for the
>>> regular ini file). What do other web servers do?
>>> Also, a potential optimization is to compress the file while writing to
>>> disk, and serve the compressed bytes directly to clients that can handle it,
>>> and decompressed for those that can't. For compressable types, it's a win
>>> for both disk IO for reads and writes, and CPU on read."
>>> Patch attached.
>>>
>>
>> --
>> This message is automatically generated by JIRA.
>> -
>> You can reply to this email to add a comment to the issue online.
>>
>>


-- 
Filipe David Manana,
fdmanana@gmail.com
PGP key - http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xC569452B

"Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men."
