couchdb-dev mailing list archives

From Robert Newson <robert.new...@gmail.com>
Subject Re: [jira] Commented: (COUCHDB-583) storing attachments in compressed form and serving them in compressed form if accepted by the client
Date Tue, 26 Jan 2010 14:31:24 GMT
Yes, that's one approach. However, I also need to do something to make
old data unrecoverable, so I was thinking of changing the keys at
compaction time, rendering the old data unreadable. My particular use
case is niche, I admit.
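
A minimal Python sketch of the re-keying idea above, under loudly stated assumptions: the store is a plain dict rather than a CouchDB database file, `toy_cipher` is a deliberately insecure XOR stand-in for a real cipher, and the names `toy_cipher` and `compact_with_new_key` are hypothetical and not part of CouchDB. It only illustrates the shape of the approach: compaction copies live records into a new store under a new key, the intent being that the old key can then be destroyed so that bytes left over from the old file can no longer be decrypted.

```python
from itertools import cycle


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in for a real cipher: XOR with a repeating key (NOT secure).

    Applying it twice with the same key returns the original bytes.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def compact_with_new_key(old_store: dict, old_key: bytes, new_key: bytes) -> dict:
    """Copy every live record into a fresh store, re-encrypting on the way."""
    new_store = {}
    for doc_id, ciphertext in old_store.items():
        plaintext = toy_cipher(old_key, ciphertext)         # decrypt with the old key
        new_store[doc_id] = toy_cipher(new_key, plaintext)  # encrypt with the new key
    return new_store


# Hypothetical data: one document encrypted under the old key.
old_key, new_key = b"old-secret", b"new-secret"
old_store = {"doc1": toy_cipher(old_key, b'{"name": "alice"}')}
new_store = compact_with_new_key(old_store, old_key, new_key)
```

After this step only `new_key` is needed to read `new_store`; how the old key and the old file are actually destroyed is outside the sketch.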

On Tue, Jan 26, 2010 at 2:28 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
> Robert,
>
> Is that not something you could do with an encrypted filesystem? I'm
> not too familiar with such things so I'm not certain if that carries
> its own drawbacks or what not.
>
> Paul
>
> On Tue, Jan 26, 2010 at 9:26 AM, Robert Newson <robert.newson@gmail.com> wrote:
>> The basic intention of my patch would be to ensure that unencrypted
>> documents and attachments are not on disk. You're right that there are
>> many other questions to answer for a general encryption feature.
>>
>> For my exact case, which might always remain a local patch, I wouldn't
>> mind the data passing unencrypted to and from the boxes, but it should be
>> encrypted while "at rest" on disk.
>>
>> B.
>>
>> On Tue, Jan 26, 2010 at 2:02 PM, Filipe David Manana <fdmanana@gmail.com> wrote:
>>> Robert,
>>>
>>> I think your plans are very interesting: they present interesting
>>> challenges, and I also find the feature itself useful.
>>> Questions such as:
>>> - what kind of encryption? (symmetric, asymmetric, or both)
>>> - where are the keys stored? Or, for the symmetric case, would we use a
>>> Diffie-Hellman key exchange, for example?
>>> - is the objective to have privacy at the DB storage level or also at the
>>> network level? (and force the decryption on the client side only)
>>> There are many more details, of course.
>>> I share your opinion: the code would affect many of the same parts as
>>> the compression (especially couch_stream). For doc compression, I
>>> imagine it would touch more places, and also present some difficulties in
>>> ensuring compatibility with the previous DB file formats.
>>> Let me know if I can help somehow.
>>> cheers
>>>
>>> On Tue, Jan 26, 2010 at 2:51 PM, Robert Newson <robert.newson@gmail.com>
>>> wrote:
>>>>
>>>> That was my intention, but the option to send the encrypted bytes (for
>>>> decryption at the client end) is intriguing and also echoes the choice
>>>> to send compressed vs uncompressed responses.
>>>>
>>>> I don't mean to hold up this work, and I doubt I'll have a patch any
>>>> time soon; it just seems that these two features have significant
>>>> overlap (you can send data in with a transformation applied or not,
>>>> and request it with or without that transformation).
>>>>
>>>> My brief look at the related code led me to believe that adding
>>>> encryption support would touch several places, and I would think that
>>>> most, perhaps all, of those places would also be touched by
>>>> compression support.
>>>>
>>>> Sorry to be vague, I only intended to add another perspective to the
>>>> discussion.
>>>>
>>>> On Tue, Jan 26, 2010 at 11:01 AM, Filipe David Manana
>>>> <fdmanana@gmail.com> wrote:
>>>> > Hi Robert,
>>>> >
>>>> > That's interesting.
>>>> > I think that abstraction is doable, but maybe not trivial.
>>>> >
>>>> > In your design, do you plan to always decrypt the docs/attachments
>>>> > before sending them to the client?
>>>> >
>>>> >
>>>> > On Tue, Jan 26, 2010 at 11:34 AM, Robert Newson
>>>> > <robert.newson@gmail.com> wrote:
>>>> >
>>>> >> fyi: I have a (much delayed) plan to work up an encryption patch for
>>>> >> documents and attachments. Since encryption and compression both apply
>>>> >> at the same level (and the order of the two is important), I wonder if
>>>> >> that would change the approach taken here? That is, would an
>>>> >> abstraction that allowed a chain of transformations when storing (and
>>>> >> the reverse chain when retrieving) be in order?
>>>> >>
>>>> >> B.
>>>> >>
>>>> >> On Tue, Jan 26, 2010 at 8:02 AM, Filipe Manana (JIRA) <jira@apache.org>
>>>> >> wrote:
>>>> >> >
>>>> >> >    [ https://issues.apache.org/jira/browse/COUCHDB-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804946#action_12804946 ]
>>>> >> >
>>>> >> > Filipe Manana commented on COUCHDB-583:
>>>> >> > ---------------------------------------
>>>> >> >
>>>> >> > @Chris
>>>> >> >
>>>> >> > Good point, I totally agree.
>>>> >> > It would be interesting to test with real couchapps and real data to
>>>> >> > see how worthwhile it really is.
>>>> >> >
>>>> >> > A 10 MB text file, for instance, was compressed to about 100 KB in one
>>>> >> > of my tests.
>>>> >> >
>>>> >> > Also, minified JavaScript files are still worth compressing. For
>>>> >> > example, the minified Ext JS library file (http://www.extjs.com,
>>>> >> > ext-all.js) is about 630 KB. Compressed with gzip, it comes to about
>>>> >> > 170 KB, a reasonably good size reduction.
>>>> >> >
>>>> >> > As Damien said in a previous comment, compression not only saves disk
>>>> >> > space but also reduces disk IO (attachment download requests,
>>>> >> > compaction).
>>>> >> >
>>>> >> > I also look forward to seeing the impact on real, production-level
>>>> >> > applications.
>>>> >> >
>>>> >> >> storing attachments in compressed form and serving them in
>>>> >> >> compressed form if accepted by the client
>>>> >> >> ----------------------------------------------------------------------------------------------------
>>>> >> >>
>>>> >> >>                 Key: COUCHDB-583
>>>> >> >>                 URL: https://issues.apache.org/jira/browse/COUCHDB-583
>>>> >> >>             Project: CouchDB
>>>> >> >>          Issue Type: New Feature
>>>> >> >>          Components: Database Core, HTTP Interface
>>>> >> >>         Environment: CouchDB trunk
>>>> >> >>            Reporter: Filipe Manana
>>>> >> >>         Attachments: couchdb-583-trunk-10th-try.patch,
>>>> >> >>                      couchdb-583-trunk-11th-try.patch,
>>>> >> >>                      couchdb-583-trunk-12th-try.patch,
>>>> >> >>                      couchdb-583-trunk-13th-try.patch,
>>>> >> >>                      couchdb-583-trunk-14th-try-git.patch,
>>>> >> >>                      couchdb-583-trunk-15th-try-git.patch,
>>>> >> >>                      couchdb-583-trunk-3rd-try.patch,
>>>> >> >>                      couchdb-583-trunk-4th-try-trunk.patch,
>>>> >> >>                      couchdb-583-trunk-5th-try.patch,
>>>> >> >>                      couchdb-583-trunk-6th-try.patch,
>>>> >> >>                      couchdb-583-trunk-7th-try.patch,
>>>> >> >>                      couchdb-583-trunk-8th-try.patch,
>>>> >> >>                      couchdb-583-trunk-9th-try.patch,
>>>> >> >>                      jira-couchdb-583-1st-try-trunk.patch,
>>>> >> >>                      jira-couchdb-583-2nd-try-trunk.patch
>>>> >> >>
>>>> >> >>
>>>> >> >> This feature allows Couch to gzip-compress attachments as they are
>>>> >> >> being received and store them in compressed form.
>>>> >> >> When a client asks to download an attachment (e.g. GET
>>>> >> >> somedb/somedoc/attachment.txt), the attachment is sent in compressed
>>>> >> >> form if the client's HTTP request lists gzip as an acceptable
>>>> >> >> encoding for the response (using the HTTP header "Accept-Encoding").
>>>> >> >> Otherwise Couch decompresses the attachment before sending it back to
>>>> >> >> the client.
>>>> >> >> Attachments are compressed only if their MIME type matches one of
>>>> >> >> those listed in a separate config file. Compression level is also
>>>> >> >> configurable in the default.ini file.
>>>> >> >> This follows Damien's suggestion from 30 November:
>>>> >> >> "Perhaps we need a separate user editable ini file to specify
>>>> >> >> compressible or non-compressible files (would probably be too big for
>>>> >> >> the regular ini file). What do other web servers do?
>>>> >> >> Also, a potential optimization is to compress the file while writing
>>>> >> >> to disk, and serve the compressed bytes directly to clients that can
>>>> >> >> handle it, and decompressed for those that can't. For compressible
>>>> >> >> types, it's a win for both disk IO for reads and writes, and CPU on
>>>> >> >> read."
>>>> >> >> Patch attached.
>>>> >> >
>>>> >> > --
>>>> >> > This message is automatically generated by JIRA.
>>>> >> > -
>>>> >> > You can reply to this email to add a comment to the issue online.
>>>> >> >
>>>> >> >
>>>> >>
>>>> >
>>>> >
>>>> >
>>>> > --
>>>> > Filipe David Manana,
>>>> > fdmanana@gmail.com
>>>> > PGP key - http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xC569452B
>>>> >
>>>> > "Reasonable men adapt themselves to the world.
>>>> > Unreasonable men adapt the world to themselves.
>>>> > That's why all progress depends on unreasonable men."
>>>> >
>>>
>>>
>>
>
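
The "chain of transformations" abstraction raised earlier in this thread can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CouchDB code: the names `CHAIN`, `store`, `fetch` and `toy_encrypt` are hypothetical, and `toy_encrypt` is an insecure XOR placeholder standing in for a real cipher. Each stage is a pair of functions; storing applies the forward functions in order, and retrieving applies the inverse functions in reverse order. Compression is placed before encryption because the output of a proper cipher looks random and no longer compresses well.

```python
import zlib
from itertools import cycle


def toy_encrypt(data: bytes, key: bytes = b"k3y") -> bytes:
    """Placeholder for a real cipher: XOR with a repeating key (NOT secure)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# XOR is its own inverse, so decryption is the same function.
toy_decrypt = toy_encrypt

# Each stage is (apply_when_storing, undo_when_retrieving).
# Order matters: compress first, then encrypt.
CHAIN = [
    (zlib.compress, zlib.decompress),
    (toy_encrypt, toy_decrypt),
]


def store(data: bytes, chain=CHAIN) -> bytes:
    """Apply every forward transformation, in chain order."""
    for forward, _ in chain:
        data = forward(data)
    return data


def fetch(stored: bytes, chain=CHAIN) -> bytes:
    """Undo the transformations in the reverse of the order they were applied."""
    for _, backward in reversed(chain):
        stored = backward(stored)
    return stored


body = b"hello couch " * 100
on_disk = store(body)
```

With this shape, sending compressed or encrypted bytes straight to a client that can handle them amounts to undoing only part of the chain before responding.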
