couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Unicode normalization (was Re: The 1.0 Thread)
Date Tue, 30 Jun 2009 15:13:44 GMT
On Tue, Jun 30, 2009 at 11:04 AM, Damien Katz<damien@apache.org> wrote:
> Definitely don't buffer the whole thing. Footers are the way to go for
> efficiency reasons, but if there is a problem supporting them the
> alternative is to just read over the section of file and compute the MD5
> hash, then read it again and stream it to the client. It sounds more
> expensive to do it that way, but double reading eliminates double caching, as
> the file system cache will keep it in memory most of the time anyway. And the FS
> cache is smarter about not swapping out more important data if the section
> is large.
>

This works fine when the range isn't expensive to calculate.

In the end I'd agree that prohibiting an MD5 purely because it's a
range request seems silly when it should more likely be an
implementation decision.
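
As a rough illustration of the double-read approach described above, here is a
minimal sketch in Python (not CouchDB's Erlang, and not CouchDB's actual code).
The function names `range_md5` and `stream_range` and the `bufsize` parameter
are hypothetical. The first pass hashes the byte range in fixed-size reads, the
second pass re-reads the same range and streams it, so the range is never
buffered whole in the process.

```python
import base64
import hashlib
import io


def range_md5(f, start, length, bufsize=64 * 1024):
    """First pass: hash a byte range without holding it all in memory."""
    md5 = hashlib.md5()
    f.seek(start)
    remaining = length
    while remaining > 0:
        chunk = f.read(min(bufsize, remaining))
        if not chunk:
            break  # the requested range ran past the end of the file
        md5.update(chunk)
        remaining -= len(chunk)
    # Content-MD5 carries the base64 of the raw 16-byte digest (RFC 1864).
    return base64.b64encode(md5.digest()).decode("ascii")


def stream_range(f, start, length, bufsize=64 * 1024):
    """Second pass: re-read the same range and yield it piece by piece."""
    f.seek(start)
    remaining = length
    while remaining > 0:
        chunk = f.read(min(bufsize, remaining))
        if not chunk:
            break
        yield chunk
        remaining -= len(chunk)


# Usage with an in-memory stand-in for the attachment file (an assumption):
f = io.BytesIO(b"0123456789abcdef")
content_md5 = range_md5(f, 4, 8)        # goes out as a normal header
body = b"".join(stream_range(f, 4, 8))  # then the body is streamed
```

On a real file the second read would usually be served from the file system
cache, which is the point made in the quoted text.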

> But footers are definitely the way to go for efficiency. Which brings up a
> good question, are there known problems with footer support?
>

The only problem I can think of is that I can't think of anyone that
supports them. That could just be because I've also never seen
anything that actually sends them and thus not had to look for client
support.
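
For reference, a hedged sketch of what sending the MD5 as a footer (an
HTTP/1.1 trailer) could look like on the wire, again in Python for
illustration only. The generator name `chunked_with_md5_trailer` is
hypothetical. It assumes the response headers already include
`Transfer-Encoding: chunked` and `Trailer: Content-MD5`, and it says nothing
about whether a given client will actually read the trailer.

```python
import base64
import hashlib


def chunked_with_md5_trailer(chunks):
    """Yield a chunked-encoded body followed by a Content-MD5 trailer."""
    md5 = hashlib.md5()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would terminate the body early
        md5.update(chunk)
        # Each chunk is: size in hex, CRLF, data, CRLF.
        yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    digest = base64.b64encode(md5.digest())
    # Last chunk ("0"), then the trailer field, then the final blank line.
    yield b"0\r\nContent-MD5: " + digest + b"\r\n\r\n"


# Usage: the hash is computed while streaming, so nothing is read twice.
wire = b"".join(chunked_with_md5_trailer([b"hello ", b"world"]))
```

The trade-off against the double read is that the client only learns the hash
after the body, and only if it understands trailers.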

> -Damien
>
>
> On Jun 30, 2009, at 10:44 AM, Paul Davis wrote:
>
>> On Tue, Jun 30, 2009 at 7:12 AM, Damien Katz<damien@apache.org> wrote:
>>>
>>> On Jun 30, 2009, at 12:17 AM, Noah Slater wrote:
>>>
>>>> On Fri, Jun 26, 2009 at 07:08:32AM -0400, Damien Katz wrote:
>>>>>
>>>>> MD5 here is for integrity purposes, not security, so manufactured
>>>>> collisions aren't a problem we are worried about. And I don't think
>>>>> there is a standard SHA1 header, not that I could find anyway.
>>>>
>>>> I've been seeing some unrelated emails go past on the W3C HTTP WG mailing
>>>> list about Content-MD5 header which reminded me of this thread. It seems
>>>> that this value must be calculated from the MIME canonical response body,
>>>> which means a different value for content ranges. This presumably means
>>>> that CouchDB must refuse content range requests, send an MD5 value that
>>>> does not match the document revision, or break RFC 1864.
>>>
>>> I'm not sure I understand why we can't just calculate and send the MD5
>>> header for the content range.
>>>
>>
>> I reckon you'd have to buffer the response, no? Hard to know the MD5 of
>> an a priori unknown set of bytes until the end of the range, which
>> kinda conflicts with sending the MD5 as a header. Technically there
>> are HTTP Footers, but I've never actually seen them used.
>>
>>> -Damien
>>>
>>>>
>>>> Best,
>>>>
>>>> --
>>>> Noah Slater, http://tumbolia.org/nslater
>>>
>>>
>
>
