couchdb-dev mailing list archives

From Nick North <nort...@gmail.com>
Subject Re: Why does CouchDb need attachment length?
Date Sat, 16 Nov 2013 19:39:34 GMT
I had been thinking it would be easier to give a single overall content length and not chunk,
than to have to find individual lengths for every attachment before constructing the request.
That's on the grounds that HTTP client libraries might do the former by letting you set a
"no chunking" option but wouldn't help with the latter. But some quick checking suggests that
you can't force chunking behaviour in some clients so I will have to go and think some more
before deciding what I want. Thanks for your help.

Nick

> On 16 Nov 2013, at 18:57, Robert Newson <rnewson@apache.org> wrote:
> 
> Ah, no. HTTP requires either content length or a chunked encoding. We could
> certainly enhance this. My point was that this endpoint was built for the
> replicator.
>> On 16 Nov 2013 18:54, "Nick North" <north.n@gmail.com> wrote:
>> 
>> Thanks for the quick reply. I see what you're saying, though it still
>> seems to me that CouchDb could accept incoming non-chunked requests where
>> individual attachments do not have their lengths specified. They could be
>> calculated on receipt and kept for use in replication. That would make use
>> of client libraries like the Apache Java HttpClient easier. But maybe my
>> lack of detailed knowledge of HTTP is showing.
>> 
>> Nick
>> 
>>> On 16 Nov 2013, at 18:24, Robert Newson <rnewson@apache.org> wrote:
>>> 
>>> Because we haven't written the code to handle multipart/related
>>> responses where each item is also a chunked response, and we haven't
>>> done that because the replicator could always form a non-chunked
>>> request since it already knows the sizes.
>>> 
>>> B.
>>> 
>>> 
>>>> On 16 November 2013 18:11, Nick North <north.n@gmail.com> wrote:
>>>> I'm working with CouchDb documents with multiple attachments, submitted
>>>> using MIME multipart/related requests. In this case the document JSON has
>>>> to have an "_attachments" property specifying each attachment's name,
>>>> content type and length as described here
>>>> <http://wiki.apache.org/couchdb/HTTP_Document_API#Multiple_Attachments>.
>>>> The document and attachments are MIME-encoded and submitted in a single
>>>> request.
>>>> 
>>>> Although this works, programming it is awkward as each attachment's length
>>>> must be known in advance in order to populate the _attachments property.
>>>> Attachments are often in the form of streams, and finding the length means
>>>> having to read through the whole stream. Then you have to spool through the
>>>> stream again when submitting the HTTP request. (In some languages I suspect
>>>> the only way to do this is to buffer the entire stream contents in memory.)
>>>> If the length did not have to be put into the initial JSON object, then the
>>>> stream could just be passed straight through to the HTTP request with no
>>>> need for reading twice or buffering in memory.
>>>> 
>>>> So my question is: why does CouchDb require the length to be supplied? It's
>>>> definitely necessary as I've tried giving the wrong length, or no length at
>>>> all, and that causes the request to fail. But a quick look at the Erlang
>>>> source suggests that the length is not used when parsing the request, and
>>>> presumably that parsing process could calculate each attachment's length
>>>> for use later on if it's needed.
>>>> 
>>>> If, in principle, the length could be dropped when submitting requests,
>>>> then I'd be interested in trying to modify the code to make that possible.
>>>> But, if there is a good reason why it has to be supplied, then I don't want
>>>> to waste time working out what's going on in the Erlang. So any advice on
>>>> why attachments were designed as they are would be very welcome. Many
>>>> thanks,
>>>> 
>>>> Nick
>> 
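
For readers following the thread, here is a minimal Python sketch of the workaround the original message describes: each attachment stream is read once into a spooled temporary file to learn its length, then read a second time while the multipart/related body is assembled. The helper names `spool` and `build_multipart_related` are hypothetical and not part of CouchDB or any client library. The `"follows": true` stub layout, the empty header block on attachment parts, and the assumption that attachment parts must appear in the same order as the `_attachments` keys follow the wiki page linked in the thread as best I recall it, so check them against your CouchDB version.

```python
import io
import json
import tempfile
import uuid


def spool(stream, chunk_size=64 * 1024):
    """Copy a stream of unknown size into a spooled temp file.

    Returns (file, length). Small streams stay in memory; larger ones
    spill to disk, so the whole attachment need not be held in RAM.
    """
    spooled = tempfile.SpooledTemporaryFile(max_size=1024 * 1024)
    length = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        spooled.write(chunk)
        length += len(chunk)
    spooled.seek(0)  # rewind for the second pass
    return spooled, length


def build_multipart_related(doc, attachments, boundary=None):
    """Build headers and body for a multipart/related document PUT.

    attachments: list of (name, content_type, binary_stream) tuples.
    For brevity the final body is assembled in memory; a real client
    would write each part to the connection instead.
    """
    boundary = boundary or uuid.uuid4().hex
    spooled_files = []
    stubs = {}
    for name, content_type, stream in attachments:
        body, length = spool(stream)  # first pass: measure the length
        spooled_files.append(body)
        stubs[name] = {
            "follows": True,
            "content_type": content_type,
            "length": length,
        }

    # dict insertion order is preserved, so parts follow _attachments order.
    full_doc = dict(doc, _attachments=stubs)
    delimiter = b"--" + boundary.encode("ascii")

    out = io.BytesIO()
    out.write(delimiter + b"\r\nContent-Type: application/json\r\n\r\n")
    out.write(json.dumps(full_doc).encode("utf-8"))
    for body in spooled_files:
        out.write(b"\r\n" + delimiter + b"\r\n\r\n")
        out.write(body.read())  # second pass: copy the attachment bytes
        body.close()
    out.write(b"\r\n" + delimiter + b"--")

    payload = out.getvalue()
    headers = {
        "Content-Type": 'multipart/related; boundary="%s"' % boundary,
        "Content-Length": str(len(payload)),
    }
    return headers, payload
```

The returned headers and payload could be handed to any HTTP client for a PUT to the document URL. The two passes over each attachment are exactly the cost the original message is asking to avoid.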
