From: Jan Lehnardt
Message-Id: <77EE56B9-D0ED-454A-BF99-7C335DD17D5F@apache.org>
Subject: Re: Why does CouchDb need attachment length?
Date: Mon, 18 Nov 2013 17:45:19 -0800
References: <0258BECD-D33B-4523-83AC-4E46C364148D@gmail.com>
To: "dev@couchdb.apache.org Developers"

On 16 Nov 2013, at 13:31, Nick North wrote:

> One more thought before I leave off for the moment. Although this
> endpoint was built for the replicator, it is very useful for other
> clients, as it is the only way to submit a document and its attachments
> in a single action. This is important if you're not allowed to update
> documents or if you want to guarantee that readers of documents in the
> database and its replicas never see a partial set of the document and
> its attachments. This use case suggests to me that the endpoint should
> be easy to use for everyone, if that can be done without harming
> replication. But the chunking business means I need to think some more
> before making a proposal on it.

The API should totally work as simply as possible for clients other than
the replicator. It just hasn't been built that way yet, and we are happy
to accept patches :) The mention that it was custom built for the
replicator is just to explain the current limitations.

That said, I think you either need a length OR chunking, but any
self-respecting HTTP client should make that trivial for you as the end
user :)

Best
Jan
--

>
> Nick
>
>> On 16 Nov 2013, at 18:57, Robert Newson wrote:
>>
>> Ah, no. HTTP requires either content length or a chunked encoding. We could
>> certainly enhance this. My point was that this endpoint was built for the
>> replicator.
>>> On 16 Nov 2013 18:54, "Nick North" wrote:
>>>
>>> Thanks for the quick reply. I see what you're saying, though it still
>>> seems to me that CouchDB could accept incoming non-chunked requests where
>>> individual attachments do not have their lengths specified. They could be
>>> calculated on receipt and kept for use in replication. That would make use
>>> of client libraries like the Apache Java HttpClient easier. But maybe my
>>> lack of detailed knowledge of HTTP is showing.
>>>
>>> Nick
>>>
>>>> On 16 Nov 2013, at 18:24, Robert Newson wrote:
>>>>
>>>> Because we haven't written the code to handle multipart/related
>>>> responses where each item is also a chunked response, and we haven't
>>>> done that because the replicator could always form a non-chunked
>>>> request since it already knows the sizes.
>>>>
>>>> B.
>>>>
>>>>
>>>>> On 16 November 2013 18:11, Nick North wrote:
>>>>> I'm working with CouchDB documents with multiple attachments, submitted
>>>>> using MIME multipart/related requests. In this case the document JSON has
>>>>> to have an "_attachments" property specifying each attachment's name,
>>>>> content type and length as described here
>>>>> <http://wiki.apache.org/couchdb/HTTP_Document_API#Multiple_Attachments>.
>>>>> The document and attachments are MIME-encoded and submitted in a single
>>>>> request.
>>>>>
>>>>> Although this works, programming it is awkward as each attachment's length
>>>>> must be known in advance in order to populate the _attachments property.
>>>>> Attachments are often in the form of streams, and finding the length means
>>>>> having to read through the whole stream. Then you have to spool through the
>>>>> stream again when submitting the HTTP request. (In some languages I suspect
>>>>> the only way to do this is to buffer the entire stream contents in memory.)
>>>>> If the length did not have to be put into the initial JSON object, then the
>>>>> stream could just be passed straight through to the HTTP request with no
>>>>> need for reading twice or buffering in memory.
>>>>>
>>>>> So my question is: why does CouchDB require the length to be supplied? It's
>>>>> definitely necessary as I've tried giving the wrong length, or no length at
>>>>> all, and that causes the request to fail. But a quick look at the Erlang
>>>>> source suggests that the length is not used when parsing the request, and
>>>>> presumably that parsing process could calculate each attachment's length
>>>>> for use later on if it's needed.
>>>>>
>>>>> If, in principle, the length could be dropped when submitting requests,
>>>>> then I'd be interested in trying to modify the code to make that possible.
>>>>> But, if there is a good reason why it has to be supplied, then I don't want
>>>>> to waste time working out what's going on in the Erlang. So any advice on
>>>>> why attachments were designed as they are would be very welcome. Many
>>>>> thanks,
>>>>>
>>>>> Nick
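
For readers following the thread, the kind of request under discussion can be sketched roughly as below. This is a minimal sketch, not CouchDB's or any client library's code: build_multipart_related is a hypothetical helper name, and the "_attachments" stub layout ("follows", "content_type", "length") is assumed from the wiki page linked in the thread. It illustrates the point Nick raises: each attachment's length has to be known before the JSON part is written, so here every attachment is held in memory as bytes.

```python
import json
import uuid


def build_multipart_related(doc, attachments, boundary=None):
    """Build headers and body for a multipart/related document upload.

    `attachments` maps attachment name -> (content_type, data_bytes).
    Hypothetical helper for illustration only.
    """
    boundary = boundary or uuid.uuid4().hex

    # Each attachment is declared up front with its length; this is the
    # requirement the thread is about (assumed stub layout, see lead-in).
    stubs = {
        name: {"follows": True, "content_type": ctype, "length": len(data)}
        for name, (ctype, data) in attachments.items()
    }
    doc_json = json.dumps(dict(doc, _attachments=stubs)).encode("utf-8")

    delimiter = b"--" + boundary.encode("ascii")
    # First part: the document JSON.
    parts = [delimiter, b"Content-Type: application/json", b"", doc_json]
    # Attachment parts follow in the same order as the "_attachments" keys.
    for name, (ctype, data) in attachments.items():
        parts += [delimiter, b"Content-Type: " + ctype.encode("ascii"), b"", data]
    parts.append(delimiter + b"--")
    body = b"\r\n".join(parts)

    # Non-chunked request: the total Content-Length is known in advance.
    headers = {
        "Content-Type": 'multipart/related; boundary="%s"' % boundary,
        "Content-Length": str(len(body)),
    }
    return headers, body


if __name__ == "__main__":
    headers, body = build_multipart_related(
        {"_id": "example"},
        {"note.txt": ("text/plain", b"hello")},
    )
    print(headers["Content-Type"])
    print(body.decode("utf-8"))
```

The returned headers and body could then be sent with an HTTP PUT to the document URL by whatever client is in use. Because the body is assembled in full before sending, this sketch takes the buffering route Nick describes rather than streaming.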