From: Jens Alfke
To: "user@couchdb.apache.org"
Date: Thu, 17 May 2012 15:47:13 -0700
Subject: BigCouch returns compressed attachments without indicating they're compressed
Message-ID: <7B3C5C94-9EB6-4297-9925-2E046B8B3105@couchbase.com>

I'm having (or rather, TouchDB is having*) problems receiving documents with attachments from Cloudant; I assume this is a difference between BigCouch and CouchDB. I believe it's a bug in the server.

The issue is that the server is returning compressed attachment bodies without indicating that they're compressed. TouchDB barfs because the length of the received data doesn't match the "length" property in the _attachments entry, and there is no "encoded_length" property giving the encoded length, let alone an "encoding" property that indicates that the data's been compressed (and by what algorithm.)

For example, take this document which has a 5313-byte HTML attachment. A plain GET returns:

> {"_id":"readme","_rev":"2-4eb511f5ad0707c6e9fb1160b3f0bedd","_attachments":{"README.html":{"content_type":"text\/html","revpos":2,"digest":"md5-DRLenhWRAAAW9Q0RHyrG+w==","length":5313,"stub":true}}}

If I ask for the attachment inline I get:

> {"_id":"readme","_rev":"2-4eb511f5ad0707c6e9fb1160b3f0bedd","_attachments":{"README.html":{"content_type":"text\/html","revpos":2,"digest":"md5-DRLenhWRAAAW9Q0RHyrG+w==","data":"PGgxIGlkPSJ0b3…{{{lots of Base64 data}}}..."}}}

where the base64 data decodes to 2136 bytes, and is not HTML but GZIPped HTML.
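
A minimal Python sketch of the length check described above may make the failure easier to see. It assumes the attachment metadata uses the "length", "encoded_length" and "encoding" properties named in this message. The function name classify_attachment and the sample data are hypothetical, made up for illustration, and are not TouchDB's actual code.

```python
import gzip


def classify_attachment(meta: dict, body: bytes) -> str:
    """Compare a received attachment body with its _attachments metadata.

    Assumption: if "encoding" is present, "encoded_length" describes the
    bytes on the wire; otherwise "length" does.
    """
    encoding = meta.get("encoding")
    if encoding is not None:
        expected = meta.get("encoded_length")
        label = f"declared {encoding}"
    else:
        expected = meta.get("length")
        label = "declared raw"

    if expected is None:
        return "unknown: no length metadata"
    if len(body) == expected:
        return f"ok: {label}"
    return f"mismatch: {label}, expected {expected} bytes, got {len(body)}"


# Hypothetical sample: an HTML attachment that the server sends gzipped.
raw = b'<h1 id="top">README</h1>\n' * 200
wire = gzip.compress(raw)

# Metadata as in the report: only the raw length, no encoding information.
silent_meta = {"content_type": "text/html", "length": len(raw)}

# Metadata a client could actually check against the wire bytes.
explicit_meta = dict(silent_meta, encoding="gzip", encoded_length=len(wire))

print(classify_attachment(silent_meta, wire))
print(classify_attachment(explicit_meta, wire))
```

With only "length" present, the client has to treat the compressed body as corrupt; with the two extra properties the same bytes check out.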

Asking for the document with attachments in MIME multipart format results in:

> --fbd433e586402848d98875903ea97f67
> content-type: application/json
>
> {"_id":"readme","_rev":"2-4eb511f5ad0707c6e9fb1160b3f0bedd","_attachments":{"README.html":{"content_type":"text\/html","revpos":2,"digest":"md5-DRLenhWRAAAW9Q0RHyrG+w==","length":5313,"follows":true}}}
> --fbd433e586402848d98875903ea97f67
> {{{2136 bytes of GZIP data}}}
> --fbd433e586402848d98875903ea97f67--

Same thing: the data is GZIPped but there is no metadata to indicate the fact.

I believe this is a bug in BigCouch. It results in an ambiguity as to whether the content is encoded or not (and if so, what encoding is being used.) In the worst case you could have an attachment whose GZIPped encoding is exactly the same length as the raw data, in which case there would be no way to tell whether it was encoded or not since the lengths would match either way.

--Jens

* https://github.com/couchbaselabs/TouchDB-iOS/issues/80
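
On the ambiguity described above: until the server reports the encoding, a client could fall back on a heuristic such as the Python sketch below. It sniffs the two-byte gzip magic number and accepts the decompressed data only if its size equals the declared "length". This is a workaround under stated assumptions, not a fix. maybe_gunzip is a hypothetical helper, and the heuristic can still guess wrong in unusual cases, which is why the metadata is needed.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def maybe_gunzip(body: bytes, declared_length: int) -> bytes:
    """Return the decompressed body if it looks like undeclared gzip data.

    The body is treated as gzip-encoded only when it starts with the gzip
    magic number, decompresses cleanly, and the result is exactly
    declared_length bytes long. Otherwise it is returned unchanged.
    """
    if body[:2] != GZIP_MAGIC:
        return body
    try:
        decoded = gzip.decompress(body)
    except (OSError, EOFError, zlib.error):
        # Starts with the magic bytes but is not a valid gzip stream.
        return body
    return decoded if len(decoded) == declared_length else body


# Hypothetical sample data.
raw = b"<p>hello</p>\n" * 500
wire = gzip.compress(raw)

# Server silently compressed the attachment: recover the original bytes.
recovered = maybe_gunzip(wire, len(raw))

# Attachment that really is a .gz file, stored as-is: leave it alone,
# because the decompressed size does not match the declared length.
untouched = maybe_gunzip(wire, len(wire))
```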