From: Paul Davis
Date: Tue, 4 Oct 2011 18:42:10 -0500
Subject: Re: Universal Binary JSON in CouchDB
To: dev@couchdb.apache.org

On Tue, Oct 4, 2011 at 3:18 PM, Robert Newson wrote:
> -1
>

Such a Debbie Downer.

> Supporting multiple formats on disk would be a very difficult code
> change that would complicate every part of the system, I don't think
> it's worth it.
>

It's not necessarily multiple formats, just one that we might be able to
serve (almost) directly to clients. Obviously this is hard and has
quite a few caveats if we decide to change away from Erlang's external
term format. But as it is, ubjson is basically the same thing as
Erlang's external term format, just not Erlang specific. If there's a
possibility of it making a difference, I see no reason not to
investigate it. But I maintain that such a change would be quite large
and impact a large portion of the code base. So if there is a change to
be proposed, someone will have to champion it, write it, test it, and
then convince everyone else that it's worth it.

> If we were to contemplate just multiple http payload formats, I would
> rather support one with broader acceptance (and with the caveat that
> it would have to have some compelling reason beyond being just another
> format). I'm aware of Tim's work on messagepack but I believe it's run
> aground for the technical reasons I alluded to above.
>

Not sure what your allusion was to. MessagePack is nice but lacks
some features that CouchDB's behavior would require. Only because Tim
suggested MessagePack did I know to suggest things like a noop type and
unbounded container lengths.

> Bottom line: I'd focus on optimizing the JSON encode/decode layer
> first before considering anything as dramatic as this.
> Paul Davis wrote a very fast JSON encoder/decoder called 'jiffy'. I
> would like to hear more about that.
>

I have. I think I have a very subtle bug because I saw a single segfault
once, so I haven't pushed too hard on getting it into trunk before other
people test it.

I think this goes back to Tim's talk though and my initial reaction to
MessagePack. I'm sure that it's probably faster and is definitely
smaller than the corresponding JSON. And I can probably show that by
writing hand-optimized encoder/decoder pairs for both. The issue is
that we can't support an encoder for every client language. So if
there's a reasonable spec that makes it easier for Ada or BrainFuck to
parse more efficiently and doesn't upset the internals too greatly,
then I see no reason not to investigate.

> B.
>
> On 4 October 2011 21:08, Benoit Chesneau wrote:
>> On Tue, Oct 4, 2011 at 9:33 PM, Paul Davis wrote:
>>> For a first step I'd prefer to see a patch that makes the HTTP
>>> responses choose a content type based on accept headers. Once we see
>>> what that looks like and how/if it changes performance then *maybe* we
>>> can start talking about on disk formats. Changing how we store things
>>> on disk is a fairly high impact change that we'll need to consider
>>> carefully.
>>
>> +1
>>>
>>> That said, the ubjson spec is starting to look reasonable and capable
>>> to be an alternative content-type produced by CouchDB. If someone were
>>> to write a patch I'd review it quite enthusiastically.
>>>
>>>
>>
>> I think I would prefer to use protobuffs format though. Anyway if we
>> change the api to handle all types that would be pluggable without
>> problem.
>>
>> - benoît
>>
>
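
For reference, the first step proposed in the quoted thread (choosing the response content type from the request's Accept header) might be sketched roughly as follows. This is only an illustration in Python, not CouchDB code (CouchDB itself is written in Erlang). The function names, the SUPPORTED list, and the media type string "application/ubjson" are all assumptions made for the example, and the matching is deliberately simplified: it orders entries by q-value only and ignores the specificity rules of full HTTP content negotiation.

```python
# Hypothetical list of formats the server can emit; the first entry is the default.
# "application/ubjson" is an assumed media type name, not one taken from the thread.
SUPPORTED = ["application/json", "application/ubjson"]


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_content_type(accept_header, supported=SUPPORTED):
    """Return the media type to respond with, or None if nothing matches."""
    if not accept_header:
        return supported[0]
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return supported[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "application/"
            for candidate in supported:
                if candidate.startswith(prefix):
                    return candidate
            continue
        if media_type in supported:
            return media_type
    return None  # a caller would typically answer 406 Not Acceptable
```

A handler built on this would call choose_content_type() with the request's Accept header, pick the matching encoder, and fall back to JSON when the client states no preference, which leaves existing clients unaffected.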