From: Benoit Chesneau
To: "dev@couchdb.apache.org"
Cc: Jason Smith
Date: Sun, 18 Aug 2013 08:22:53 +0200
Subject: Re: Erlang vs JavaScript

On Fri, Aug 16, 2013 at 9:58 PM, Alexander Shorin wrote:

> On Fri, Aug 16, 2013 at 11:23 PM, Jason Smith wrote:
> > On Fri, Aug 16, 2013 at 4:49 PM, Volker Mische wrote:
> >>
> >> On 08/16/2013 11:32 AM, Alexander Shorin wrote:
> >> > On Fri, Aug 16, 2013 at 1:12 PM, Benoit Chesneau wrote:
> >> >> I agree (modulo the fact that I would replace a string by a
> >> >> binary ;) but that would only be possible if we extract the
> >> >> metadata (_id, _rev) from the JSON so couchdb wouldn't have to
> >> >> decode the JSON to get them. Streaming JSON would also allow
> >> >> that, but since there is no guarantee on the order of properties
> >> >> in a JSON object it would be less efficient.
> >> >
> >> > What if we split document metadata from document itself?
> >
> >
> > I would like to hear a goal for this effort? What is the definition of
> > success and failure?
>
> Idea: move document metadata into a separate object.
>

How do you link the metadata to the separate object there? Do you let the
application set the internal links?

I'm +1 on such an idea anyway.

> Motivation:
>
> Case 1: Small docs. No profit at all. Moreover, it's probably better
> not to split things there, e.g. pass the full doc if its size is around
> some amount of megabytes.
> Case 2: Large docs.
> Profit in the case where you have put the right fields into
> metadata (like doc type, authorship, tags etc.) and filter first by
> this metadata: you have a minimal memory footprint, you have less CPU
> load, and the rule "fast accept - fast reject" works perfectly.
>
> Side effect: it's possible to first filter by metadata and keep only
> the document ids that need processing. And if we know what and how
> many to process, we may make assumptions about parallel indexing.
>
> Side effect: it's possible to autoindex metadata on the fly on document
> update without asking the user to write views (meta/by_type,
> meta/by_author, meta/by_update_time etc.). Sure, the more metadata you
> have, the larger the base index will be. In 80% of cases it will be no
> more than 4KB.
>
> Summary: probably, I've just described a chained views feature with
> autoindexing by certain fields (:
> Removing the autoindexing feature, we could make the view building
> process much faster if we build the right view chain, which will use
> set algebra operations to calculate the target doc ids to pass to the
> final view: reduce docs before map results:
>
> {
>     "views": {
>         "posts": {"map": "...", "reduce": "..."},
>         "chain": [
>             ["by_type", {"key": "post"}],
>             ["hidden", {"key": false}],
>             ["by_domain", {"keys": ["public", "wiki"]}]
>         ]
>     }
> }
>
> In the case of a 10000-doc db with 1200 posts, of which 200 are hidden
> and 400 are private, the resulting view posts has to process only 600
> docs instead of 10000, and it's an index lookup operation to find the
> docs to pass. Sure, calling such a view triggers all views in the
> chain. And I'm not thinking about cross dependencies and loops for now.
>
> --
> ,,,^..^,,,
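
A minimal sketch of how the quoted chain could be evaluated with set
algebra. This is not CouchDB code: the in-memory indexes, the helper names
(build_index, run_chain) and the sample data are all assumptions made up
for illustration. Each chain step is an index lookup that yields a set of
doc ids, and the steps are intersected before anything reaches the final
view.

```python
# Hypothetical in-memory stand-in for a view index: key -> set of doc ids.
def build_index(docs, key_fn):
    index = {}
    for doc_id, doc in docs.items():
        index.setdefault(key_fn(doc), set()).add(doc_id)
    return index


def run_chain(indexes, chain):
    """Intersect the doc id sets selected by each [view, query] step."""
    result = None
    for view_name, query in chain:
        # A step carries either a single "key" or a list of "keys".
        keys = query["keys"] if "keys" in query else [query["key"]]
        matched = set()
        for key in keys:
            matched |= indexes[view_name].get(key, set())
        result = matched if result is None else result & matched
    return result if result is not None else set()


# Made-up data matching the numbers in the quoted example:
# 10000 docs, 1200 posts, 200 of them hidden, 400 of them private.
docs = {}
for i in range(10000):
    is_post = i < 1200
    if 200 <= i < 600:
        domain = "private"
    else:
        domain = "public" if i % 2 == 0 else "wiki"
    docs[i] = {
        "type": "post" if is_post else "other",
        "hidden": is_post and i < 200,
        "domain": domain,
    }

indexes = {
    "by_type": build_index(docs, lambda d: d["type"]),
    "hidden": build_index(docs, lambda d: d["hidden"]),
    "by_domain": build_index(docs, lambda d: d["domain"]),
}

chain = [
    ["by_type", {"key": "post"}],
    ["hidden", {"key": False}],
    ["by_domain", {"keys": ["public", "wiki"]}],
]

# Only these doc ids would be handed to the final "posts" map function.
target_ids = run_chain(indexes, chain)
print(len(target_ids))
```

With this data the chain leaves 600 of the 10000 docs, which only holds
because the hidden and private posts were chosen not to overlap.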