Date: Fri, 16 Aug 2013 11:12:45 +0200
Subject: Re: Erlang vs JavaScript
From: Benoit Chesneau
To: Volker Mische
Cc: dev@couchdb.apache.org, user@couchdb.apache.org

On Fri, Aug 16, 2013 at 11:05 AM, Volker Mische wrote:

> On 08/15/2013 11:53 AM, Benoit Chesneau wrote:
> > On Thu, Aug 15, 2013 at 11:38 AM, Jan Lehnardt wrote:
> >
> >>
> >> On Aug 15, 2013, at 10:09 , Robert Newson wrote:
> >>
> >>> A big +1 to Jason's clarification of "erlang" vs "native". CouchDB
> >>> could have shipped an erlang view server that worked in a separate
> >>> process and had the stdio overhead, to combine the slowness of the
> >>> protocol with the obtuseness of erlang. ;)
> >>>
> >>> Evaluating Javascript within the erlang VM process intrigues me, Jens,
> >>> how is that done in your case? I've not previously found the assertion
> >>> that V8 would be faster than SpiderMonkey for a view server compelling
> >>> since the bottleneck is almost never in the code evaluation, but I do
> >>> support CouchDB switching to it for the synergy effects of a closer
> >>> binding with node.js, but if it's running in the same process, that
> >>> would change (though I don't immediately see why the same couldn't be
> >>> done for SpiderMonkey). Off the top of my head, I don't know a safe
> >>> way to evaluate JS in the VM.
> >>> A NIF-based approach would either be
> >>> quite elaborate or would trip all the scheduling problems that
> >>> long-running NIF's are now notorious for.
> >>>
> >>> At a step removed, the view server protocol itself seems like the
> >>> thing to improve on, it feels like that's the principal bottleneck.
> >>
> >> The code is here:
> >> https://github.com/couchbase/couchdb/tree/master/src/mapreduce
> >>
> >> I’d love for someone to pick this up and give CouchDB, say, a ./configure
> >> --enable-native-v8 option or a plugin that allows people to opt into the
> >> speed improvements made there. :)
> >>
> >> The choice for V8 was made because of easier integration API and more
> >> reliable releases as a standalone project, which I think was a smart move.
> >>
> >> IIRC it relies on a change to CouchDB-y internals that has not made it
> >> back from Couchbase to CouchDB (Filipe will know, but I doubt he’s reading
> >> this thread), but we should look into that and get us “native JS views”, at
> >> least as an option or plugin.
> >>
> >> CCing dev@.
> >>
> >> Jan
> >> --
> >>
> >>
> > Well on the first hand nifs look like a good idea but can be very
> > problematic:
> >
> > - when the view computation take time it would block the full vm
> > scheduling. It can be mitigated using a pool of threads to execute the work
> > asynchronously but then can create other problems like memory leaking etc.
> > - nifs can't be upgraded easily during hot upgrade
> > - when a nif crash, all the vm crash.
> >
> > (Note that we have the same problem when using a nif to decode/encode json,
> > it only works well with medium sized documents)
> >
> > One other way to improve the js handling would be removing the main
> > bottleneck ie the serialization-deserialization we do on each step. Not
> > sure if it exists but feasible, why not passing erlang terms from erlang
> > to js and js to erlang?
> > So at the end the deserialization would happen only
> > on the JS side ie instead of having
> >
> > get erlang term
> > encode to json
> > send to js
> > decode json
> > process
> > encode json
> > send json
> > decode json to erlang term
> > store
> >
> > we should just have
> >
> > get erlang term
> > send over STDIO
> > decode erlang term to JS object
> > process
> > encode to erlang term
> > send erlang term
> > store
> >
> > Erlang serialization is also very optimised.
>
> I think the ultimate goal should be to be as little
> conversion/serialisation as possible, hence no conversion to Erlang
> Terms at all.
>
> Input as string
> Parsing to get ID
> Store as string
>
> Send to JS as string
> Process with JS
> Store as string
>
> Cheers,
> Volker
>
>

I agree (modulo the fact that I would replace a string with a binary ;), but
that would only be possible if we extracted the metadata (_id, _rev) from the
JSON, so that CouchDB wouldn't have to decode the JSON to get them. Streaming
JSON would also allow that, but since there is no guarantee of property order
in a JSON object, it would be less efficient.

- benoit
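
To make the metadata idea concrete, here is a minimal sketch, assuming a made-up framing in which _id and _rev travel in a small length-prefixed header in front of the untouched JSON bytes. The names pack_doc and read_meta and the header layout are hypothetical, for illustration only; this is not CouchDB's storage format or view server protocol.

```python
import struct
from typing import Tuple


def pack_doc(doc_id: str, rev: str, body: bytes) -> bytes:
    """Frame a document as: id length, rev length, id, rev, raw JSON body.

    Hypothetical layout: two big-endian unsigned 16-bit lengths, then the
    UTF-8 id and rev, then the JSON body exactly as it was received.
    """
    id_bytes = doc_id.encode("utf-8")
    rev_bytes = rev.encode("utf-8")
    header = struct.pack(">HH", len(id_bytes), len(rev_bytes))
    return header + id_bytes + rev_bytes + body


def read_meta(frame: bytes) -> Tuple[str, str, bytes]:
    """Return (id, rev, body) by reading the header only.

    The body is sliced out as opaque bytes; it is never parsed as JSON.
    """
    id_len, rev_len = struct.unpack_from(">HH", frame, 0)
    id_start = 4  # the two 16-bit length fields occupy the first 4 bytes
    rev_start = id_start + id_len
    body_start = rev_start + rev_len
    doc_id = frame[id_start:rev_start].decode("utf-8")
    rev = frame[rev_start:body_start].decode("utf-8")
    return doc_id, rev, frame[body_start:]


# The body stays a binary from input to storage to the JS side.
raw_body = b'{"title": "hello", "n": 1}'
frame = pack_doc("doc-1", "1-abc", raw_body)
doc_id, rev, body = read_meta(frame)
```

With the metadata in a fixed position, the lookup cost does not depend on where (or whether) _id and _rev appear inside the JSON object, which is the drawback of the streaming-parser route.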