Mailing-List: contact dev-help@couchdb.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@couchdb.apache.org
Received-SPF: pass (nike.apache.org: domain of bchesneau@gmail.com designates
 209.85.210.169 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <0377F01C-5F5D-48C5-8FDA-F9269B480324@apache.org>
References: 
 <CAN-3CBK3Pq3=kTy82T4LR3Fr27k_u-=QFSo2VRwKHnw4fC-eTg@mail.gmail.com>
	<CAKmKYaCUgd+oxyjP4BcOXz94_2OJOB6du_1hF-nS30EtMGaY0A@mail.gmail.com>
	<CAN-3CBLtSwGSQ8JxMGjFSfAtGOWtr1gjJdvdVRYcsDg3a2-ogw@mail.gmail.com>
	<5102B439.10500@lymegreen.co.uk>
	<CAN-3CBK+yWV0oQX7f1Hsj0XpORvrjpJ7dmEhn4bpHLryKDVmAw@mail.gmail.com>
	<CAFUSCh_knsU-yg-bBu0v8Labq11tYJfcTB1VLrGvFNJkE4YpLQ@mail.gmail.com>
	<CAN-3CBLS_iQJB=n7VVHJ288OOu+5JO-6oJwR=760KmwKE9huQQ@mail.gmail.com>
	<CAFUSCh9zoCvJU638s7n72+oAmJMSzmM9+oett-=Hiccx-hwjhw@mail.gmail.com>
	<CAN-3CBKMdDK7uv4-pKJn5ziPU=my23jXA6S0SQNAP+Jw2bHsow@mail.gmail.com>
	<CAJNb-9oVW9AfngCmd7bJjU34pqsouxYtScAhqCB5p0zUXh6obw@mail.gmail.com>
	<CAJ_m3YCDEtQ8HKSZ8q1uCBaP5Gg08pUxv+4-kX_+xQfjPgt3Hw@mail.gmail.com>
	<CAL+Y1nsoN-voZBexDVO3_xNAdy318uyazXsLyezB2U9vbvv-fQ@mail.gmail.com>
	<CAN-3CBLhsi2hG1XXOFgZNXYq+1Nm8Rb7Gsvq=tmTPaLYO_2r4g@mail.gmail.com>
	<CAAL6JQjUY9C9K7RfjVYJm-kGwNXq4vWEPJti2_bho_FPA9ukTg@mail.gmail.com>
	<CAN-3CB+NDKbWFuXRJ16X5Lv5yF59LpasgzrpJe1sqp-v0YvNBg@mail.gmail.com>
	<CAJNb-9oKme7ALBCu9ba=qg9n3nhvk7n4Q-kdYgKC3MzzNJ5XGw@mail.gmail.com>
	<91019D9B-A6AE-4311-9C11-E6001D57293A@apache.org>
	<CAN-3CBLnHMMjJjRVvd249ozaUWjYpTc2QjbkELs+cQs5SfAXug@mail.gmail.com>
	<CAJNb-9porEGSqUci+eHX434RzZhKXQq7EfDCgXtdGFjZFPHSMA@mail.gmail.com>
	<CAN-3CBJBtVmsvO=bQQ_aXRvnMGvaNat1oJK2gXphL=DDpfCChg@mail.gmail.com>
	<CAJNb-9o1CAWVO4FsMmKEL6oQJHX-vsd==r_oMyDyXp5mLqFR=A@mail.gmail.com>
	<5DD875C7-7E61-4FB8-A79C-779363AE5659@apache.org>
	<CAJNb-9rezskmO=YGeWheTaaX0tOerecVFUu=S1rnrjdTUBPXig@mail.gmail.com>
	<0377F01C-5F5D-48C5-8FDA-F9269B480324@apache.org>
Date: Tue, 5 Feb 2013 08:26:10 +0100
Message-ID: 
 <CAJNb-9r-4rf1o3XdhVnhKFzsCH3R71pS+Xk6q9wrvQkY1TCzXA@mail.gmail.com>
Subject: Re: Branch to switch from SpiderMonkey to Node.js
From: Benoit Chesneau <bchesneau@gmail.com>
To: "dev@couchdb.apache.org" <dev@couchdb.apache.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

On Mon, Feb 4, 2013 at 6:58 PM, Jan Lehnardt <jan@apache.org> wrote:
>
> On Feb 4, 2013, at 18:47 , Benoit Chesneau <bchesneau@gmail.com> wrote:
>
>> On Mon, Feb 4, 2013 at 12:10 PM, Jan Lehnardt <jan@apache.org> wrote:
>>>
>>> On Feb 4, 2013, at 11:53 , Benoit Chesneau <bchesneau@gmail.com> wrote:
>>>
>>>> On Thu, Jan 31, 2013 at 5:54 PM, Jason Smith <jhs@iriscouch.com> wrote:
>>>>>
>>>>> On Thu, Jan 31, 2013 at 4:34 PM, Benoit Chesneau <bchesneau@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>> A JavaScript engine doesn't expose any IO by default. The **framework**
>>>>>> nodejs does; that is the whole point. I'm quite interested in existing
>>>>>> solutions to sandbox nodejs. Do you know of any projects that do it?
>>>>>>
>>>>>
>>>>> Correct. I am attempting to build something which satisfies your
>>>>> description: no i/o; i/o is not even possible.
>>>>>
>>>>> *How* is it implemented? Well, it doesn't matter whether we use Node.js or
>>>>> couchjs/SM or couchjs/v8. What matters is we feel confident about security.
>>>>> And of course, I agree, if we cannot achieve good security, then that is a
>>>>> show stopper.
>>>>>
>>>>> Here is my current plan for sandboxing CouchJS. (Thanks to Isaac for his
>>>>> tips.)
>>>>>
>>>>> When it is time to evaluate some code:
>>>>>
>>>>> 1. Set up an object with safe variable bindings: safe_context
>>>>> 2. fork()
>>>>> 3. Child process runs vm.runInNewContext(safe_context)
>>>>> 4. Child process communicates to the parent over stdio, through the
>>>>> approved safe_context functions
>>>>>
>>>>> The subprocess can also give extra sandboxing, such as chroot() if
>>>>> available.
>>>>>
>>>>> Yes, this causes two processes per instantiation; however I think the
>>>>> parent might only be short-lived, setting up the security, then exiting.
>>>>> The grandchild can talk to Erlang over stdio.
>>>>>
>>>>> That is my plan. No idea how well it will work.
>>>>>
>>>>> --
>>>>> Iris Couch
>>>>
>>>>
>>>>
>>>> Too much kool-aid imo :)
>>>>
>>>> It is not that it can't work. But are you seriously considering having
>>>> a main couchjs process maintain the STDIO channel and spawn a new OS
>>>> process for each view (which is what the `vm.runInNewContext` step does
>>>> in this plan)? The memory and latency costs can become significant, and
>>>> that is without counting the chroot cost, especially if this context is
>>>> run on each indexing batch or on each show, list, and view request,
>>>> plus the extra fds created by each child context.
>>>
>>> Alternatively, if the above works and is necessary (modulo Klaus's
>>> research), we live with the hit until we get to rewrite the view protocol
>>> at which point we can make it 1 Erlang process -> 1 node process for
>>> dispatching -> N Node processes for indexing.
>>
>> I don't think it is necessary to use so many *OS* processes for our
>> purpose, and I am really worried by such a solution. There is a reason
>> why people don't try to launch too many OS processes on a system. There
>> is a reason why we are using systems like Erlang.
>>
>> I guess runInContext would work, with a custom `require` function to
>> include modules (specifically to forbid IO). According to the doc the
>> context doesn't share anything, which is what we want. Also, if we are
>> going for node, I would prefer to start with a straightforward
>> solution and not introduce any new behaviours.
>
> I suggested 1 extra node process in total, if at all, as an alternative,
> if the thing Klaus and you outline doesn't work.
>
Why doesn't it work?

runInNewContext would imply launching one new context per view if you
really want to run it sandboxed.

"vm.runInNewContext compiles code, then runs it in sandbox and returns
the result."
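
To make the "one new context per view" point concrete, here is a minimal
sketch, assuming a plain Node.js install. The names `runMapInFreshContext`
and the `emit` binding are made up for this example and are not the actual
couchjs code. It only illustrates which globals the evaluated code can see;
it is not a hardened sandbox.

```javascript
const vm = require('vm');

// Hypothetical helper: evaluate one view's map function in a fresh context.
function runMapInFreshContext(mapSource, doc) {
  const emitted = [];
  // Only these bindings are visible as globals inside the evaluated code.
  const sandbox = {
    doc: doc,
    emit: function (key, value) { emitted.push([key, value]); }
  };
  // Compiles the code, runs it with `sandbox` as the global object,
  // and returns the result (ignored here; we collect emitted rows instead).
  vm.runInNewContext('(' + mapSource + ')(doc)', sandbox);
  return emitted;
}

// Neither `require` nor `process` exists as a global in the new context.
const rows = runMapInFreshContext(
  'function (doc) { emit(doc._id, typeof require + "/" + typeof process); }',
  { _id: 'a' }
);
console.log(rows); // [ [ 'a', 'undefined/undefined' ] ]
```

Every call builds a new context, which is exactly the per-view cost I am
talking about.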

I don't see any other way, since you can't recycle a context in this
case. Having another I/O channel for this context will be even uglier.
In that case you would have STDIO -> CHILD -> STDIO -> CHILD. Even
without counting the memory usage, it will add more latency than we
have right now. The more I think about it, the more reluctant I am to
support such a solution.
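
A small sketch of why recycling a context is a problem, assuming only the
stock `vm` module (the variable names are just for illustration): whatever
one script leaves in a reused context is visible to the next script, while
a fresh context starts empty.

```javascript
const vm = require('vm');

// A reused context keeps whatever the previous script left behind.
const shared = vm.createContext({});
vm.runInContext('var leaked = "from view A";', shared);
const seenByB = vm.runInContext('typeof leaked', shared);

// A fresh context per evaluation starts with an empty global object.
vm.runInNewContext('var leaked = "from view A";', {});
const seenFresh = vm.runInNewContext('typeof leaked', {});

console.log(seenByB, seenFresh); // string undefined
```

So isolation between views means paying for a new context each time, and
that is before adding any extra process or stdio hop.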

- benoît