couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: couch_query_server refactoring
Date Thu, 12 Jul 2012 07:23:41 GMT
I actually see this going one of two ways, and which one we go with
depends on an odd question that I've never really asked (or seen
discussed) before.

Both patterns would have a single couchjs process that is threaded and
can respond to multiple requests simultaneously. Doing this gives us
the ability to maximize throughput using pipelining and the like. The
downside is that if the couchjs process dies, it affects every other
message that was in transit through it, although I think we can
mitigate a lot of that with some basic retry/exponential backoff
logic.
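
To make the backoff idea concrete, here's roughly what I have in mind
on the Erlang side (a minimal sketch; the module name and the SendFun
callback are placeholders, not existing code):

    -module(couchjs_retry).
    -export([call/2]).

    %% Retry a request to couchjs with exponential backoff. SendFun
    %% stands in for whatever actually ships the request across.
    call(SendFun, Request) ->
        call(SendFun, Request, 100, 5).  %% start at 100ms, 5 attempts max

    call(_SendFun, _Request, _Delay, 0) ->
        {error, couchjs_unavailable};
    call(SendFun, Request, Delay, Attempts) ->
        case SendFun(Request) of
            {ok, Response} ->
                {ok, Response};
            {error, _Reason} ->
                timer:sleep(Delay),
                call(SendFun, Request, Delay * 2, Attempts - 1)
        end.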

One way we can do this is to use an approach similar to what we do now,
but asynchronously. I.e., messages from Erlang to couchjs become tagged
tuples that a central process dispatches back and forth to clients.
There are a few issues here. First, we're basically going to need two
lookup caches, which could hurt low-latency requirements: Erlang will
have to keep a cache of tag/client pairs, and the couchjs side will
need some sort of lookup table/cache mapping ddoc ids to JS_Contexts.
The weird question I have about this is, "What is the latency/bandwidth
of stdio?" I've never thought to benchmark such a thing, but it seems
possible (no idea how likely, though) that we could max out the
capacity of the single file descriptor used for communication.
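
Roughly, the dispatcher could be a gen_server that parks callers in a
map keyed on the tag and routes replies back when they show up (all of
this is a sketch; the names and the wire format are made up):

    -module(couchjs_dispatch).
    -behaviour(gen_server).
    -export([start_link/1, prompt/2]).
    -export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

    start_link(Port) ->
        gen_server:start_link(?MODULE, Port, []).

    prompt(Pid, Request) ->
        gen_server:call(Pid, {prompt, Request}, infinity).

    init(Port) ->
        {ok, #{port => Port, pending => #{}}}.

    handle_call({prompt, Request}, From, #{port := Port, pending := Pending} = St) ->
        %% Tag the request, remember who asked, and don't block the server.
        Tag = erlang:unique_integer([positive]),
        port_command(Port, term_to_binary({Tag, Request})),
        {noreply, St#{pending := maps:put(Tag, From, Pending)}}.

    handle_cast(_Msg, St) ->
        {noreply, St}.

    handle_info({Port, {data, Data}}, #{port := Port, pending := Pending} = St) ->
        %% couchjs echoes the tag back; use it to find the waiting caller.
        {Tag, Response} = binary_to_term(Data),
        case maps:take(Tag, Pending) of
            {From, Rest} ->
                gen_server:reply(From, Response),
                {noreply, St#{pending := Rest}};
            error ->
                {noreply, St}
        end.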

The second approach is to turn couchjs into a simple threaded network
server. The Erlang side would just open a socket per design doc/context
pair, and the couchjs side would be rather simple to implement
(compared to the alternative). I like this approach because it
minimizes caches/lookups, uses more file descriptors (not sure if
that's a valid concern), and (most importantly) keeps most of the
complexity in Erlang, where it's easier to manage.
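
The Erlang half of that could be as small as something like this (the
port number and the "reset" handshake are invented here, just to show
the socket-per-ddoc shape):

    -module(couchjs_socket).
    -export([connect/1, prompt/2]).

    connect(DDocId) ->
        {ok, Sock} = gen_tcp:connect("127.0.0.1", 9999,
                                     [binary, {packet, line}, {active, false}]),
        %% Tell couchjs which ddoc this connection is for, so it can keep
        %% a dedicated JS context on its end.
        ok = gen_tcp:send(Sock, [<<"[\"reset\",\"">>, DDocId, <<"\"]\n">>]),
        {ok, Sock}.

    prompt(Sock, RequestJson) ->
        ok = gen_tcp:send(Sock, [RequestJson, $\n]),
        gen_tcp:recv(Sock, 0).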

Either way, once one of those is implemented we can re-implement our
external process management so it doesn't totally bork servers under
load, as well as (hopefully) improve latency considerably for any of
our "RPC" things.

On Thu, Jul 12, 2012 at 1:57 AM, Benoit Chesneau <bchesneau@gmail.com> wrote:
> On Mon, Jul 9, 2012 at 3:59 PM, Paul Davis <paul.joseph.davis@gmail.com> wrote:
>> Benoît,
>>
>> Lately I've been contemplating removing a lot of the Erlang mechanics
>> for this by rewriting couchjs as a single-process, multithreaded
>> application. I've seen a lot of issues related to our process handling,
>> and I also think we can probably speed things up considerably if we
>> change how this works. I.e., if we move to an asynchronous message
>> passing interface instead of the serialized stdio interface, we should
>> be able to get some nice speedups in throughput while also removing a
>> lot of the resource usage.
>>
>  Do you have a general idea of the flow? Or an example? I can be
> funded to do that work :)
>
> - benoît
