couchdb-dev mailing list archives

From "Paul Davis" <>
Subject Re: near view state server success
Date Sat, 08 Nov 2008 06:52:21 GMT
On Sat, Nov 8, 2008 at 1:36 AM, Chris Anderson <> wrote:
> On Mon, Oct 27, 2008 at 1:31 PM, Damien Katz <> wrote:
>> I'm bringing this conversation into couchdb-dev in case others are
>> interested.
>> -Damien
>> Begin forwarded message:
>>> From: Damien Katz <>
>>> Date: October 27, 2008 3:43:15 PM CDT
>>> To: Chris Anderson <>
>>> Subject: Re: near view state server success
>>> Overall this looks good, but it looks like you've gone down a little bit of a
>>> wrong path with the purge_seq nums. The "clients" should only be sending update
>>> seq nums in the get_updated messages, and the view_group servers should
>>> never worry about the purge_seq_nums stuff, only the updater process worries
>>> about it, and for now, don't worry about it at all.
> Cool. I pulled out the purge stuff, and I'll just ignore the failing
> purge test for now. One of these days I'll sit down and come to
> understand the full purge implementation, but for now it's probably
> better to skip it.
>>> I think the best option
>>> is to switch to gen_server and have it send a "resend" response to the
>>> client, and the client resends the message.
> I'm just not sure how to implement this. When I start to think it
> through, I always come to the conclusion that the way I'm doing it now
> is simpler. I understand that gen_server has its benefits, but
> everything I think of ends up having an explicit receive and ! calls
> somewhere, even if they end up living in the client around the place
> where it expects the "resend" response.
> Keeping the interlocking receive calls in a function like
> couch_view_group:server_loop just seems like the simplest option.
> Maybe you see a clear way to do this with gen_server. I just can't see
> how to go down that route without just reimplementing something like
> couch_view_group:server_loop by another name in another module.
>>> The separate SpawnFun and SpawnArgs isn't necessary, because you can use a
>>> closure to get the spawn and create the custom function:
>>> SpawnArgs = Foo(),
>>> SpawnFun  = fun () -> do_spawn(SpawnArgs) end,
> Duh, thanks. The code is a bit simpler now, and just as fast. Erlang
> closures FTW.
>>> Finally I think the initial update seq num in the couch_view_group server
>>> should be -1, instead of 0.
> Done.
> All these changes are available in the update-false branch of my git repo.
> Paul Davis and I had a chat on IRC about the fact that the name of
> this feature is not a very good description for what it actually
> provides. Basically the only win this internal cache reliably provides
> is the possibility of lower latency (with the tradeoff of out-of-date
> views). It doesn't give the ability to "peek" at intermediate and
> potentially inconsistent view states while the view is building, nor
> does it give you the ability to pull the most recent consistent view
> state from disk if the view hasn't yet been accessed since server
> boot.
> The best succinct name I could come up with for this feature is
> stale=ok, because in the case of an ungenerated or uncached view, it
> could potentially behave just like a normal view request (that is,
> wait to respond until the view is updated.)
> I'm mostly excited about this work because it lays the foundation for
> a way to get progress reports on currently building views. We could
> just add another receive clause for group_status that adds the latest
> status to the state, and then an http api for asking the view_group
> for its status.
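
A rough sketch of what that extra receive clause might look like. The
module name, the status/1 helper and the shape of the status term are all
made up for illustration; this is not the code in couch_view_group.

```erlang
-module(group_status_sketch).
-export([start/0, status/1, loop/1]).

%% Start a loop whose only state is the latest reported status.
start() ->
    spawn(?MODULE, loop, [idle]).

%% Ask the loop for its current status (hypothetical client helper).
status(Pid) ->
    Ref = make_ref(),
    Pid ! {status, self(), Ref},
    receive
        {Ref, Status} -> Status
    after 1000 ->
        timeout
    end.

loop(Status) ->
    receive
        %% The updater reports progress; keep only the latest value.
        {group_status, NewStatus} ->
            loop(NewStatus);
        %% Anyone (e.g. an HTTP handler) can ask for the stored status.
        {status, From, Ref} ->
            From ! {Ref, Status},
            loop(Status);
        stop ->
            ok
    end.
```

An updater would send something like Pid ! {group_status, {indexing, 40, 100}}
as it goes, and the HTTP side would only ever call status(Pid).
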
> It's really too bad that gen_server doesn't provide an easy way to
> stick a message back into the mailbox after initiating an action based
> on it. That would make all this so simple.
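
gen_server can't re-queue a message, but handle_call/3 is allowed to return
{noreply, State} and answer the caller later with gen_server:reply/2, which
may get close to the same effect without a "resend". A minimal sketch,
assuming a made-up module and a made-up update_done notification from the
updater; none of these names come from the update-false branch.

```erlang
-module(view_group_sketch).
-behaviour(gen_server).
-export([start_link/0, get_updated/2, update_done/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% seq starts at -1, as suggested earlier in the thread.
-record(state, {seq = -1, waiters = []}).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Blocks until the group has indexed at least ReqSeq.
get_updated(Pid, ReqSeq) ->
    gen_server:call(Pid, {get_updated, ReqSeq}, infinity).

%% Hypothetical: called by the updater once it has indexed up to NewSeq.
update_done(Pid, NewSeq) ->
    gen_server:cast(Pid, {update_done, NewSeq}).

init([]) ->
    {ok, #state{}}.

%% Already up to date: reply immediately.
handle_call({get_updated, ReqSeq}, _From, #state{seq = Seq} = S)
  when ReqSeq =< Seq ->
    {reply, {ok, Seq}, S};
%% Not there yet: remember the caller and reply later.
handle_call({get_updated, ReqSeq}, From, #state{waiters = Ws} = S) ->
    {noreply, S#state{waiters = [{ReqSeq, From} | Ws]}}.

handle_cast({update_done, NewSeq}, #state{waiters = Ws} = S) ->
    {Ready, Waiting} =
        lists:partition(fun({Req, _From}) -> Req =< NewSeq end, Ws),
    %% Answer every caller whose requested seq is now covered.
    lists:foreach(fun({_Req, From}) ->
                          gen_server:reply(From, {ok, NewSeq})
                  end, Ready),
    {noreply, S#state{seq = NewSeq, waiters = Waiting}}.
```

The caller just sits in gen_server:call/3 until the reply arrives, so the
explicit receive and ! stay inside OTP rather than in our code.
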
> --
> Chris Anderson

Also, for those interested parties paying attention, I'd like to see if
we can get a show of hands on the underlying issue that Chris and I
spent quite some time discussing.

As I see it, there are three main theoretical points:

1. (My interpretation) Give me view results with a guaranteed
millisecond response time even if it means throwing an error or just
returning no results.
2. (What Chris argues for) Try to give me a quick response, but wait
for consistent data if need be. Chris makes the good argument that
this method relies on the dev/admin team to ensure the view generation
is never too far out of date.
3. A subtle difference that only makes sense with knowledge of the
internals, but boils down to "give me a true dirty read of a view
generation in progress"
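
To make the difference concrete, here is a toy decision function for the
three readings. The policy atoms, the return values and the idea of passing
the cached and partial rows in as plain arguments are all assumptions for
the sake of the example, not how the view server is structured.

```erlang
-module(stale_policy_sketch).
-export([decide/3]).

%% decide(Policy, Cached, Partial)
%%   Cached  - last consistent result held in memory, or none
%%   Partial - rows indexed so far by a running update, or none

%% 1. Never wait: answer from the cache or fail right away.
decide(never_wait, none, _Partial) -> {error, not_ready};
decide(never_wait, Cached, _Partial) -> {ok, Cached};

%% 2. Prefer the cache, but block for a consistent result if there is none.
decide(wait_if_needed, none, _Partial) -> wait_for_update;
decide(wait_if_needed, Cached, _Partial) -> {ok, Cached};

%% 3. Dirty read: hand back whatever the running update has so far.
decide(dirty_read, none, none) -> wait_for_update;
decide(dirty_read, Cached, none) -> {ok, Cached};
decide(dirty_read, _Cached, Partial) -> {dirty, Partial}.
```

With nothing cached, decide(never_wait, none, none) gives {error, not_ready}
while decide(wait_if_needed, none, none) gives wait_for_update, which is
exactly the case Chris and I keep disagreeing about.
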

I think all three are valid positions to take. Yet we all have our
preconceived notions of what the feature would be used for and how it
might be implemented. If everyone agrees on a specific interpretation,
I think it'd keep Chris and me from going 15 rounds when we don't have
a whiteboard to draw furiously on.
