On 30/01/2009, at 6:30 PM, Chris Anderson wrote:
> Ahh, I didn't consider the validation function as being replicated as
> well. I suppose I'm imagining that validation functions will define
> the borders of applications, and thinking of these data flows as
> within a particular application.
>
> This is a straightforward consequence of the fact that design docs are
> documents just like any other, which of course has so many good
> effects that it's hard to find fault with the system because of edge
> cases like this.
No fault with design docs being normal docs, and in fact I can't see
how making it otherwise could solve this problem.
> Another edge case from validation functions (which can happen even on
> a single node) is that documents which have been added to a db, can be
> invalidated by the addition of a validation function after they have
> been saved. Having a view of all newly-invalid docs will definitely be
> useful.
>
> It seems like if you want to ensure that your system follows some of
> the stricter principles you outlined, you'll have to avoid use of
> (changing) validation functions in your applications. Or at least be
> very thoughtful about code roll-outs.
That's assuming that there is such a thing as a roll-out. If your code
is replicated with your data, which it must be with design doc
functions because they are applicable to the db in which they reside,
then I can't see an effective way to use those features with mesh
deployments.
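
To make the ordering problem concrete, here is a toy Python model. It is
not CouchDB code: replicate(), requires_author() and the update tuples
are hypothetical names invented for illustration, and the only assumption
is that a node checks each incoming doc against whatever validation
function it holds at that moment.

```python
# Toy model only; none of these names are CouchDB APIs.
def replicate(updates):
    """Apply replicated updates in arrival order and return the stored docs."""
    docs = {}
    validate = None  # no validation function until a design doc arrives
    for kind, doc_id, body in updates:
        if kind == "design":
            validate = body  # later writes are checked by this function
        elif validate is None or validate(body):
            docs[doc_id] = body  # accepted
        # otherwise the doc is rejected and never stored on this node
    return docs


def requires_author(doc):
    """Hypothetical validation rule: every doc must carry an 'author' field."""
    return "author" in doc


design = ("design", "_design/app", requires_author)
post = ("doc", "post-1", {"title": "hello"})  # has no 'author' field

node_a = replicate([post, design])  # doc arrives before the validation function
node_b = replicate([design, post])  # validation function arrives first

print(node_a)
print(node_b)
```

The two nodes see the same two updates, but node_a keeps post-1 and
node_b rejects it, so they do not converge.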
An alternative might be to regard these non-functional features as
being a layer above the canonical store maintained by replication,
something like a view. So replication would never block, and the
underlying model would always guarantee that a global steady state is
reachable regardless of ordering.
And thinking of your 'newly-invalid' docs view, that feels like the
same kind of thing I'm suggesting.
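
A rough sketch of that layering, again as a toy Python model rather than
anything CouchDB provides: replicate() and invalid_docs() are hypothetical
names, and the design doc is modelled as a bare function stored under an
assumed id.

```python
# Toy model only; none of these names are CouchDB APIs.
DESIGN_ID = "_design/app"  # assumed id for the doc holding the validation function


def replicate(updates):
    """Canonical store: accepts every update, so replication never blocks."""
    store = {}
    for doc_id, body in updates:
        store[doc_id] = body
    return store


def invalid_docs(store):
    """Derived layer: docs that fail the currently stored validation function."""
    validate = store.get(DESIGN_ID)
    if validate is None:
        return {}
    return {
        doc_id: doc
        for doc_id, doc in store.items()
        if doc_id != DESIGN_ID and not validate(doc)
    }


def requires_author(doc):
    """Hypothetical validation rule: every doc must carry an 'author' field."""
    return "author" in doc


design = (DESIGN_ID, requires_author)
post = ("post-1", {"title": "hello"})  # has no 'author' field

node_a = replicate([post, design])
node_b = replicate([design, post])

print(node_a == node_b)
print(invalid_docs(node_a) == invalid_docs(node_b))
```

Both nodes end up with the same store whatever the arrival order, and the
invalid docs are reported by the derived layer instead of being dropped.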
I haven't thought more than that about what a solution would look
like, and I'm not sure at this point if partial replication is an
identical problem or not. My gut feel is that intermingling the
CouchDB-as-application features with CouchDB-as-replicated-document-
store functionality is problematic and requires enormous care, and a
more layered and partitioned approach might be prudent.
Regardless of the issue for meshes, it seems that using validation, or
any other non-functional feature that impacts the canonical data, as
opposed to derived data such as views, opens a real can of worms for
developers. Given your IRC comment about unfortunate memes being
generated by naive developers (cf. single-node transactions), it
seems to me that if the underlying model presented by CouchDB becomes
more difficult to use or requires a more subtle understanding of the
very, very hairy problem of distribution, global state, etc., then the
meme will be 'CouchDB is impossible to get right'.
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
The trouble with the world is that the stupid are cocksure and the
intelligent are full of doubt.
-- Bertrand Russell