couchdb-dev mailing list archives

From Adam Kocoloski <>
Subject Re: format of database sequence
Date Tue, 26 Oct 2010 20:33:22 GMT
On Oct 26, 2010, at 3:23 PM, Paul Davis wrote:

> On Tue, Oct 26, 2010 at 3:06 PM, Randall Leeds <> wrote:
>> I don't see any way to avoid this and we've been talking about it for a while.
>> +1
>> On Tue, Oct 26, 2010 at 11:59, Adam Kocoloski <> wrote:
>>> Hi all, I've been meaning to bring this up for a while.  CouchDB uses integer
>>> sequence numbers in the _changes feed and update_seq values, but I don't see any
>>> sensible way to preserve that interface in BigCouch.  The database sequence in
>>> BigCouch needs to combine the sequences of several database shards; currently
>>> it's a string formatted like
>>> "1234-Base64Data"
>>> The first piece is the sum of the shard sequence numbers and is not actually
>>> used by BigCouch.  The second piece is the actual data about the state of the
>>> cluster.  This format causes a couple of issues:
>>> 1) the replicator occasionally sorts sequence numbers and when it does so, it
>>> sorts the BigCouch ones lexicographically and concludes that e.g. "99-..." is
>>> the only checkpoint it will ever need to store.
>>> 2) client libraries might not treat the sequence as an opaque data type and may
>>> break when operating against a BigCouch.
>>> My personal preference would be to change the format of the Apache CouchDB
>>> sequence to a string at the next major release.  Thoughts?
>>> Adam
> Is there a possibility to retain a guarantee of triangle inequality
> for update sequences? In my experience people are mostly using them to
> detect an ordering. It seems like if we could give a loose ordering,
> that'd be better than just an opaque type. Though, I'm also not sure
> how that'd work in relation to string comparison operators in most
> languages.
> I'm +1 on the switch, I'm just wondering if we can do it without
> making them completely opaque.
> Paul Davis

Logging some IRC discussion.  In my opinion the easiest way to get the cluster sequence to
sort nicely is to format it as a JSON array instead of a string:

[1234, "Base64Data"]
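
To make the difference concrete, here is a minimal Python sketch.  The sequence values and payloads are made up, and Python's built-in comparison only stands in for whatever a client library or the replicator actually does:

```python
# Hypothetical sequence values; the payloads are placeholders, not real
# BigCouch cluster-state data.
string_seqs = ["99-AAAA", "1234-BBBB", "100-CCCC"]
array_seqs = [[99, "AAAA"], [1234, "BBBB"], [100, "CCCC"]]

# Strings compare character by character, so "99-..." wins because
# '9' sorts after '1'.
highest_string = max(string_seqs)

# Lists compare element by element, so the integer prefix is compared
# numerically before the payload is ever looked at.
highest_array = max(array_seqs)

print(highest_string)
print(highest_array)
```

With the string form the "highest" value is the "99-..." one, which is the replicator problem described earlier in the thread; with the array form the integer prefix decides.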

Now, to be clear, sorting of the sequences is not really something that should be encouraged,
and in fact should probably be actively discouraged.  Moreover, a proper sorting of
two sequences would need to inspect the Base64Data (a compressed vector clock) of each sequence
and do the usual vclock comparison logic.  Sorting on the prefixes is only a shorthand that
works when the same shards are used to generate both sequences.
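
As a rough sketch of what that comparison might look like, the Python below compares two clocks that have already been decoded.  The decoded shape (a dict mapping a shard name to that shard's sequence number), the function name, and treating a missing shard as 0 are all assumptions for illustration; the thread does not describe the real encoding of the Base64Data.

```python
from typing import Dict


def compare_clocks(a: Dict[str, int], b: Dict[str, int]) -> str:
    """Compare two decoded clocks (hypothetical shard -> sequence maps).

    Returns "before", "after", "equal", or "concurrent".
    A shard missing from one clock is treated as 0 (an assumption).
    """
    shards = set(a) | set(b)
    a_ahead = any(a.get(s, 0) > b.get(s, 0) for s in shards)
    b_ahead = any(b.get(s, 0) > a.get(s, 0) for s in shards)
    if a_ahead and b_ahead:
        return "concurrent"  # neither clock descends from the other
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


older = {"shard1": 10, "shard2": 5}
newer = {"shard1": 12, "shard2": 5}
mixed = {"shard1": 9, "shard2": 8}  # same sum as `newer`, different state

print(compare_clocks(older, newer))
print(compare_clocks(newer, mixed))
```

Here `newer` and `mixed` both sum to 17, so a summed prefix would call them equal even though neither one descends from the other.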

rnewson advocated doing away with the prefix altogether and making it completely opaque. 
BigCouch used to work this way; we added the prefix because we found it convenient to get
a rough idea of the number of updates the DB had seen, but it's hardly necessary.  If we did
make it opaque we'd certainly need to fix the replicator so that it saves checkpoints regardless
of sorting order.
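
One way to picture a checkpoint that does not depend on sorting is to keep whatever sequence was processed most recently.  This Python sketch is not how the CouchDB replicator is written; the class name and the assumption that changes are handled in the order the feed delivers them are mine:

```python
class CheckpointTracker:
    """Hypothetical tracker that treats sequences as opaque values."""

    def __init__(self):
        self.last_seq = None

    def record(self, seq):
        # No comparison at all: the latest processed sequence is the checkpoint.
        self.last_seq = seq


# Made-up sequences, in the order a feed might deliver them.
feed = ["99-AAAA", "100-CCCC", "1234-BBBB"]

tracker = CheckpointTracker()
for seq in feed:
    tracker.record(seq)

checkpoint = tracker.last_seq
sorted_guess = max(feed)  # what a lexicographic sort would have picked

print(checkpoint)
print(sorted_guess)
```

The tracker ends on the last sequence delivered, while the lexicographic maximum stays stuck on "99-...".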

