incubator-cassandra-user mailing list archives

From David Boxenhorn <da...@lookin2.com>
Subject Re: UUIDs whose alphanumeric order is the same as their chronological order
Date Wed, 23 Jun 2010 06:54:44 GMT
Having a physical location encoded in the UUID *increases* the chance of a
collision, because it means fewer random bits. There definitely will be more
than one UUID created in the same clock unit on the same machine! The same
bits that you use to encode your few servers can be used for over 100
trillion random numbers!
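
To put a number on the "over 100 trillion" figure: the node field of a version 1 UUID is 48 bits wide (RFC 4122), so it can hold 2^48 distinct values. The sketch below is only that arithmetic, plus a standard birthday-bound approximation. The function name `collision_probability` is made up for illustration, and it models purely random values rather than the full version 1 scheme with its timestamp and clock sequence.

```python
import math

NODE_BITS = 48  # width of the node field in a version 1 UUID (RFC 4122)

# Number of distinct values those 48 bits could hold if they were random.
print(f"{2 ** NODE_BITS:,}")  # 281,474,976,710,656 (about 281 trillion)


def collision_probability(n, bits):
    """Birthday-bound approximation (hypothetical helper): chance that n
    uniformly random values of the given bit width contain a duplicate."""
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))


# More random bits means a lower chance of a duplicate for the same n.
print(collision_probability(1000, 32))
print(collision_probability(1000, 48))
```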

"As to ordering, if you wanted to use time-uuids, comparators that do
give time-based ordering are trivial, and no slower than lexical
sorting."

"No slower" isn't a good reason to use it! I am willing to take a
(reasonable) time *penalty* to use lexically ordered UUIDs that will work
both in Cassandra and Oracle (and which are human-readable - always good for
debugging)!
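
As an illustration of what a lexically ordered, human-readable ID could look like, here is a minimal sketch. It is not necessarily the scheme discussed in this thread: the name `sortable_id`, the 48-bit millisecond timestamp, and the 80 random bits are all assumptions chosen for the example.

```python
import os
import time


def sortable_id(now_ms=None):
    """Hypothetical ID: 12 hex chars of millisecond timestamp followed by
    20 hex chars (80 bits) of randomness from the OS."""
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    # Fixed-width, zero-padded lowercase hex, so plain byte-wise string
    # comparison agrees with numeric order of the timestamp.
    return f"{now_ms:012x}{os.urandom(10).hex()}"


ids = [sortable_id(2000), sortable_id(1000), sortable_id(3000)]
print(sorted(ids))  # ordered by the timestamp prefix
```

Because the value is an ordinary fixed-width string, any store that compares strings byte by byte orders it chronologically, to within the millisecond fuzziness mentioned in the quoted message below. IDs created within the same millisecond fall back to the order of their random suffixes.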

I am also willing to take a reasonable penalty to avoid using weird
third-party code for generating UUIDs in the first place.

On Tue, Jun 22, 2010 at 10:05 PM, Tatu Saloranta <tsaloranta@gmail.com> wrote:

> On Tue, Jun 22, 2010 at 9:12 AM, David Boxenhorn <david@lookin2.com> wrote:
> > A little bit of time fuzziness on the order of a few milliseconds is fine
> > with me. This is user-generated data, so it only has to be time-ordered at
> > the level that a user can perceive.
>
> Ok, so mostly ordered. :-)
>
> > I have no worries about my solution working - I'm sure it will work. I just
> > wonder if TimeUUIDType isn't superior for some reason that I don't know
> > about. (TimeUUIDType seems so bad in so many ways that I wonder why anyone
> > uses it. There must be some reason!)
>
> I think that, rationally, a random-number-based UUID is the best,
> provided one has a good random number generator.
> But there is something intuitive about using a location +
> time-based alternative instead, given the tiny chance of collision that
> any (pseudo) random number based system has.
> So it just seems intuitively safer to use time-uuids, I think -- it
> isn't, it just feels that way. :-)
>
> A secondary reason is probably the ordering, and the desire to stay
> standards compliant.
> As to ordering, if you wanted to use time-uuids, comparators that do
> give time-based ordering are trivial, and no slower than lexical
> sorting.
> Java Uuid Generator (2.0) defaults to such a comparator, as I agree that
> this makes more sense than whatever sorting you would otherwise get.
> It is unfortunate that the clock chunks are ordered in a weird way by the
> UUID specification; there is no reason it couldn't have been done the
> "right way" so that the hex representation would sort nicely.
>
> -+ Tatu +-
>
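
The ordering issue raised in the quoted message can be seen with Python's standard `uuid` module. A version 1 UUID stores the low 32 bits of its timestamp first, so the hex form does not sort chronologically, while a comparator keyed on the reassembled 60-bit timestamp does. This is a sketch only: the helper names `make_v1` and `time_ordered_hex`, the node value, and the timestamps are made up for illustration, and it says nothing about how Java Uuid Generator or TimeUUIDType implement their comparators.

```python
import uuid


def make_v1(timestamp, node=0x123456789ABC):
    """Hypothetical helper: build a version 1 UUID from a 60-bit timestamp."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | 0x1000  # version 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0x00, node))


def time_ordered_hex(u):
    """Hypothetical 'right way' rendering: timestamp most significant first."""
    return f"{u.time:015x}-{u.clock_seq:04x}-{u.node:012x}"


earlier = make_v1(0x1FFFFFFFF)  # time_low = ffffffff, time_mid = 0001
later = make_v1(0x200000000)    # time_low = 00000000, time_mid = 0002

# Lexical order of the standard hex form puts the later UUID first.
print(sorted([earlier, later], key=str))

# A comparator keyed on the timestamp restores chronological order.
print(sorted([later, earlier], key=lambda u: u.time))

# Reordering the fields makes plain string order chronological as well.
print(sorted([later, earlier], key=time_ordered_hex))
```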
