cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6108) Create timeid64 type
Date Mon, 23 Jun 2014 11:34:25 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040646#comment-14040646 ]

Sylvain Lebresne commented on CASSANDRA-6108:
---------------------------------------------

bq. How do we safely manage the life cycle of the ids?

Not easily, for sure. I suppose we could add a per-client "I'm out" message to the protocol
and say that if your clients are not well-behaved and don't send those messages, then, well,
fix them, but that's clearly not ideal. In fact, when I mentioned this idea (which was
in no way a well-thought-out thing), I was really thinking of IDs per connection, not per client,
but that would require a lot more IDs than is necessary for correctness, which is definitely
far from ideal.

All that said, I think it's worth taking a step back on this ticket, on what we need and what
our constraints are. Typically, this ticket per se, to have a timeid64 CQL type, is not, imho,
the most important thing we care about. What we need is a way to make our cell timestamps cluster-wide
unique, both for CASSANDRA-6123 and for CASSANDRA-7056, and we obviously want that new "better"
timestamp to not be overly big. This ticket is more a "by the way, if we add that and it's
more compact than a timeuuid, then let's expose it for columns too".

In particular, I don't think the fact that it's a 64-bit ID should be an absolute requirement.

And in fact, I would suggest that we *seriously* consider just using a timeuuid. Yes, a timeuuid
is 128 bits long, which at face value feels excessive for a per-cell thing. However, we can
relatively easily optimize its storage (at least on disk, and probably in memory with some additional
effort): the "clock and sequence" part of the timeuuid will basically be our per-client unique
ID, for which a per-sstable dictionary should be reasonably efficient, and the timestamp can be stored
as a delta from a per-sstable epoch. Overall, it should be relatively easy to get something
more compact than what we currently have, which is imo a good bar for "acceptable" in terms
of compactness.
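
To make the storage idea above concrete, here is a minimal Python sketch of how a batch of
timeuuids could be split into a shared epoch, a small dictionary and per-value deltas. It is
an illustration only, not Cassandra code: the function names are hypothetical, the batch
stands in for "one sstable", and it assumes the "clock and sequence" part means the
clock-sequence and node fields of a version 1 UUID.

```python
import uuid


def encode_timeuuids(ids):
    """Split version 1 UUIDs into (epoch, dictionary, rows).

    Hypothetical helper: `ids` stands in for the timeuuids of one sstable.
    """
    # Per-batch epoch: the smallest 60-bit UUID timestamp in the batch.
    epoch = min(u.time for u in ids)
    dictionary = []   # distinct (clock_seq, node) pairs, i.e. the per-client part
    index = {}        # (clock_seq, node) -> position in `dictionary`
    rows = []         # one (timestamp delta, dictionary reference) per UUID
    for u in ids:
        key = (u.clock_seq, u.node)
        if key not in index:
            index[key] = len(dictionary)
            dictionary.append(key)
        rows.append((u.time - epoch, index[key]))
    return epoch, dictionary, rows


def decode_timeuuids(epoch, dictionary, rows):
    """Rebuild the original version 1 UUIDs from the compact form."""
    out = []
    for delta, ref in rows:
        t = epoch + delta
        clock_seq, node = dictionary[ref]
        fields = (
            t & 0xFFFFFFFF,           # time_low
            (t >> 32) & 0xFFFF,       # time_mid
            (t >> 48) & 0x0FFF,       # time_hi (version bits set below)
            (clock_seq >> 8) & 0x3F,  # clock_seq_hi (variant bits set below)
            clock_seq & 0xFF,         # clock_seq_low
            node,
        )
        out.append(uuid.UUID(fields=fields, version=1))
    return out
```

With few distinct clients per batch, each row reduces to a small delta plus a small
dictionary reference, both of which suit a variable-length integer encoding; the actual
on-disk layout is left open here.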

But a bonus is that if we do that systematically for timeuuid, we'll get more compact storage for existing
tables using timeuuid. Basically, instead of inventing something new and asking everyone to use
that from now on (which is pretty painful for users), let's reuse what we have, which is somewhat
standard, and optimize it.  The big advantage is simplicity: for drivers, which won't have
to implement new, potentially complex schemes (all drivers have a timeuuid generator already);
for users, who won't have to use a new type; and probably for us too, as that's probably the
simpler route. As far as I'm concerned, those advantages outweigh the downside of having
a slightly less compact representation than if we were to hand-craft something custom.




> Create timeid64 type
> --------------------
>
>                 Key: CASSANDRA-6108
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6108
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: API, Core
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>             Fix For: 2.1.1
>
>
> As discussed in CASSANDRA-6106, we could create a 64-bit type with 48 bits of timestamp
and 16 bits of unique coordinator id.  This would give us a unique-per-cluster value that
could be used as a more compact replacement for many TimeUUID uses.
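
For reference, the 48 + 16 bit layout in the description could be packed as in the Python
sketch below. The ticket does not specify the details, so several things here are
assumptions: the function names are hypothetical, the timestamp is taken to be in
milliseconds, and it is placed in the high bits so that values sort by time first.

```python
TIMESTAMP_BITS = 48    # from the ticket description
COORDINATOR_BITS = 16  # from the ticket description


def pack_timeid64(timestamp_ms, coordinator_id):
    """Pack a timestamp and a coordinator id into one 64-bit integer.

    Assumption: the timestamp occupies the high 48 bits, the id the low 16.
    """
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in 48 bits")
    if not 0 <= coordinator_id < (1 << COORDINATOR_BITS):
        raise ValueError("coordinator id does not fit in 16 bits")
    return (timestamp_ms << COORDINATOR_BITS) | coordinator_id


def unpack_timeid64(value):
    """Return (timestamp_ms, coordinator_id) from a packed 64-bit value."""
    return value >> COORDINATOR_BITS, value & ((1 << COORDINATOR_BITS) - 1)
```

Under these assumptions, two coordinators writing in the same millisecond still produce
distinct values, but a single coordinator would need its own scheme to avoid reusing a
timestamp, and 16 bits caps the number of distinct ids at 65536.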



--
This message was sent by Atlassian JIRA
(v6.2#6252)
