incubator-cassandra-user mailing list archives

From Josh Dzielak <j...@keen.io>
Subject Re: Guaranteeing globally unique TimeUUID's in a high throughput distributed system
Date Sat, 16 Mar 2013 22:21:13 GMT
Ahh right on. I'm already using wide rows with a similar row key heuristic (basically YYYYMMDDHH,
pulled from the event_time). So I think I'm good there but hadn't thought about using a mod
instead - any in-practice advantages to that?

Excited to try composite columns for this – sounds ideal. Had a similar idea of concatenating
a UUID onto the event time manually, but this looks like the right, non-janky way to do that.

Would you just use a type 4 UUID then, since the range slicing/querying will be on the event_time
part? Or are there advantages to still using a time UUID with the thread/process uniqueness
tricks you mentioned?

Thanks Philip!  
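For what it's worth, the composite-column idea can be sketched roughly like this (plain Python to illustrate the ordering, not Cassandra client code; `composite_column_name` is a made-up name):

```python
import uuid

# Illustrative sketch: with a composite column name of (event_time, uuid),
# columns sort on the event_time component first, so the UUID part only
# has to break ties. A random type-4 UUID is therefore enough.
def composite_column_name(event_time_ms):
    """Build a (event_time, uuid4) composite column name."""
    return (event_time_ms, uuid.uuid4())

# Two events with an identical millisecond timestamp still get distinct
# column names, so one insert no longer overwrites the other.
a = composite_column_name(1363467790212)
b = composite_column_name(1363467790212)
assert a != b and a[0] == b[0]

# Range slices over a time window only compare the first component:
# sorting orders by event_time regardless of the UUID values.
names = sorted(composite_column_name(t) for t in (300, 100, 200))
assert [n[0] for n in names] == [100, 200, 300]
```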

On Saturday, March 16, 2013 at 2:56 PM, Philip O'Toole wrote:

> On Sat, Mar 16, 2013 at 2:50 PM, Josh Dzielak <josh@keen.io> wrote:
> > Thanks Philip. I see where you are coming from; that'd be much simpler and
> > avoid these bumps.
> >  
> > The only downside is that I'd have to separately maintain an index of event
> > timestamps that reflected when they happened according to the client. That
> > way when the client asks for 'events last Wednesday' I give them the right
> > answer even if the events were recorded in Cassandra today. I think it's at
> > least worth weighing against the other solution.
> >  
>  
>  
> Way ahead of you. Use wide rows, and use the UUID to create a
> composite column key, like so:
>  
> event_time:UUID
>  
> This guarantees a unique ID for *every* event.
>  
> And use the "event_time % (some interval you choose)" as your row key
> (many events will then have this as their row key). This makes it easy
> to find the events within a given range by performing the modulo math
> on the requested time range (you must choose the interval as part of
> your design, and stick with it). You do not need a secondary index.
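A minimal sketch of that row-key scheme (illustrative Python, not Cassandra client code; it reads the modulo as bucketing each timestamp down to the start of its interval, and the one-hour interval is a hypothetical choice):

```python
# Hypothetical fixed interval: one hour of epoch milliseconds. The scheme
# works for any interval, but it must be chosen once and kept forever.
INTERVAL_MS = 60 * 60 * 1000

def row_key(event_time_ms):
    # Bucket the timestamp to the start of its interval; all events in
    # the same hour land in the same wide row.
    return event_time_ms - (event_time_ms % INTERVAL_MS)

def row_keys_for_range(start_ms, end_ms):
    # "Events last Wednesday" = every bucket overlapping the requested
    # range, computed with the same modulo math -- no secondary index.
    return list(range(row_key(start_ms), end_ms + 1, INTERVAL_MS))
```

For a query from 00:30 to 02:30 this yields the 00:00, 01:00, and 02:00 buckets.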
>  
> >  
> > On Saturday, March 16, 2013 at 2:40 PM, Philip O'Toole wrote:
> >  
> > On Sat, Mar 16, 2013 at 2:24 PM, Josh Dzielak <josh@keen.io> wrote:
> >  
> > I have a system where a client sends me arbitrary JSON events containing a
> > timestamp at millisecond resolution. The timestamp is used to generate
> > column names of type TimeUUIDType.
> >  
> > The problem I run into is this - if a client sends me 2 events with the same
> > timestamp, the TimeUUID that gets generated for each is the same, and we get
> > 1 insert and 1 update instead of 2 inserts. I might be running many
> > processes (in my case Storm supervisors) on the same node, so the
> > machine-specific part of the UUID doesn't help.
> >  
> > I have noticed how the Cassandra UUIDGen class lets you work around this. It
> > has a 'createTimeSafe' method that adds extra precision to the timestamp
> > such that you can actually get up to 10k unique UUID's for the same
> > millisecond. That works pretty well for a single process (although it's
> > still possible to go over 10k, it's unlikely in our actual production
> > scenario). It does make searches at boundary conditions a little
> > unpredictable – 'equal' may or may not work depending on whether extra ns
> > intervals were added – but I can live with that.
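The "extra precision" trick can be sketched roughly like this (plain Python mirroring the concept, not UUIDGen's actual code; `unique_100ns_timestamp` is a made-up name):

```python
import threading

# RFC 4122 time-based UUIDs tick in 100 ns units, so each millisecond
# contains 10,000 distinct timestamp slots. A per-process counter hands
# them out; the lock only makes it safe across threads in one process.
_lock = threading.Lock()
_last_ms = -1
_slot = 0

def unique_100ns_timestamp(now_ms):
    """Return a distinct 100 ns-unit timestamp for a given millisecond."""
    global _last_ms, _slot
    with _lock:
        if now_ms == _last_ms:
            _slot += 1              # same millisecond: use the next slot
        else:
            _last_ms, _slot = now_ms, 0
        if _slot >= 10_000:         # >10k events in one ms would collide
            raise RuntimeError("out of 100 ns slots for this millisecond")
        return now_ms * 10_000 + _slot
```

Note the sketch resets the slot counter whenever the millisecond changes, so it is only safe when timestamps always increase – which is exactly why, as described below, it breaks down for client-supplied times.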
> >  
> > However, this still leaves vulnerability across a distributed system. If 2
> > events arrive in 2 processes at the exact same millisecond, one will
> > overwrite the other. If events keep flowing to each process evenly over the
> > course of the millisecond, we'll be left with roughly half the events we
> > should have. To work around this, I add a distinct 'component id' to my row
> > keys that roughly equates to a Storm worker or a JVM process I can cheaply
> > synchronize.
> >  
> > The real problem is that this trick of adding ns intervals only works when
> > you are generating timestamps from the current time (or any time that's
> > always increasing). As I mentioned before, my client might be providing a
> > past or future timestamp, and I have to find a way to make sure each one is
> > unique.
> >  
> > For example, a client might send me 10k events with the same millisecond
> > timestamp today, and 10k again tomorrow. Using the standard Java library
> > stuff to generate UUID's, I'd end up with only 1 event stored, not 20,000.
> > The warning in UUIDGen.getTimeUUIDBytes is clear about this.
> >  
> >  
> > It is a mistake, IMHO, to use the timestamp contained within the event
> > to generate the time-based UUID. While it will work, it suffers from
> > exactly the problem you describe. Instead, use the clock of the host
> > system to generate the timestamp. In other words, the event timestamp
> > may be different from the timestamp in the UUID. In fact, it *will* be
> > different, if the rate gets fast enough (since the 100ns period clock
> > used to generate time-based UUIDs may not be fine-grained enough, and
> > the UUID timestamp will increase as explained by RFC4122).
> >  
> >  
> > Adapting the ns-adding 'trick' to this problem requires synchronized
> > external state (i.e. storing that the current ns interval for millisecond
> > 12330982383 is 1234, etc) - definitely a non-starter.
> >  
> > So, my dear, and far more seasoned Cassandra users, do you have any
> > suggestions for me?
> >  
> > Should I drop TimeUUID altogether and just make column names a combination
> > of millisecond and a big enough random part to be safe? e.g.
> > '1363467790212-a6c334fefda'. Would I be able to run proper slice queries if
> > I did this? What other problems might crop up? (It seems too easy :)
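For what it's worth, the millisecond-plus-random-suffix idea does slice cleanly if the comparator sorts lexicographically and the millisecond part is zero-padded to a fixed width – otherwise "999" sorts after "1000". A sketch (illustrative Python; the 16-digit width is an assumption):

```python
import os

# Illustrative sketch: string column names of the form "<ms>-<random>".
# For range slices to match chronological order under a lexicographic
# (ASCII/UTF-8) comparator, the millisecond part must be fixed-width;
# the 16-digit zero padding here is an assumed choice.
def column_name(event_time_ms):
    rand = os.urandom(6).hex()   # random suffix avoids same-ms collisions
    return f"{event_time_ms:016d}-{rand}"

# Zero padding keeps lexicographic order equal to chronological order.
names = sorted(column_name(t) for t in (999, 1000, 500))
assert [int(n.split("-")[0]) for n in names] == [500, 999, 1000]
```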
> >  
> > Or should I just create a normal random UUID for every event as the column
> > key and create the non-unique index by time in some other way?
> >  
> > Would appreciate any thoughts, suggestions, and off-the-wall ideas!
> >  
> > PS- I assume this could be a problem in any system (not just Cassandra)
> > where you want to use 'time' as a unique index yet might have multiple
> > records for the same time. So any solutions from other realms could be
> > useful too.
> >  
> > --
> > Josh Dzielak
> > VP Engineering • Keen IO
> > Twitter • @dzello
> > Mobile • 773-540-5264
> >  
>  
>  
>  


