incubator-cassandra-user mailing list archives

From Colin <colpcl...@gmail.com>
Subject Re: Data model for streaming a large table in real time.
Date Sat, 07 Jun 2014 15:03:12 GMT
I believe ByteOrderedPartitioner is being deprecated, and for good reason. I would look at
what you could achieve by using wide rows and Murmur3Partitioner.
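
A rough sketch of what the wide-row approach might look like, assuming one partition per (bucket, hour) and a timeuuid clustering column. The table name, column names, bucket count, and window size below are hypothetical and not taken from this thread. Each reader owns one bucket and resumes by walking the hourly partitions from its last-seen time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical schema (names are illustrative assumptions):
#   CREATE TABLE stream (
#       bucket int, hour timestamp, id timeuuid, payload blob,
#       PRIMARY KEY ((bucket, hour), id)
#   );
# Murmur3Partitioner hashes (bucket, hour), so partitions spread across
# nodes, while rows inside a partition stay ordered by the timeuuid.

NUM_BUCKETS = 20              # assumed: roughly one bucket per reader
WINDOW = timedelta(hours=1)   # assumed: bounds the width of each row


def window_start(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hourly window."""
    return ts.replace(minute=0, second=0, microsecond=0)


def partition_for(write_seq: int, ts: datetime) -> tuple:
    """Choose the (bucket, hour) partition for a write, round-robin by sequence."""
    return (write_seq % NUM_BUCKETS, window_start(ts))


def partitions_to_read(bucket: int, resume_from: datetime, now: datetime) -> list:
    """List the partitions one reader must scan, oldest first, to catch up."""
    out = []
    w = window_start(resume_from)
    while w <= now:
        out.append((bucket, w))
        w += WINDOW
    return out
```

Within the first partition a reader would filter on id greater than its last-seen timeuuid, then read the later partitions in full.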



--
Colin
320-221-9531


> On Jun 6, 2014, at 5:27 PM, Kevin Burton <burton@spinn3r.com> wrote:
> 
> We have the requirement to have clients read from our tables while they're being written.
> 
> Basically, any write that we make to Cassandra needs to be sent out over the Internet to our customers.
> 
> We also need them to resume so if they go offline, they can just pick up where they left off.
> 
> They need to do this in parallel, so if we have 20 Cassandra nodes, they can have 20 readers each efficiently (and without coordination) reading from our tables.
> 
> Here's how we're planning on doing it.
> 
> We're going to use the ByteOrderedPartitioner.
> 
> I'm writing with a primary key of the timestamp; in practice, however, this would yield hotspots.
> 
> (I'm also aware that time isn't a very good pk in a distributed system, as I can easily have a collision, so we're going to use a scheme similar to a uuid to make it unique per writer).
> 
> One node would take all the load, followed by the next node, etc.
> 
> So my plan to stop this is to prefix a slice ID to the timestamp.  This way each piece of content has a unique ID, but the prefix will place it on a node.
> 
> The slice ID is just a byte… so this means there are 255 buckets in which I can place data.
> 
> This means I can have clients each start with a slice, and a timestamp, and page through the data with tokens.
> 
> This way I can have a client reading with 255 threads from 255 regions in the cluster, in parallel, without any hot spots.
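
As an illustration of the key scheme described above, here is a minimal sketch assuming a key layout of one slice byte, a 64-bit millisecond timestamp, and a writer ID plus sequence number for uniqueness. The field widths and function names are hypothetical. Packing every field big-endian makes byte-wise key order match (slice, time, writer, sequence) order, which is the property a byte-ordered partitioner relies on.

```python
import struct

# Hypothetical key layout (an assumption, not the poster's actual format):
#   1 byte   slice ID         (picks the region of the key space)
#   8 bytes  timestamp in ms  (orders writes within a slice)
#   2 bytes  writer ID        (avoids collisions between writers)
#   2 bytes  sequence number  (avoids collisions within one writer and ms)
KEY_FORMAT = ">BQHH"  # ">" means big-endian, so byte order equals numeric order


def make_key(slice_id: int, timestamp_ms: int, writer_id: int, seq: int) -> bytes:
    """Build a row key that sorts by slice first, then by time."""
    return struct.pack(KEY_FORMAT, slice_id, timestamp_ms, writer_id, seq)


def resume_key(slice_id: int, timestamp_ms: int) -> bytes:
    """Smallest possible key in a slice at a given time; a reader restarts here."""
    return make_key(slice_id, timestamp_ms, 0, 0)
```

A reader assigned to one slice would start at resume_key(slice_id, last_seen_ms) and page forward until the first byte of the returned key changes.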
> 
> Thoughts on this strategy?  
> 
> -- 
> Founder/CEO Spinn3r.com
> Location: San Francisco, CA
> Skype: burtonator
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
> War is peace. Freedom is slavery. Ignorance is strength. Corporations are people.
