cassandra-user mailing list archives

From Data Craftsman <database.crafts...@gmail.com>
Subject Re: Wide row column slicing - row size shard limit
Date Thu, 16 Feb 2012 23:41:18 GMT
Hi Aaron Morton and R. Verlangen,

Thanks for the quick answer. It's good to know Thrift's limit on the amount
of data it will accept / send.

I know the hard limit is 2 billion columns per row. My question is at what
row size read/write performance and maintenance start to slow down. The blog
I referenced said the row size should be less than 10MB.
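
To put the 10MB guideline in perspective, here is a rough back-of-the-envelope
sketch of how quickly a single unsharded row would reach it. The write rate
and the 50 bytes per column (name, value, and overhead) are illustrative
assumptions, not measured Cassandra figures.

```python
def seconds_until_row_reaches(limit_bytes, writes_per_second, bytes_per_column):
    """Estimate how long one row takes to grow to limit_bytes.

    Assumes each write adds exactly one new column of a fixed size,
    which is a simplification of real on-disk storage.
    """
    return limit_bytes / (writes_per_second * bytes_per_column)


# Hypothetical sensor: 200 writes per second, about 50 bytes per column.
TEN_MB = 10 * 1024 * 1024
seconds = seconds_until_row_reaches(TEN_MB, writes_per_second=200, bytes_per_column=50)
print(f"Row reaches 10MB after about {seconds / 60:.1f} minutes")
```

Under these assumptions the row passes 10MB in well under an hour, which is
why the row size matters for high-frequency time series.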

It would be better if Cassandra could transparently shard/split a wide row
and distribute the pieces across many nodes, to help with load balancing.

Are there any other ways to model historical data
(or time-series data) besides wide row column slicing in Cassandra?
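
For reference, the manual sharding discussed in this thread is usually done by
adding a time bucket to the row key, so each row holds only one bucket's worth
of columns. Below is a minimal sketch of that idea. The function names, the
"sensor_id:bucket" key format, and the hour/day bucket sizes are hypothetical
choices for illustration, not part of any Cassandra API.

```python
from datetime import datetime, timedelta

# Hypothetical bucket formats: one row per sensor per hour or per day.
BUCKET_FORMATS = {"hour": "%Y%m%d%H", "day": "%Y%m%d"}


def shard_row_key(sensor_id, ts, bucket="day"):
    """Build a row key that confines a row to a single time bucket."""
    return f"{sensor_id}:{ts.strftime(BUCKET_FORMATS[bucket])}"


def row_keys_for_range(sensor_id, start, end, bucket="day"):
    """List every row key a reader must query to cover [start, end]."""
    step = timedelta(hours=1) if bucket == "hour" else timedelta(days=1)
    # Truncate the start time to the beginning of its bucket.
    current = start.replace(minute=0, second=0, microsecond=0)
    if bucket == "day":
        current = current.replace(hour=0)
    keys = []
    while current <= end:
        keys.append(shard_row_key(sensor_id, current, bucket))
        current += step
    return keys


# A write goes to the row for its own bucket.
print(shard_row_key("sensor42", datetime(2012, 2, 16, 23, 41, 18)))
# A range read slices columns from each row key in turn.
print(row_keys_for_range("sensor42", datetime(2012, 2, 15, 22, 0), datetime(2012, 2, 17, 1, 0)))
```

The trade-off is that a range read may touch several rows, but each row stays
bounded in size and different buckets can land on different nodes.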

Thanks,
Charlie | Data Solution Architect Developer
http://mujiang.blogspot.com



On Thu, Feb 16, 2012 at 12:38 AM, aaron morton <aaron@thelastpickle.com> wrote:

> > Based on this blog of Basic Time Series with Cassandra data modeling,
> > http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
> I've not read that one but it sounds right. Matt Dennis knows his stuff
> http://www.slideshare.net/mattdennis/cassandra-nyc-2011-data-modeling
>
> > There is a limit on how big the row size can be before slowing down the
> update and query performance, that is 10MB or less.
> There is no hard limit. Wide rows won't upset writes too much. Some read
> queries can avoid problems but most will not.
>
> Wide rows are a pain when it comes to maintenance.  They take longer to
> compact and repair.
>
> > Is this still true in Cassandra latest version? or in what release
> Cassandra will remove this limit?
> There is a limit of 2 billion columns per row. There is not a limit of
> 10MB per row. I've seen some rows in the 100s of MB and they are always a
> pain.
>
> > Manually sharding the wide row will increase the application complexity,
> it would be better if Cassandra can handle it transparently.
> it's not that hard :)
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 16/02/2012, at 7:40 AM, Data Craftsman wrote:
>
> > Hello experts,
> >
> > Based on this blog of Basic Time Series with Cassandra data modeling,
> > http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
> >
> > "This (wide row column slicing) works well enough for a while, but over
> time, this row will get very large. If you are storing sensor data that
> updates hundreds of times per second, that row will quickly become gigantic
> and unusable. The answer to that is to shard the data up in some way"
> >
> > There is a limit on how big the row size can be before slowing down the
> update and query performance, that is 10MB or less.
> >
> > Is this still true in Cassandra latest version? or in what release
> Cassandra will remove this limit?
> >
> > Manually sharding the wide row will increase the application complexity,
> it would be better if Cassandra can handle it transparently.
> >
> > Thanks,
> > Charlie | DBA & Developer
> >
> >
> > p.s. Quora link,
> >
> http://www.quora.com/Cassandra-database/What-are-good-ways-to-design-data-model-in-Cassandra-for-historical-data
> >
> >
> >
>
>
