incubator-cassandra-user mailing list archives

From Edmond Lau <edm...@ooyala.com>
Subject Re: repeated timeouts on quorum reads
Date Thu, 22 Oct 2009 23:30:10 GMT
Thanks for the help, Jonathan.  Given that the current implementation
isn't optimized for large supercolumns, and that the current Thrift API
doesn't support slicing a set of columns across multiple supercolumns
of the same row anyway, I agree that I'd be better off just folding my
supercolumns into separate row keys.
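
As a rough illustration of that folding, here is a minimal sketch in
plain Python. It uses nested dicts rather than the Cassandra Thrift
API, and the names (SEPARATOR, composite_key, fold_supercolumns,
split_key) as well as the ":" delimiter are assumptions made up for
this example. Each (row key, supercolumn) pair becomes its own row
key, so a read only has to deserialize the subcolumns of that one row.

```python
from typing import Dict, Tuple

# Hypothetical delimiter for this sketch; it must not occur in row keys.
SEPARATOR = ":"

SubColumns = Dict[str, str]
SuperRow = Dict[str, SubColumns]


def composite_key(row_key: str, supercolumn: str) -> str:
    """Build a new row key from the old row key and a supercolumn name."""
    if SEPARATOR in row_key:
        raise ValueError("row key must not contain the separator")
    return f"{row_key}{SEPARATOR}{supercolumn}"


def split_key(key: str) -> Tuple[str, str]:
    """Recover (row key, supercolumn name) from a composite key."""
    row_key, supercolumn = key.split(SEPARATOR, 1)
    return row_key, supercolumn


def fold_supercolumns(rows: Dict[str, SuperRow]) -> Dict[str, SubColumns]:
    """Turn {row: {supercolumn: {subcolumn: value}}} into
    {row + SEPARATOR + supercolumn: {subcolumn: value}}."""
    folded: Dict[str, SubColumns] = {}
    for row_key, supercolumns in rows.items():
        for sc_name, subcolumns in supercolumns.items():
            folded[composite_key(row_key, sc_name)] = dict(subcolumns)
    return folded


if __name__ == "__main__":
    rows = {"video42": {"2009-10-21": {"plays": "10"},
                        "2009-10-22": {"plays": "7", "pauses": "2"}}}
    for key, columns in fold_supercolumns(rows).items():
        print(key, columns)
```

The trade-off is that reading everything that used to live under one
row key now takes several row lookups instead of one.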

That's actually what my colleague has been doing for his HBase data
model since HBase doesn't have supercolumns; we're currently
evaluating Cassandra and HBase to see which one we should
productionize.

Edmond

On Thu, Oct 22, 2009 at 3:34 PM, Jonathan Ellis <jbellis@gmail.com> wrote:
> Okay, so the fundamental problem is that deserializing a supercolumn
> with 30k subcolumns is really really slow. (Like we say on
> http://wiki.apache.org/cassandra/CassandraLimitations, "avoid a data
> model that requires large numbers of subcolumns.")
>
> But we were also being needlessly inefficient after deserialization;
> I've attached a patch (against trunk) to
> https://issues.apache.org/jira/browse/CASSANDRA-510.  This gives a
> 30-50% improvement in my tests.
>
> You're looking for more like an order of magnitude improvement though,
> so I would say splitting each supercolumn off into its own row is
> probably the way to go.
>
> -Jonathan
>
