incubator-cassandra-user mailing list archives

From Héctor Izquierdo Seliva <izquie...@strands.com>
Subject Re: millions of columns in a row vs millions of rows with one column
Date Tue, 22 Feb 2011 07:49:17 GMT
On Tue, 22-02-2011 at 08:49 +1300, Aaron Morton wrote:
> My preference is to go with more rows as it distributes load better. But the best design
> is the one that supports your read patterns.
> 
> See http://wiki.apache.org/cassandra/LargeDataSetConsiderations for background.
> 
> Aaron
> 

Those rows are distributed among the three replicas, so my thought was
that I could get away with it and still have a more or less balanced
cluster. Anyway, the columns I read are not contiguous, so the effect on
I/O is the same as having individual rows, right? Cassandra still has to
seek to the position of each column within the row.
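
To make the balance question concrete, here is a minimal sketch, not
Cassandra's actual placement code. It assumes evenly spaced tokens, an
MD5-based token loosely in the spirit of RandomPartitioner, and replicas
placed on the next RF-1 nodes clockwise. The names token, replicas and
load_per_node are made up for illustration; the node count and RF are the
ones from my original post, and the row and column counts are arbitrary.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

NODES = 6        # cluster size from the original post
RF = 3           # replication factor from the original post
RING = 2 ** 127  # token space size (assumed, RandomPartitioner-like)

# Assumption: one evenly spaced token per node.
TOKENS = [i * RING // NODES for i in range(NODES)]


def token(key):
    """Map a row key to a ring position using MD5 (a simplification)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING


def replicas(key):
    """Return the owning node plus the next RF-1 nodes clockwise."""
    first = bisect_left(TOKENS, token(key)) % NODES
    return [(first + i) % NODES for i in range(RF)]


def load_per_node(rows):
    """Sum stored columns per node; rows maps row key -> column count."""
    load = Counter()
    for key, columns in rows.items():
        for node in replicas(key):
            load[node] += columns
    return load


# Same total number of columns, laid out two ways.
wide_rows = {"wide-%d" % i: 10_000 for i in range(3)}   # few wide rows
narrow_rows = {"row-%d" % i: 1 for i in range(30_000)}  # many small rows

wide_load = load_per_node(wide_rows)
narrow_load = load_per_node(narrow_rows)

for node in range(NODES):
    print(node, wide_load.get(node, 0), narrow_load.get(node, 0))
```

With only a handful of wide rows the per-node totals depend entirely on
where those few keys hash, while many small rows even out. This only
models where the data lands; it says nothing about the seek cost per
column.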

How much space does the key cache use per row? Going with individual
rows would increase the number of rows by a big factor.

> On 22/02/2011, at 3:56 AM, Héctor Izquierdo Seliva <izquierdo@strands.com> wrote:
> 
> > Hi Everyone.
> > 
> > I'm testing performance differences of millions of columns in a row vs
> > millions of rows. So far it seems wide rows perform better in terms of
> > reads, but there can be potentially hundreds of millions of columns in a
> > row. Is this going to be a problem? Should I go with individual rows? I
> > run 6 nodes with 7.2 and a RF=3.
> > 
> > Thanks for your help!
> > 


