incubator-cassandra-dev mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Row vs CF
Date Fri, 24 Apr 2009 17:49:47 GMT
I'm unconvinced on both counts -- that analytics somehow implies that
CF locality is no longer important, and that doing a request per CF
would be unacceptable.

On Fri, Apr 24, 2009 at 12:13 PM, Jun Rao <junrao@almaden.ibm.com> wrote:
> There are definitely cases where you want to read a full row, for example
> in batch jobs that do analytics.
>
> In fact, Table used to have a method that reads a full row and it's used in
> test.DataImporter. Apparently, that code is broken now.
>
> Jun
> IBM Almaden Research Center
> K55/B1, 650 Harry Road, San Jose, CA 95120-6099
>
> junrao@almaden.ibm.com
>
> Jonathan Ellis <jbe@familyellis.org>
> Sent by: jbellis@gmail.com
> Date: 04/22/2009 08:54 AM
> Please respond to: cassandra-dev@incubator.apache.org
> To: cassandra-dev@incubator.apache.org
> Subject: Row vs CF
>
> In a bunch of places in the code we wrap a CF in a Row object,
> basically a key + multiple CFs.  But currently only a single
> ColumnFamily will ever be in a Row object.  (At least in the Rows
> involved in a client read op.  Maybe Rows are used internally in other
> places with multiple CFs.  But I am concerned with the read path
> here.)
>
> Is this an example where we should apply YAGNI?
> (http://en.wikipedia.org/wiki/You_Ain%27t_Gonna_Need_It)  It seems to
> me that if the definition of a CF is, "this is data that is logically
> or otherwise related" then adding an API to request multiple CFs at
> once is unnecessary.  (If you really need data from multiple CFs
> frequently, your data model is broken and you should combine the CFs;
> if you need it infrequently, the overhead from doing multiple queries
> is not a big deal.)
>
> Thoughts?
>
> -Jonathan
>
>
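
The trade-off discussed in this thread can be sketched in a few lines of Java. This is an illustration only and not Cassandra's actual code: the `ColumnFamily`, `Row`, and `Store` classes, the `readColumnFamily` and `readFullRow` methods, and the CF names are all hypothetical. It assumes a read path that returns one CF per request, and shows that a "full row" (a key plus several CFs) can still be assembled by a caller, such as a batch job, at the cost of one request per CF.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowVsCfSketch {
    /** Hypothetical stand-in for a column family: a name plus column name -> value. */
    static final class ColumnFamily {
        final String name;
        final Map<String, String> columns = new LinkedHashMap<>();
        ColumnFamily(String name) { this.name = name; }
    }

    /** Hypothetical Row as described in the thread: a key plus multiple CFs. */
    static final class Row {
        final String key;
        final Map<String, ColumnFamily> columnFamilies = new LinkedHashMap<>();
        Row(String key) { this.key = key; }
    }

    /** Toy in-memory store laid out as cfName -> (rowKey -> ColumnFamily). */
    static final class Store {
        private final Map<String, Map<String, ColumnFamily>> data = new LinkedHashMap<>();
        int readCount = 0; // counts single-CF requests, to make the overhead visible

        void put(String cfName, String key, String column, String value) {
            data.computeIfAbsent(cfName, n -> new LinkedHashMap<>())
                .computeIfAbsent(key, k -> new ColumnFamily(cfName))
                .columns.put(column, value);
        }

        /** Single-CF read: one request returns at most one CF, or null if absent. */
        ColumnFamily readColumnFamily(String key, String cfName) {
            readCount++;
            Map<String, ColumnFamily> rows = data.get(cfName);
            return rows == null ? null : rows.get(key);
        }
    }

    /** Full-row read built from per-CF requests instead of a multi-CF API. */
    static Row readFullRow(Store store, String key, List<String> cfNames) {
        Row row = new Row(key);
        for (String cfName : cfNames) {
            ColumnFamily cf = store.readColumnFamily(key, cfName);
            if (cf != null) {
                row.columnFamilies.put(cfName, cf);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.put("Profile", "user1", "name", "Ann");
        store.put("Activity", "user1", "lastLogin", "2009-04-24");

        Row row = readFullRow(store, "user1", Arrays.asList("Profile", "Activity"));
        System.out.println("CFs in row: " + row.columnFamilies.size()
                + ", requests issued: " + store.readCount);
    }
}
```

In this sketch the number of requests grows linearly with the number of CFs read. Whether that overhead is acceptable for a given workload is exactly the point under debate above.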
