incubator-cassandra-dev mailing list archives

From Jun Rao <jun...@almaden.ibm.com>
Subject Re: Row vs CF
Date Fri, 24 Apr 2009 17:13:35 GMT

There are definitely cases where you want to read a full row, for example
batch jobs that do analytics.

In fact, Table used to have a method that read a full row, and it was used in
test.DataImporter. Apparently, that code is broken now.

Jun
IBM Almaden Research Center
K55/B1, 650 Harry Road, San Jose, CA  95120-6099

junrao@almaden.ibm.com



                                                                           
From: Jonathan Ellis <jbe@familyellis.org>
Sent by: jbellis@gmail.com
Date: 04/22/2009 08:54 AM
To: cassandra-dev@incubator.apache.org
Subject: Row vs CF
Please respond to: cassandra-dev@incubator.apache.org

In a bunch of places in the code we wrap a CF in a Row object,
basically a key + multiple CFs.  But currently only a single
ColumnFamily will ever be in a Row object.  (At least in the Rows
involved in a client read op.  Maybe Rows are used internally in other
places with multiple CFs.  But I am concerned with the read path
here.)

Is this an example where we should apply YAGNI?
(http://en.wikipedia.org/wiki/You_Ain%27t_Gonna_Need_It)  It seems to
me that if the definition of a CF is, "this is data that is logically
or otherwise related" then adding an API to request multiple CFs at
once is unnecessary.  (If you really need data from multiple CFs
frequently, your data model is broken and you should combine the CFs;
if you need it infrequently, the overhead from doing multiple queries
is not a big deal.)

Thoughts?

-Jonathan
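
The trade-off discussed in this thread can be illustrated with a minimal sketch. It is not Cassandra's actual code: the class RowSketch, its nested ColumnFamily and Row types, the in-memory STORE, and the methods put, readColumnFamily, and readRow are all hypothetical names invented for illustration. The sketch assumes a Row is just a key plus a map of column families, and shows that a multi-CF (or full-row) read can be assembled from repeated single-CF reads rather than needing a dedicated API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowSketch {
    /** Hypothetical stand-in for a column family: a name plus column -> value. */
    static final class ColumnFamily {
        final String name;
        final Map<String, String> columns = new LinkedHashMap<>();
        ColumnFamily(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for Row: a key plus any number of column families. */
    static final class Row {
        final String key;
        final Map<String, ColumnFamily> columnFamilies = new LinkedHashMap<>();
        Row(String key) { this.key = key; }
    }

    /** Toy in-memory store (an assumption): row key -> (CF name -> CF). */
    static final Map<String, Map<String, ColumnFamily>> STORE = new LinkedHashMap<>();

    /** Writes one column value, creating the row and CF entries on demand. */
    static void put(String key, String cfName, String column, String value) {
        STORE.computeIfAbsent(key, k -> new LinkedHashMap<>())
             .computeIfAbsent(cfName, ColumnFamily::new)
             .columns.put(column, value);
    }

    /** Single-CF read; returns null when the key or the CF is absent. */
    static ColumnFamily readColumnFamily(String key, String cfName) {
        Map<String, ColumnFamily> cfs = STORE.get(key);
        return cfs == null ? null : cfs.get(cfName);
    }

    /** Multi-CF read built from one single-CF read per requested name. */
    static Row readRow(String key, List<String> cfNames) {
        Row row = new Row(key);
        for (String cfName : cfNames) {
            ColumnFamily cf = readColumnFamily(key, cfName);
            if (cf != null) {
                row.columnFamilies.put(cfName, cf);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        put("user42", "Profile", "name", "Ada");
        put("user42", "Activity", "lastLogin", "2009-04-22");
        Row row = readRow("user42", Arrays.asList("Profile", "Activity"));
        System.out.println(row.key + " -> " + row.columnFamilies.keySet());
    }
}
```

In this sketch the single-CF read is the only primitive, which matches the "multiple queries" option; a batch job that needs a full row would pay one lookup per column family. Whether that overhead matters in practice is exactly the question the thread leaves open.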
