incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: read on multiple SS tables
Date Thu, 06 Oct 2011 20:56:18 GMT
> -If you perform a query for a specific row key and a column name, does
> it read the most recent SSTable first and if it finds a hit, does it
> stop there or does it need to read through all the SStables (to find
> most recent one) regardless of whether it found a hit on the most
> recent SSTable or not?
It reads all SSTables, because the only way to know which column instance has the highest timestamp
is to read them all.

> -  If I perform a slice query on a column range does cassandra iterate
> all the SS tables?
All SSTables that contain any data for the row. 

(background http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/)
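
To illustrate both answers, here is a minimal sketch in Python. It is a toy model, not Cassandra's actual read path: the dict-based SSTable layout and the names read_column and read_slice are assumptions made up for this example. It only shows why a hit in the most recently written SSTable is not enough: timestamps are supplied by the client, so the newest file does not necessarily hold the newest column instance.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: a column instance is (value, timestamp), and each
# SSTable maps row key -> {column name: column instance}.
Column = Tuple[str, int]
SSTable = Dict[str, Dict[str, Column]]


def read_column(sstables: List[SSTable], row_key: str, name: str) -> Optional[Column]:
    """Return the instance of one column with the highest timestamp."""
    newest: Optional[Column] = None
    for sstable in sstables:
        row = sstable.get(row_key)
        if row is None:
            continue  # this SSTable holds no data for the row
        column = row.get(name)
        if column is not None and (newest is None or column[1] > newest[1]):
            newest = column
    return newest


def read_slice(sstables: List[SSTable], row_key: str, start: str, end: str) -> Dict[str, Column]:
    """Merge a column range across every SSTable that has data for the row."""
    merged: Dict[str, Column] = {}
    for sstable in sstables:
        row = sstable.get(row_key)
        if row is None:
            continue
        for name, column in row.items():
            if start <= name <= end:
                # Keep only the instance with the highest timestamp.
                if name not in merged or column[1] > merged[name][1]:
                    merged[name] = column
    return merged


# The last SSTable in the list stands in for the most recently written one,
# yet its COL1 carries an older timestamp than the one in the middle SSTable.
sstables: List[SSTable] = [
    {"Key1": {"COL1": ("a", 10), "COL2": ("b", 10)}},
    {"Key1": {"COL1": ("a2", 30)}},
    {"Key1": {"COL1": ("a1", 20), "COL3": ("c", 20)}},
]

print(read_column(sstables, "Key1", "COL1"))
print(read_slice(sstables, "Key1", "COL1", "COL3"))
```

Stopping at the first hit in the last SSTable would return the COL1 instance with timestamp 20, while the full scan finds the one with timestamp 30.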

> So I am wondering which option would be most efficient from read point of view.
I would go with the first option; 64MB columns will be a pain.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 7/10/2011, at 7:50 AM, Ramesh Natarajan wrote:

> Let's assume I perform frequent inserts and updates on a column family.
> Over a period of time multiple sstables will have this row/column
> data.
> I have 2 questions about how reads work in cassandra w.r.t. multiple SS tables.
> 
> -If you perform a query for a specific row key and a column name, does
> it read the most recent SSTable first and if it finds a hit, does it
> stop there or does it need to read through all the SStables (to find
> most recent one) regardless of whether it found a hit on the most
> recent SSTable or not?
> 
> -  If I perform a slice query on a column range does cassandra iterate
> all the SS tables?
> 
> We have an option to  create
> 
> 1st option:
> 
> Key1 |  COL1 | COL2 | COL3 .....  <multiple columns >
> 
> We need to perform a slice query to get  COL1-COL3 using key1.
> 
> 2nd option:
> 
> Key1 |  <COL as one column and have application place values of
> COL1-COLN in this one column>
> 
> This key would be updated several times where the app would manage
> adding multiple values to the one column key. Our max col value size
> will be less than 64mb. When you need to search for a value, we would
> read the one column and the application would manage looking up the
> appropriate value in the list of values.
> 
> So I am wondering which option would be most efficient from read point of view.
> 
> thanks
> Ramesh

