Let's say we want to read all the data that is between 10 minutes and 60 minutes old. If the data is stored from old to new in an SSTable, then we have to go over all the tombstones before we get any live column. The lazy iterators over the columns will start by yielding columns that are 60 minutes old, then 59 minutes old, and so on. They will keep yielding tombstones, and we will not find any live column until we reach the 11- or 12-minute mark. So this way we have to go over all the data and tombstones between 60 and 12 minutes (if non-deleted columns are first found at the 12-minute mark).
Whereas, if we store the data from new to old, then when we iterate over the columns we get the newer columns first, which are not tombstones, so we immediately find live columns that we can return.
But if there are fewer live columns than we want, then the way we store the data does not matter, because we have to go over all the columns from 10 to 60 minutes anyway.
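The effect above can be sketched with a small simulation. This is not Cassandra code; the ages, the live/tombstone cutoff at 12 minutes, and the column limit are assumed values for illustration, and the `scan` function stands in for a lazy column iterator that counts how many cells it touches before satisfying the query.

```python
# Hypothetical sketch: columns aged 10-60 minutes, where only the newest
# few (<= 12 minutes old) are live and the rest are tombstones.

def make_columns():
    cols = []
    for age in range(10, 61):          # ages 10..60 minutes, 51 columns
        live = age <= 12               # assumed cutoff: only newest 3 survive
        cols.append({"age": age, "live": live})
    return cols

def scan(columns, limit=3):
    """Return up to `limit` live columns, counting how many cells we touch."""
    result, scanned = [], 0
    for col in columns:
        scanned += 1
        if col["live"]:
            result.append(col)
            if len(result) == limit:
                break
    return result, scanned

cols = make_columns()

# Old-to-new on disk: iteration starts at the 60-minute-old end, so we wade
# through every tombstone (ages 60..13) before reaching a live column.
_, scanned_old_to_new = scan(sorted(cols, key=lambda c: -c["age"]))

# New-to-old on disk: the newest (live) columns come first.
_, scanned_new_to_old = scan(sorted(cols, key=lambda c: c["age"]))

print(scanned_old_to_new, scanned_new_to_old)   # 51 cells vs 3 cells
```

With these numbers the old-to-new layout touches all 51 cells to return 3 live columns, while the new-to-old layout touches only 3.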
On Wed, Jul 3, 2013 at 6:02 AM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:
> http://thelastpickle.com/2011/10/03/Reverse-Comparators/
> We loaded 5 million columns into a single row, and when accessing the first 30k and last 30k columns we saw no performance difference. We tried just loading 2 rows from the beginning and end and saw no performance difference. I am sure reverse sort is there for a reason though. In what context do you actually see a performance difference with reverse sort?
When a query does not specify a start column (and does not specify reversed), the server can simply start reading columns from the beginning of the row without having to find the right place to start. This is exactly what we can do for the Descending CF.
For the regular Ascending CF we need to specify reversed, so the server must read the row index and work out which column is the requested column count back from the end of the row.
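The two read paths can be sketched as follows. This is a simplified model, not Cassandra's actual read path: the column values and the `newest_from_ascending` helper are made up for illustration, with the tail lookup standing in for the row-index consultation a reversed read has to do.

```python
# Hypothetical sketch contrasting the two layouts for a "newest 3 columns"
# query. Integer column names stand in for timestamps (assumed values).

ascending = [1, 2, 3, 4, 5, 6, 7, 8]   # oldest -> newest on disk
descending = ascending[::-1]            # newest -> oldest on disk

# Descending CF: the newest columns sit at the front of the row, so the
# server just reads sequentially from the start. No index lookup needed.
newest_desc = descending[:3]

def newest_from_ascending(cols, count):
    # Ascending CF with reversed=True: stand-in for reading the row index,
    # seeking near the end of the row, and reading the tail backwards.
    return cols[len(cols) - count:][::-1]

newest_asc = newest_from_ascending(ascending, 3)
print(newest_desc, newest_asc)   # both [8, 7, 6]
```

Both layouts return the same answer, but the descending row answers it with a plain sequential read from offset zero, which is why there is no real comparison for this access pattern.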
There is no comparison really.