incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From aaron morton <>
Subject Re: Omitting empty columns from CQL SELECT
Date Sun, 19 Aug 2012 07:55:26 GMT
> there a situation in which that behavior would be useful?
guessing, makes life easier to client implementations and is consistent in the sense that
when doing a slice by name the server is the entity that decides which columns are in the
result set. 
I took a look at the performance of various query techniques here
and . 

If you are selecting by name on wide rows you will eventually see latency increase. This simply
has to do with the amount of data pages that must be read to satisfy the query. Note though
that this is better in 1.X . See slide 61 in the first link and "In Motion - Name Locality"
section in the second. 

Hope that helps. 

Aaron Morton
Freelance Developer

On 17/08/2012, at 10:07 PM, Mat Brown <> wrote:

> Hi Aaron,
> Thanks for the answer. That makes sense and I can see it as a formal
> reason for returning empty columns, but as a practical matter, is
> there a situation in which that behavior would be useful?
> Unfortunately a column slice won't do the trick -- the columns we're
> looking for at any given time wouldn't correspond to a particular
> range; it's essentially "random access".
> For what it's worth, I've managed to make this operation about 30x
> faster in a quick benchmark by just not selecting for specific columns
> at all, and throwing away columns I don't care about in the
> application layer instead. It's unclear whether the performance
> improvements will continue to accrue as the column family becomes more
> densely populated, though.
> Anyway, thanks again!
> Mat
> On Fri, Aug 17, 2012 at 5:06 AM, aaron morton <> wrote:
>> If you specify the columns by name in the select clause the query returns
>> them because they should be projected in the result set.
>> Can you use a column slice instead ?
>> Cheers
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> On 17/08/2012, at 11:09 AM, Mat Brown <> wrote:
>> Hello all,
>> I've noticed that when performing a SELECT statement with a list of
>> columns specified, Cassandra returns all columns in the resulting
>> row(s) even if they have no value. This creates an apparently
>> considerable amount of transport and deserialization overhead,
>> particularly in one use case I'm looking at, in which we select a
>> large collection of columns but expect only a small fraction of them
>> to contain values. Is there any way to get around this and only
>> receive columns that have values in the results?
>> Thanks,
>> Mat

View raw message