cassandra-user mailing list archives

From Henning Kropp <>
Subject AW: Re: Request Timeout with Composite Columns and CQL3
Date Thu, 28 Jun 2012 16:43:16 GMT
Well, a CQL import of the same data did not result in any issues. I was not able to rule out
hector yet, but it's more likely that the hadoop BulkOutputFormat causes the trouble.

To rule out hector I'll have to implement the import without the BulkOutputFormat as I did
with CQL.

I would like to use the BulkOutputFormat, though. Is it likely to cause the exception below? If
so, why? Can it be fixed?


On 26.06.2012 17:02, Sylvain Lebresne <> wrote:
On Tue, Jun 26, 2012 at 4:00 PM, Henning Kropp <> wrote:
> Thanks for the reply. I should have thought about looking into the log files sooner. An
> AssertionError happens at execution. I haven't figured out why yet. Any input is very much
> appreciated.
> ERROR [ReadStage:1] 2012-06-26 15:49:54,481 (line 134) Exception
> in thread Thread[ReadStage:1,5,main]
> java.lang.AssertionError: Added column does not sort as the last column
>        at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(
>        at org.apache.cassandra.db.AbstractColumnContainer.addColumn(
>        at org.apache.cassandra.db.AbstractColumnContainer.addColumn(
>        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(
>        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(
>        at org.apache.cassandra.db.CollationController.collectAllData(
>        at org.apache.cassandra.db.CollationController.getTopLevelColumns(
>        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(
>        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(
>        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(
>        at org.apache.cassandra.db.Table.getRow(
>        at org.apache.cassandra.db.SliceFromReadCommand.getRow(
>        at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(
>        at org.apache.cassandra.service.StorageProxy$
>        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
>        at java.util.concurrent.ThreadPoolExecutor$
>        at

Obviously that shouldn't happen. You didn't happen to change the
comparator for the column family or something like that from the
hector side?
Are you able to reproduce from a blank DB?
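
To make the assertion message easier to read, here is a minimal sketch of an append-only container that only accepts items sorting after the current last one. It is not Cassandra's ArrayBackedSortedColumns; the class name AppendOnlySortedSketch and the strict "must sort after" rule are assumptions made for illustration. It only shows how data arriving in an order that disagrees with the comparator (for example, written under a different comparator than the one used at read time) could trip such a check.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not the Cassandra implementation.
class AppendOnlySortedSketch<T> {
    private final Comparator<? super T> comparator;
    private final List<T> items = new ArrayList<>();

    AppendOnlySortedSketch(Comparator<? super T> comparator) {
        this.comparator = comparator;
    }

    // Accepts an item only if it sorts strictly after the current last item
    // (the strictness is an assumption of this sketch).
    void add(T item) {
        if (!items.isEmpty()) {
            T last = items.get(items.size() - 1);
            if (comparator.compare(last, item) >= 0) {
                throw new AssertionError("Added column does not sort as the last column");
            }
        }
        items.add(item);
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        AppendOnlySortedSketch<Long> columns =
                new AppendOnlySortedSketch<>(Comparator.<Long>naturalOrder());
        columns.add(1000L);
        columns.add(3000L);
        try {
            // Arrives out of order with respect to the comparator.
            columns.add(2000L);
        } catch (AssertionError e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

If the stored order and the read-time comparator agree, every add succeeds; the error only appears when the two disagree.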


> BTW: I really would love to understand why the combined comparator will not allow
> two ranges to be specified for two key parts. Obviously I still lack a profound understanding
> of Cassandra's architecture to have a clue.
> And while client-side filtering might seem like a valid option, I am still trying to get
> my head around a Cassandra data model that would allow this.
> best regards
> ________________________________________
> From: Sylvain Lebresne []
> Sent: Tuesday, 26 June 2012 10:21
> To:
> Subject: Re: Request Timeout with Composite Columns and CQL3
> On Mon, Jun 25, 2012 at 11:10 PM, Henning Kropp <> wrote:
>> Hi,
>> I am running into timeout issues using composite columns in cassandra 1.1.1
>> and cql 3.
>> My keyspace and table are defined as follows:
>> create keyspace bn_logs
>>     with strategy_options = [{replication_factor:1}]
>>     and placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';
>> CREATE TABLE logs (
>>   id text,
>>   ref text,
>>   time bigint,
>>   datum text,
>>   PRIMARY KEY(id, ref, time)
>> );
>> I import some data into the table using a combination of the thrift
>> interface and the hector Composite.class, with its serialization as the
>> column name:
>> Column col = new Column(composite.serialize());
>> This all seems to work fine until I try to execute the following query which
>> leads to a request timeout:
>> SELECT datum FROM logs WHERE id='861' and ref = 'raaf' and time > '3000';
> If it times out, the likely reason is that this query selects more data
> than the machine is able to fetch before the timeout. You can either
> add a limit to the query, or increase the timeout.
> If that doesn't seem to fix it, it might be worth checking the server
> log to see if there isn't an error.
>> I really would like to figure out, why running this query on my laptop
>> (single node, for development) will not finish. I also would like to know if
>> the following query would actually work
>> SELECT datum FROM logs WHERE id='861' and ref = 'raaf*' and time > '3000';
> It won't. You can perform the following query:
> SELECT datum FROM logs WHERE id='861' and ref = 'raaf';
> which will select every datum whose ref starts with 'raaf', but then
> you cannot restrict the time parameter, so you will also get refs where
> the time is <= 3000. Of course you can always filter client side if
> that is an option.
>> or is there another way to define a range for the second component of the
>> column key?
> As described above, you can define a range on the second component, but then you
> won't be able to restrict on the 3rd component.
>> Any thoughts?
>> Thanks in advance and kind regards
>> Henning
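
The limitation discussed in this thread can be sketched in plain Java without a Cassandra connection. The sketch assumes that columns within one row are kept sorted by (ref, time) and that a single read returns one contiguous slice of that order. The class CompositeSliceSketch, its helpers, the sample data, and the prefix bounds 'raaf' <= ref < 'raag' are all hypothetical, chosen only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory model of one row of the logs table.
class CompositeSliceSketch {
    static final class Col {
        final String ref;
        final long time;
        final String datum;

        Col(String ref, long time, String datum) {
            this.ref = ref;
            this.time = time;
            this.datum = datum;
        }
    }

    // Composite order: compare ref first, then time.
    static final Comparator<Col> ORDER =
            Comparator.comparing((Col c) -> c.ref).thenComparingLong(c -> c.time);

    static List<Col> sampleRow() {
        List<Col> row = new ArrayList<>();
        row.add(new Col("raaf", 1000L, "a"));
        row.add(new Col("raaf", 4000L, "b"));
        row.add(new Col("raafx", 2000L, "c"));
        row.add(new Col("raafx", 5000L, "d"));
        row.add(new Col("rab", 6000L, "e"));
        row.sort(ORDER);
        return row;
    }

    // "Server side" in this sketch: a range on ref alone is one contiguous
    // run of the sorted row (lowRef inclusive, highRef exclusive).
    static List<Col> sliceByRef(List<Col> sortedRow, String lowRef, String highRef) {
        List<Col> out = new ArrayList<>();
        for (Col c : sortedRow) {
            if (c.ref.compareTo(lowRef) >= 0 && c.ref.compareTo(highRef) < 0) {
                out.add(c);
            }
        }
        return out;
    }

    // "Client side": keep only columns with time > minTime. The matches are
    // scattered through the slice, so they are not one contiguous range.
    static List<String> filterByTime(List<Col> cols, long minTime) {
        List<String> out = new ArrayList<>();
        for (Col c : cols) {
            if (c.time > minTime) {
                out.add(c.datum);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Col> slice = sliceByRef(sampleRow(), "raaf", "raag");
        System.out.println(filterByTime(slice, 3000L));
    }
}
```

In the sample data the two matching columns are separated by a non-matching one in sort order, which is why a second range cannot be expressed as a single slice and has to be applied after the read.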
