The CQL scripts and the thread dump are attached.

Thanks in advance.


On Mon, Jul 1, 2013 at 7:41 PM, Mohica Jasha <mohica.jasha@gmail.com> wrote:
Hey,

I created a table with a wide row. After removing the entries and flushing the table, queries on the wide row become very slow. I am aware of the impact of tombstones, but it seems there is a deadlock that prevents the query from completing.

step by step:

1. creating the keyspace and the table:

CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
use test;
CREATE TABLE job_index (
  stage text,
  "timestamp" text,
  PRIMARY KEY (stage, "timestamp")
) WITH gc_grace_seconds=10
  AND compaction={'sstable_size_in_mb': '10', 'class': 'LeveledCompactionStrategy'};

2. insert 5000 entries into the job_index column family using the attached script (insert_1-5000.cql)
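
For reference, a minimal sketch of what insert_1-5000.cql does (the stage value 'stage1' is my assumption; the actual attached script may differ):

-- insert 5000 entries into a single partition, forming one wide row
INSERT INTO job_index (stage, "timestamp") VALUES ('stage1', '1');
INSERT INTO job_index (stage, "timestamp") VALUES ('stage1', '2');
-- ... and so on, up to ...
INSERT INTO job_index (stage, "timestamp") VALUES ('stage1', '5000');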

3. flushing the table:
nodetool flush test job_index

4. delete the 5000 entries in the wide row using the attached script (delete_1-5000.cql)
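
Again, a sketch of what delete_1-5000.cql does (same assumed stage value):

-- delete each entry individually, leaving one tombstone per row
DELETE FROM job_index WHERE stage = 'stage1' AND "timestamp" = '1';
DELETE FROM job_index WHERE stage = 'stage1' AND "timestamp" = '2';
-- ... and so on, up to ...
DELETE FROM job_index WHERE stage = 'stage1' AND "timestamp" = '5000';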

So far, queries return all the entries in the wide row in a fraction of a second.
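
For example, a query of this form still completes in a fraction of a second at this point (my own illustration, not one of the attached scripts):

cqlsh:test> SELECT * from job_index;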

5. flushing the table:
nodetool flush test job_index

6. run the following query:
cqlsh:test> SELECT * from job_index limit 1 ;
Request did not complete within rpc_timeout.

The query blocks and eventually times out.

In Cassandra's log file I see the following lines:

DEBUG [ScheduledTasks:1] 2013-07-01 19:10:39,469 GCInspector.java (line 121) GC for ParNew: 16 ms for 5 collections, 754590496 used; max is 2093809664
DEBUG [ScheduledTasks:1] 2013-07-01 19:10:40,473 GCInspector.java (line 121) GC for ParNew: 19 ms for 6 collections, 547894840 used; max is 2093809664
DEBUG [ScheduledTasks:1] 2013-07-01 19:10:41,475 GCInspector.java (line 121) GC for ParNew: 16 ms for 5 collections, 771812864 used; max is 2093809664

A few minutes later, after the compaction finishes, the problem goes away.

I am using Cassandra 1.2.6.
I tested on both Linux (CentOS) and MacOS and got the same result!

Is this a known issue?