cassandra-commits mailing list archives

From "Michael Amygdalidis (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2707) Cassandra throws an exception when querying a very large dataset.
Date Mon, 30 May 2011 15:30:47 GMT


Michael Amygdalidis commented on CASSANDRA-2707:

After numerous attempts, I've been unable to get nodetool scrub to run on the bad nodes.
It usually crashes with a "too many open files" error or a Java OOM.  (The disk load is below 50%,
so I'm not sure of the cause.)

The nodes which did NOT show the exception for the query were able to complete nodetool scrub
with no errors.
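As a side note for anyone hitting the same wall: the "too many open files" failure usually means the process file-descriptor limit is too low for the number of SSTables scrub has to touch. A quick, generic way to check the limit from Python (a sketch for diagnosis only, not specific to this cluster):

```python
import resource

# RLIMIT_NOFILE is the per-process cap on open file descriptors.
# soft is the currently enforced limit; hard is the ceiling the
# process could raise itself to without root.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open-file limit: soft=%d hard=%d" % (soft, hard))

# A low soft limit (e.g. the common default of 1024) is a plausible
# culprit when scrub walks many SSTables at once.
if soft != resource.RLIM_INFINITY and soft <= 1024:
    print("soft limit looks low for a Cassandra node")
```

On Linux the limit is typically raised for the cassandra user via limits.conf or the init script rather than from inside the process.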

> Cassandra throws an exception when querying a very large dataset.
> -----------------------------------------------------------------
>                 Key: CASSANDRA-2707
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.8 beta 1
>         Environment: Eight Cassandra instances, with a replication factor of three.
> The DB is running on EC2; all machines are in the same availability zone.
> All machines are m1.xlarge, with under 70% disk usage on the Cassandra data drive and 16 GB of RAM.
> java version "1.6.0_04"
> Java(TM) SE Runtime Environment (build 1.6.0_04-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 10.0-b19, mixed mode)
>            Reporter: Michael Amygdalidis
> Cassandra reliably throws a runtime exception (without terminating) when querying a very large dataset.
> The cluster performs fine in normal situations with data sets of around 10,000 items.
However, when querying a column family, through either fauna/cassandra or the CLI, for all
of the values matching a certain key with a limit of 100, the following exception is thrown.
> ERROR [ReadStage:126] 2011-05-25 14:14:46,260 (line 113) Fatal exception in thread Thread[ReadStage:126,5,main]
> java.lang.RuntimeException: Corrupt (negative) value length encountered
> 	at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(
> 	at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(
> 	at
> 	at
> 	at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(
> 	at org.apache.commons.collections.iterators.CollatingIterator.set(
> 	at org.apache.commons.collections.iterators.CollatingIterator.least(
> 	at
> 	at org.apache.cassandra.utils.ReducingIterator.computeNext(
> 	at
> 	at
> 	at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(
> 	at org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(
> 	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(
> 	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(
> 	at org.apache.cassandra.db.Table.getRow(
> 	at org.apache.cassandra.db.SliceFromReadCommand.getRow(
> 	at org.apache.cassandra.db.ReadVerbHandler.doVerb(
> 	at
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
> 	at java.util.concurrent.ThreadPoolExecutor$
> 	at
> Caused by: Corrupt (negative) value length encountered
> 	at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(
> 	at org.apache.cassandra.db.ColumnSerializer.deserialize(
> 	at org.apache.cassandra.db.ColumnSerializer.deserialize(
> 	at org.apache.cassandra.db.ColumnSerializer.deserialize(
> 	at org.apache.cassandra.db.ColumnSerializer.deserialize(
> 	at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(
> 	at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(
> 	... 22 more
> Additional Info:
> * I can confirm that the same exception is reliably thrown on three instances at about
the same time as the query is executed.
> * The timeout for a remote procedure call between nodes is 10 seconds, which is about
the time it takes for the query to respond with null.
> * Asking for a forward or reverse search does not affect the results; however, in production
we'd need to do a reverse search.
> Steps to Reproduce:
> Have a column family with at least 100 million values, including at least 30 million
under the same key.  Try to get 100 items for a given key from that column family.
> Expected behaviour: to get back the 100 items we queried for, which is what happens
when the number of items under a given key is not so large.  The unexpected behaviour only
manifests when the number of items under the key is extremely large.
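For context on the "Corrupt (negative) value length" message in the trace above: the column deserializer reads a length-prefixed byte value, and a negative length prefix can only come from corrupt on-disk data, so it fails fast. A minimal sketch of that check in Python (hypothetical helper name, not Cassandra's actual code, which lives in ByteBufferUtil.readWithLength):

```python
import struct
from io import BytesIO

def read_with_length(stream):
    """Read a 4-byte big-endian length prefix, then that many bytes.

    Mirrors the idea behind the check in the trace: a negative length
    prefix indicates corruption, so raise rather than read garbage.
    """
    (length,) = struct.unpack(">i", stream.read(4))
    if length < 0:
        raise RuntimeError("Corrupt (negative) value length encountered")
    return stream.read(length)

# A well-formed value round-trips cleanly...
good = BytesIO(struct.pack(">i", 5) + b"hello")
assert read_with_length(good) == b"hello"

# ...while a flipped sign bit in the prefix is caught immediately.
bad = BytesIO(struct.pack(">i", -5) + b"hello")
try:
    read_with_length(bad)
except RuntimeError as e:
    assert "negative" in str(e)
```

This is why the exception surfaces mid-read on only the replicas holding the damaged SSTable: the data for that key deserializes fine up to the corrupt block, then the length check trips.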

This message is automatically generated by JIRA.
For more information on JIRA, see:
