cassandra-user mailing list archives

From Bill Speirs <>
Subject Re: Super Slow Multi-gets
Date Thu, 10 Feb 2011 16:55:55 GMT
We attempted a compaction to see if that would improve read
performance (BTW: write performance is as expected, fast!). Here is
the result, an ArrayIndexOutOfBoundsException:

INFO 11:48:41,070 Compacting

ERROR 11:48:41,080 Fatal exception in thread
java.lang.ArrayIndexOutOfBoundsException: 7
        at org.apache.cassandra.db.marshal.TimeUUIDType.compareTimestampBytes(
        at java.util.concurrent.ConcurrentSkipListMap$ComparableUsingComparator.compareTo(
        at java.util.concurrent.ConcurrentSkipListMap.doPut(
        at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(
        at org.apache.cassandra.db.ColumnFamily.addColumn(
        at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(
        at org.apache.cassandra.utils.ReducingIterator.computeNext(
        at org.apache.commons.collections.iterators.FilterIterator.setNextObject(
        at org.apache.commons.collections.iterators.FilterIterator.hasNext(
        at org.apache.cassandra.db.CompactionManager.doCompaction(
        at org.apache.cassandra.db.CompactionManager$
        at org.apache.cassandra.db.CompactionManager$
        at java.util.concurrent.FutureTask$Sync.innerRun(
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
        at java.util.concurrent.ThreadPoolExecutor$

Does any of that mean anything to anyone?
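For what it's worth, the failing frame is TimeUUIDType comparing the timestamp bytes of two column names, and the comparator assumes every column name is a full 16-byte version-1 UUID. Below is a schematic re-implementation of that comparison (mine, not Cassandra's actual source; the class and method names are illustrative): a v1 UUID stores its timestamp as time_low (bytes 0-3), time_mid (bytes 4-5), and time_hi_and_version (bytes 6-7), so a time-ordered comparison reads bytes 6-7 first, then 4-5, then 0-3. A column name that is not a full 16-byte TimeUUID, e.g. a 7-byte value, runs off the end at index 7, which matches the "ArrayIndexOutOfBoundsException: 7" in the log:

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class TimeUuidCompare {
    // Schematic version of a TimeUUID timestamp-byte comparison.
    // Reads the high timestamp bytes (6-7) first, then the middle
    // (4-5), then the low bytes (0-3).
    public static int compareTimestampBytes(byte[] o1, byte[] o2) {
        // Mask off the version nibble (high nibble of byte 6),
        // keeping only the timestamp's high bits.
        int d = (o1[6] & 0x0F) - (o2[6] & 0x0F);
        if (d != 0) return d;
        d = (o1[7] & 0xFF) - (o2[7] & 0xFF);   // index 7: the AIOOBE site if the
        if (d != 0) return d;                  // column name is under 8 bytes
        for (int i : new int[] {4, 5, 0, 1, 2, 3}) {
            d = (o1[i] & 0xFF) - (o2[i] & 0xFF);
            if (d != 0) return d;
        }
        return 0;
    }

    // Serialize a UUID to the 16-byte form a column name would use.
    public static byte[] bytes(UUID u) {
        return ByteBuffer.allocate(16)
                .putLong(u.getMostSignificantBits())
                .putLong(u.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        // Two well-known version-1 UUIDs that differ only in time_low.
        byte[] earlier = bytes(UUID.fromString("6ba7b810-9dad-11d1-80b4-00c04fd430c8"));
        byte[] later   = bytes(UUID.fromString("6ba7b811-9dad-11d1-80b4-00c04fd430c8"));
        System.out.println(compareTimestampBytes(earlier, later) < 0); // prints true

        // A 7-byte column name that is not a TimeUUID fails at index 7,
        // just like the compaction log above.
        try {
            compareTimestampBytes("msg-001".getBytes(), earlier);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught ArrayIndexOutOfBoundsException");
        }
    }
}
```

If that is what's happening, it would point at a column name in Messages that was written under a different comparator (or corrupted) rather than at compaction itself.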



On Thu, Feb 10, 2011 at 11:00 AM, Bill Speirs <> wrote:
> I have a 7 node setup with a replication factor of 1 and a read
> consistency of 1. I have two column families: Messages which stores
> millions of rows with a UUID for the row key, DateIndex which stores
> thousands of rows with a String as the row key. I perform 2 look-ups
> for my queries:
> 1) Fetch the row from DateIndex that includes the date I'm looking
> for. This returns 1,000 columns where the column names are the UUIDs
> of the messages.
> 2) Do a multi-get (Hector client) using those 1,000 row keys I got
> from the first query.
> Query 1 is taking ~300ms to fetch 1,000 columns from a single row...
> respectable. However, query 2 is taking over 50s to perform 1,000 row
> look-ups! Also, when I scale down to 100 row look-ups for query 2, the
> time scales in a similar fashion, down to 5s.
> Am I doing something wrong here? It seems like taking 5s to look-up
> 100 rows in a distributed hash table is way too slow.
> Thoughts?
> Bill-
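The two-query pattern Bill describes can be sketched as plain Java, with in-memory maps standing in for the cluster (names like DateIndex/Messages mirror the column families; everything else here is illustrative, not Hector API):

```java
import java.util.*;

public class TwoStepLookup {
    // DateIndex: date string -> column names, which are message UUIDs.
    public static final Map<String, List<UUID>> dateIndex = new HashMap<>();
    // Messages: UUID row key -> message body.
    public static final Map<UUID, String> messages = new HashMap<>();

    public static List<String> fetchMessagesForDate(String dateKey) {
        // Query 1: one row read returns up to 1,000 column names (the UUIDs).
        List<UUID> uuids = dateIndex.getOrDefault(dateKey, List.of());
        // Query 2: a multi-get over those UUID row keys. In the real
        // cluster each key hashes to an arbitrary node, so with RF=1
        // the batch fans out across all 7 nodes.
        List<String> out = new ArrayList<>();
        for (UUID id : uuids)
            out.add(messages.get(id));
        return out;
    }

    public static void main(String[] args) {
        UUID m1 = UUID.randomUUID(), m2 = UUID.randomUUID();
        messages.put(m1, "hello");
        messages.put(m2, "world");
        dateIndex.put("2011-02-10", List.of(m1, m2));
        System.out.println(fetchMessagesForDate("2011-02-10")); // prints [hello, world]
    }
}
```

The sketch makes the shape of the cost visible: query 1 is a single row read, while query 2 is N independent row look-ups, so its latency scales with N, which matches the 5s-for-100 vs 50s-for-1,000 numbers reported.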
