incubator-cassandra-user mailing list archives

From Bill Speirs <bill.spe...@gmail.com>
Subject Re: Super Slow Multi-gets
Date Thu, 10 Feb 2011 17:53:35 GMT
Each message row is well under 1 KB, so I don't think the network is the
bottleneck. Besides, all boxes are on a fast LAN.

Bill-

On Feb 10, 2011 11:59 AM, "Utku Can Topçu" <utku@topcu.gen.tr> wrote:
> Dear Bill,
>
> How large are the rows in the Messages CF? Are they too big? Could
> bandwidth be the bottleneck?
>
> Regards,
> Utku
>
> On Thu, Feb 10, 2011 at 5:00 PM, Bill Speirs <bill.speirs@gmail.com> wrote:
>
>> I have a 7-node setup with a replication factor of 1 and a read
>> consistency of 1. I have two column families: Messages, which stores
>> millions of rows with a UUID as the row key, and DateIndex, which stores
>> thousands of rows with a String as the row key. I perform 2 look-ups
>> for my queries:
>>
>> 1) Fetch the row from DateIndex that includes the date I'm looking
>> for. This returns 1,000 columns where the column names are the UUID of
>> the messages
>> 2) Do a multi-get (Hector client) using those 1,000 row keys I got
>> from the first query.
>>
>> Query 1 is taking ~300ms to fetch 1,000 columns from a single row...
>> respectable. However, query 2 is taking over 50s to perform 1,000 row
>> look-ups! Also, when I scale down to 100 row look-ups for query 2, the
>> time scales in a similar fashion, down to 5s.
>>
>> Am I doing something wrong here? It seems like taking 5s to look-up
>> 100 rows in a distributed hash table is way too slow.
>>
>> Thoughts?
>>
>> Bill-
>>
