Dear Bill,

How large are the rows in the Messages CF? If they are too big, you might be running into network bandwidth overhead.

Regards,
Utku

On Thu, Feb 10, 2011 at 5:00 PM, Bill Speirs <bill.speirs@gmail.com> wrote:
I have a 7-node setup with a replication factor of 1 and a read
consistency of 1. I have two column families: Messages, which stores
millions of rows with a UUID as the row key, and DateIndex, which stores
thousands of rows with a String as the row key. I perform 2 look-ups
for my queries:

1) Fetch the row from DateIndex that includes the date I'm looking
for. This returns 1,000 columns where the column names are the UUIDs of
the messages.
2) Do a multi-get (Hector client) using those 1,000 row keys I got
from the first query.
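
To make the access pattern concrete, here is a minimal sketch of the two look-ups using plain in-memory dictionaries as stand-ins for the column families. This is not Hector's API: the names fetch_index_row and multiget, the "2011-02-10" row key, and the empty column values are all assumptions made up for illustration.

```python
import uuid

# Hypothetical in-memory stand-ins for the two column families.
# Messages: row key is a UUID. DateIndex: row key is a String, and the
# column names of each row are the UUIDs of the messages for that date.
message_ids = [uuid.uuid4() for _ in range(1000)]
messages = {mid: {"body": f"message {i}"} for i, mid in enumerate(message_ids)}
date_index = {"2011-02-10": {mid: b"" for mid in message_ids}}  # assumed key format


def fetch_index_row(date_key):
    """Look-up 1: return the column names (message UUIDs) of one DateIndex row."""
    return list(date_index.get(date_key, {}))


def multiget(row_keys):
    """Look-up 2: fetch many Messages rows by key, skipping keys that do not exist."""
    return {key: messages[key] for key in row_keys if key in messages}


keys = fetch_index_row("2011-02-10")
rows = multiget(keys)
print(len(keys), len(rows))
```

In the real setup, each of those 1,000 keys may live on a different node, so the second look-up is where the network and per-row read costs show up.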

Query 1 is taking ~300ms to fetch 1,000 columns from a single row...
respectable. However, query 2 is taking over 50s to perform 1,000 row
look-ups! Also, when I scale down to 100 row look-ups for query 2, the
time scales in a similar fashion, down to 5s.
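
Put as arithmetic, the numbers above work out to roughly the same cost per row at both sizes, so the total time grows linearly with the number of keys. A small sketch of that calculation follows; the 50 s and 5 s figures are the approximate timings quoted above ("over 50s" is treated as a lower bound), and per_row_ms is just a helper name chosen for illustration.

```python
def per_row_ms(total_ms, rows):
    """Average time per row (or per column) in milliseconds."""
    return total_ms / rows


# Approximate timings from the message, in milliseconds.
timings = {
    "query 1 (1,000 columns, one row)": (300, 1000),
    "query 2 (1,000 rows)": (50_000, 1000),
    "query 2 (100 rows)": (5_000, 100),
}

for label, (total_ms, count) in timings.items():
    print(f"{label}: {per_row_ms(total_ms, count):.1f} ms each")

# How many times slower a row look-up is than a column read in query 1.
slowdown = per_row_ms(50_000, 1000) / per_row_ms(300, 1000)
```

Both multi-get sizes come out at about 50 ms per row, compared with about 0.3 ms per column for the single-row read.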

Am I doing something wrong here? It seems like taking 5s to look-up
100 rows in a distributed hash table is way too slow.

Thoughts?

Bill-