cassandra-user mailing list archives

From Peter Sanford <psanf...@retailnext.net>
Subject Re: Large number of row keys in query kills cluster
Date Wed, 11 Jun 2014 23:34:47 GMT
On Wed, Jun 11, 2014 at 10:12 AM, Jeremy Jongsma <jeremy@barchart.com>
wrote:

> The big problem seems to have been requesting a large number of row keys
> combined with a large number of named columns in a query. 20K rows with 20K
> columns destroyed my cluster. Splitting it into slices of 100 sequential
> queries fixed the performance issue.
>
> When updating 20K rows at a time, I saw a different issue -
> BrokenPipeException from all nodes. Splitting into slices of 1000 fixed
> that issue.
>
> Is there any documentation on this? Obviously these limits will vary by
> cluster capacity, but for new users it would be great to know that you can
> run into problems with large queries, and how they present themselves when
> you hit them. The errors I saw are pretty opaque, and took me a couple days
> to track down.
>
>
The first thing that comes to mind is the Multiget section on the Datastax
anti-patterns page:
http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architecturePlanningAntiPatterns_c.html?scroll=concept_ds_emm_hwl_fk__multiple-gets
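
For illustration, the slicing workaround described in the quoted message could be sketched roughly as below. This is only a sketch under assumptions: `fetch_batch` is a hypothetical callable standing in for whatever multiget call your client library provides (it is not a real driver API), and the default slice size of 100 is simply the figure reported above, not a recommended limit.

```python
from itertools import islice


def chunked(keys, size):
    """Yield successive lists of at most `size` keys from any iterable."""
    it = iter(keys)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def fetch_in_slices(keys, fetch_batch, size=100):
    """Issue one request per slice, one after another, and merge the results.

    `fetch_batch` is a hypothetical placeholder: it takes a list of row keys
    and returns a dict mapping each key to its row data.
    """
    results = {}
    for batch in chunked(keys, size):
        # Each call only ever asks the cluster for `size` rows at most.
        results.update(fetch_batch(batch))
    return results
```

The same `chunked` helper could be reused on the write path with a larger slice size (the quoted message reports 1000 working for updates); the right numbers will depend on the cluster.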



-psanford
