cassandra-user mailing list archives

From Robert Coli <rc...@eventbrite.com>
Subject Re: large range read in Cassandra
Date Tue, 25 Nov 2014 01:49:04 GMT
On Mon, Nov 24, 2014 at 4:26 PM, Dan Kinder <dkinder@turnitin.com> wrote:

> We have a web crawler project currently based on Cassandra (
> https://github.com/iParadigms/walker, written in Go and using the gocql
> driver), with the following relevant usage pattern:
>
> - Big range reads over a CF to grab potentially millions of rows and
> dispatch new links to crawl
>

If you really mean millions of storage rows, this is just about the worst
case for Cassandra. The problem you're having is probably that you
shouldn't try to do this in Cassandra.

Your timeouts come either from the read actually taking longer than the
timeout, or from the reads provoking heap pressure and the resulting GC
pauses.

=Rob
