cassandra-commits mailing list archives

From "Sylvain Lebresne (Commented) (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3861) get_indexed_slices throws OOM Error when is called with too big indexClause.count
Date Wed, 08 Feb 2012 10:00:59 GMT


Sylvain Lebresne commented on CASSANDRA-3861:

bq. In your example above, the "right" thing to do from a client's perspective is to use a
limit of 10000.

Agreed, but my argument is that if 99% of queries return < 10 rows, our code is uselessly
inefficient for 99% of the queries. I'm really only talking about a performance issue.
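
To make the trade-off concrete, here is a minimal sketch, not the attached patch, of one way to avoid sizing the result list from the client-supplied count. The class name, the helper names and the cap of 1024 are assumptions made up for illustration; none of them come from the Cassandra code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not Cassandra code.
final class ResultListSizing {
    // Assumed cap on the preallocated capacity; the value is arbitrary.
    static final int MAX_INITIAL_CAPACITY = 1024;

    // Preallocate at most MAX_INITIAL_CAPACITY slots, whatever the client asked for.
    static int initialCapacity(int requestedCount) {
        return Math.max(0, Math.min(requestedCount, MAX_INITIAL_CAPACITY));
    }

    // Collect at most requestedCount rows; the list grows on demand past the cap.
    static List<String> collect(Iterable<String> matches, int requestedCount) {
        List<String> rows = new ArrayList<>(initialCapacity(requestedCount));
        for (String row : matches) {
            if (rows.size() >= requestedCount) {
                break;
            }
            rows.add(row);
        }
        return rows;
    }
}
```

With this shape, a count of Integer.MAX_VALUE costs no more up front than a count of 1024, and a query that matches 3 rows only ever holds 3 rows. The price is some array regrowth for the rare query that really does return many rows.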

bq. I guess I'd be okay with dropping that if we add a special check to return IRE for the
MAX_VALUE antipattern.

I think that forbidding the MAX_VALUE anti-pattern is a different debate, but throwing an IRE
on MAX_VALUE would be very Java-specific. For users of other languages, the same anti-pattern
would likely be to pass some huge number, but likely not MAX_VALUE exactly. The right solution
moving forward will be to do automatic paging with CQL, but in the meantime I don't see a
good way to protect people against their own mistakes that does not incur inefficiency or limitations.
> get_indexed_slices throws OOM Error when is called with too big indexClause.count
> ---------------------------------------------------------------------------------
>                 Key: CASSANDRA-3861
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API, Core
>    Affects Versions: 1.0.7
>            Reporter: Vladimir Tsanev
>            Assignee: Sylvain Lebresne
>             Fix For: 1.0.8
>         Attachments: 3861.patch
> I tried to call get_indexed_slices with Integer.MAX_VALUE as IndexClause.count. Unfortunately
the node died with OOM. In the log there is the following error:
> ERROR [Thrift:4] 2012-02-06 17:43:39,224 (line 3252) Internal error processing
> java.lang.OutOfMemoryError: Java heap space
> 	at java.util.ArrayList.<init>(
> 	at org.apache.cassandra.service.StorageProxy.scan(
> 	at org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(
> 	at org.apache.cassandra.thrift.Cassandra$Processor$get_indexed_slices.process(
> 	at org.apache.cassandra.thrift.Cassandra$Processor.process(
> 	at org.apache.cassandra.thrift.CustomTThreadPoolServer$
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
> 	at java.util.concurrent.ThreadPoolExecutor$
> 	at
> Is it necessary to allocate all the memory in advance? I only have 3 KEYS that match
my clause. I do not know the exact number, but in general I am sure that they will fit in the
> I can/will implement some calls with paging, but wanted to test and I am not happy with
the fact that the node disconnected.
> I wonder why ArrayList is used here?
> I think the result is never accessed by index (but only iterated) and the subList for
non RandomAccess Lists (for example LinkedList) will do the same job if you are not using
other operations than iteration.
> Is this related to the problem described in CASSANDRA-691?
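
The reporter's remark about iteration-only access can be illustrated with a small sketch. It assumes the result is only ever iterated, as the description suggests, and uses a hypothetical class and method name; it is not taken from StorageProxy. List.subList is part of the standard java.util.List interface and works on a LinkedList without allocating count slots up front.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical helper, for illustration only; not Cassandra code.
final class IterationOnlyResults {
    // Return a view of at most count leading rows; nothing is preallocated.
    static List<String> firstN(List<String> rows, int count) {
        return rows.subList(0, Math.min(count, rows.size()));
    }

    public static void main(String[] args) {
        List<String> keys = new LinkedList<>();
        keys.add("k1");
        keys.add("k2");
        keys.add("k3");
        // A huge count is harmless here because only existing rows are visited.
        for (String key : firstN(keys, Integer.MAX_VALUE)) {
            System.out.println(key);
        }
    }
}
```

Memory use here tracks the number of matching rows rather than the requested count. Whether a linked list or a lazily growing ArrayList is the better fit depends on how the caller consumes the result.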
