Date: Tue, 7 Feb 2012 14:42:59 +0000 (UTC)
From: "Sylvain Lebresne (Commented) (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Message-ID: <1507048279.8649.1328625779331.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <709167068.2449.1328545323960.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (CASSANDRA-3861) get_indexed_slices throws OOM Error when is called with too big indexClause.count

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202433#comment-13202433 ]

Sylvain Lebresne commented on CASSANDRA-3861:
---------------------------------------------

But what if your indexes generally return ~10 rows, but you know some very rare ones can be bigger, yet still fit reasonably in memory? Typically it's hard for you to set a limit of less than, say, 10000 (maybe the rows are skinny) for the query, but 99% of the queries will return <= 10 rows. It's then a waste of memory to always allocate these 10000 entries. Don't get me wrong, I agree that people should implement paging as soon as they have a doubt that a given index could return an unbounded number of rows, but it feels weird to arbitrarily force people to pretty much *always* implement paging, and to make bigger allocations than necessary in any case.

As for protecting people when going from testing to production, if someone does make a bad estimate of the maximum size of a given indexed row, then you can be sure their code doesn't handle paging (since they were sure the indexed row couldn't be that big), and in that case I'd rather OOM (i.e. indicating the user made a mistake) than silently return what is a wrong result (for the application).

> get_indexed_slices throws OOM Error when is called with too big indexClause.count
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3861
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API, Core
>    Affects Versions: 1.0.7
>            Reporter: Vladimir Tsanev
>            Assignee: Sylvain Lebresne
>             Fix For: 1.0.8
>
>         Attachments: 3861.patch
>
>
> I tried to call get_indexed_slices with Integer.MAX_VALUE as IndexClause.count. Unfortunately the node died with OOM.
> In the log there is the following error:
> ERROR [Thrift:4] 2012-02-06 17:43:39,224 Cassandra.java (line 3252) Internal error processing get_indexed_slices
> java.lang.OutOfMemoryError: Java heap space
> 	at java.util.ArrayList.<init>(ArrayList.java:112)
> 	at org.apache.cassandra.service.StorageProxy.scan(StorageProxy.java:1067)
> 	at org.apache.cassandra.thrift.CassandraServer.get_indexed_slices(CassandraServer.java:746)
> 	at org.apache.cassandra.thrift.Cassandra$Processor$get_indexed_slices.process(Cassandra.java:3244)
> 	at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
> 	at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:662)
> Is it necessary to allocate all the memory in advance? I only have 3 KEYS that match my clause. I do not know the exact number, but in general I am sure that they will fit in memory.
> I can/will implement some calls with paging, but wanted to test, and I am not happy with the fact that the node disconnected.
> I wonder why ArrayList is used here?
> I think the result is never accessed by index (but only iterated), and subList for non-RandomAccess Lists (for example LinkedList) will do the same job if you are not using operations other than iteration.
> Is this related to the problem described in CASSANDRA-691?
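
For illustration only, here is a minimal Java sketch of the trade-off discussed in this thread: instead of sizing the ArrayList from the client-supplied count (which is what the stack trace suggests happened with Integer.MAX_VALUE), the up-front capacity is capped and the list grows only as matches arrive. This is not the content of 3861.patch, which is not shown in this message. The class name BoundedScanSketch, the cap of 1024, and both helper methods are assumptions made up for the example, not Cassandra code or settings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; none of these names exist in Cassandra.
final class BoundedScanSketch {
    // Assumed cap on the up-front allocation (illustrative value only).
    static final int MAX_INITIAL_CAPACITY = 1024;

    // Never pre-allocate more than the cap, whatever count the client asked for.
    static int initialCapacity(int requestedCount) {
        if (requestedCount < 0) {
            throw new IllegalArgumentException("count must be non-negative");
        }
        return Math.min(requestedCount, MAX_INITIAL_CAPACITY);
    }

    // Collect at most requestedCount matches; the list grows on demand past the cap.
    static <T> List<T> collect(Iterator<T> matches, int requestedCount) {
        List<T> rows = new ArrayList<T>(initialCapacity(requestedCount));
        while (rows.size() < requestedCount && matches.hasNext()) {
            rows.add(matches.next());
        }
        return rows;
    }

    public static void main(String[] args) {
        // Three matching keys, as in the report, with count = Integer.MAX_VALUE.
        List<String> keys = Arrays.asList("k1", "k2", "k3");
        List<String> rows = collect(keys.iterator(), Integer.MAX_VALUE);
        System.out.println(rows); // prints [k1, k2, k3]
    }
}
```

With this shape, a query whose count is far larger than the number of matches only pays for the rows it actually returns, and a query that really does match an unbounded number of rows still runs out of memory rather than silently truncating, which is the behaviour argued for in the comment above.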