Date: Tue, 20 Jul 2010 13:17:35 -0700
Subject: Re: Ran into an issue where Cassandra Crashed when running out of heap space
From: Dathan Pattishall
To: user@cassandra.apache.org

The storage structure is rather simple: every key has one column and a timestamp for that column. We don't allow pulling a huge amount of data, and all other nodes are up servicing the same requests.

I suspect there may be another problem with memory management inside Cassandra. Attaching JConsole shows memory growth with odd spikes. Unfortunately I did not take a screenshot of the growth over time; I'll do that when it occurs again.

On Tue, Jul 20, 2010 at 1:05 PM, Tristan Seligmann wrote:
> On Tue, Jul 20, 2010 at 9:09 PM, Peter Schuller wrote:
>>> CassandraDaemon.java (line 83) Uncaught exception in thread
>>> Thread[pool-1-thread-37895,5,main]
>>> java.lang.OutOfMemoryError: Java heap space
>>>         at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:296)
>>>         at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:203)
>>>         at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:1116)
>>>         at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>         at java.lang.Thread.run(Thread.java:619)
>>
>> Did someone send garbage on the wrong port, causing thrift to try to
>> read some huge string in the RPC layer? There is a bug filed about
>> this upstream with thrift but I couldn't find it now.
>
> In particular, I've seen this happen when using the wrong protocol
> (framed / unframed) on the client relative to what the server is
> configured for.
> --
> mithrandi, i Ainil en-Balandor, a faer Ambar
>
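
For illustration only (this is not from the thread): a minimal sketch of how stray bytes on the Thrift port could turn into a huge allocation. It assumes the server reads the first four bytes of a message as a signed big-endian 32-bit integer and, when that value is non-negative, treats it as a string length, which is what the readMessageBegin and readStringBody frames in the trace suggest. The function name and the sample inputs are hypothetical.

```python
import struct


def claimed_length(first_four: bytes) -> int:
    """Interpret four bytes as a signed big-endian 32-bit integer."""
    (size,) = struct.unpack(">i", first_four)
    return size


# Assumption: a non-Thrift client (here, an HTTP request) hits the port.
http_garbage = b"GET "
# A versioned binary-protocol header for a call message: 0x80010001.
versioned_header = bytes([0x80, 0x01, 0x00, 0x01])

for label, data in [("HTTP garbage", http_garbage),
                    ("versioned header", versioned_header)]:
    size = claimed_length(data)
    if size < 0:
        print(f"{label}: negative value, not treated as a length")
    else:
        print(f"{label}: would be read as a string of {size} bytes")
```

Under these assumptions the bytes "GET " decode to a length of over a gigabyte, so a single misdirected request could be enough to exhaust a typical heap.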