Subject: Re: hadoop inserts blow out heap
From: Brian Jeltema
Date: Thu, 13 Sep 2012 12:59:39 -0400
To: user@cassandra.apache.org

I didn't get a response to this, so I'll give it another shot.

I tweaked some parameters and cleaned up my schema. My Hadoop/Cassandra job got further, but it still dies with an OOM error. This time, the heap dump shows a JMXConfigurableThreadPoolExecutor with a retained heap of 7.5G. I presume this means that the Hadoop job is writing to Cassandra faster than Cassandra can write to disk. Is there anything I can do to throttle the job? The Cassandra cluster is set up with default configuration values except for a reduced memtable size.

I forgot to mention that this is Cassandra 1.1.2.

Thanks in advance.

Brian

On Sep 12, 2012, at 7:52 AM, Brian Jeltema wrote:

> I'm a fairly novice Cassandra/Hadoop guy. I have written a Hadoop job (using the Cassandra/Hadoop integration API)
> that performs a full table scan and attempts to populate a new table from the results of the map/reduce. The read
> works fine and is fast, but the table insertion is failing with OOM errors (in the Cassandra VM). The resulting heap dump from one node shows that
> 2.9G of the heap is consumed by a JMXConfigurableThreadPoolExecutor that appears to be full of batch mutations.
>
> I'm using a 6-node cluster, 32G per node, 8G heap, RF=3, if any of that matters.
>
> Any suggestions would be appreciated regarding configuration changes or additional information I might
> capture to understand this problem.
>
> Thanks
>
> Brian J