From: Feng Qu
To: Cassandra User Group
Subject: Server crashed due to "OutOfMemoryError: Java heap space"
Date: Fri, 24 Feb 2012 13:14:40 -0800 (PST)
Message-ID: <1330118080.31969.YahooMailNeo@web31806.mail.mud.yahoo.com>

Hello,

We have a 6-node ring running Cassandra 0.8.6 on RHEL 6.1. The first node also runs OpsCenter Community. This node has crashed a few times recently with "OutOfMemoryError: Java heap space" while several compactions on a few 200-300 GB SSTables were running. We are using an 8 GB Java heap on a host with 96 GB of RAM.

I would appreciate help figuring out the root cause and a solution.

Feng Qu


 INFO [GossipTasks:1] 2012-02-22 13:15:59,135 Gossiper.java (line 697) InetAddress /10.89.74.67 is now dead.
 INFO [ScheduledTasks:1] 2012-02-22 13:16:12,114 StatusLogger.java (line 65) ReadStage                         0         0         0
ERROR [CompactionExecutor:10538] 2012-02-22 13:16:12,115 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[CompactionExecutor:10538,1,main]
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.io.util.BufferedRandomAccessFile.<init>(BufferedRandomAccessFile.java:123)
        at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:57)
        at org.apache.cassandra.io.sstable.SSTableReader.getDirectScanner(SSTableReader.java:664)
        at org.apache.cassandra.db.compaction.CompactionIterator.getCollatingIterator(CompactionIterator.java:92)
        at org.apache.cassandra.db.compaction.CompactionIterator.<init>(CompactionIterator.java:68)
        at org.apache.cassandra.db.compaction.CompactionManager.doCompactionWithoutSizeEstimation(CompactionManager.java:553)
        at org.apache.cassandra.db.compaction.CompactionManager.doCompaction(CompactionManager.java:507)
        at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:142)
        at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:108)
        at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
 INFO [GossipTasks:1] 2012-02-22 13:16:12,115 Gossiper.java (line 697) InetAddress /10.2.128.55 is now dead.
ERROR [Thread-734] 2012-02-22 13:16:48,189 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-734,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
        at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
        at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
ERROR [Thread-68450] 2012-02-22 13:16:48,189 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-68450,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
        at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
        at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
ERROR [Thread-731] 2012-02-22 13:16:48,189 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-731,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
        at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
        at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
ERROR [Thread-736] 2012-02-22 13:16:48,186 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-736,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
        at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
        at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:136)
ERROR [Thread-723] 2012-02-22 13:16:47,746 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-723,5,main]
java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down
        at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecutor.java:60)
        at java.util.concurrent.ThreadPoolExecutor.reject(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.ensureQueuedTaskHandled(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.execute(Unknown Source)
        at org.apache.cassandra.net.MessagingService.receive(MessagingService.java:490)