From: srmore
To: user@cassandra.apache.org
Date: Mon, 9 Sep 2013 22:15:38 -0500
Subject: Error during startup - java.lang.OutOfMemoryError: unable to create new native thread

I have a 5-node cluster with a load of around 300GB per node. One node went down and will not come back up. I see the following exception in the logs:

ERROR [main] 2013-09-09 21:50:56,117 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[main,5,main]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:640)
        at java.util.concurrent.ThreadPoolExecutor.addIfUnderCorePoolSize(ThreadPoolExecutor.java:703)
        at java.util.concurrent.ThreadPoolExecutor.prestartAllCoreThreads(ThreadPoolExecutor.java:1392)
        at org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor.<init>(JMXEnabledThreadPoolExecutor.java:77)
        at org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor.<init>(JMXEnabledThreadPoolExecutor.java:65)
        at org.apache.cassandra.concurrent.JMXConfigurableThreadPoolExecutor.<init>(JMXConfigurableThreadPoolExecutor.java:34)
        at org.apache.cassandra.concurrent.StageManager.multiThreadedConfigurableStage(StageManager.java:68)
        at org.apache.cassandra.concurrent.StageManager.<clinit>(StageManager.java:42)
        at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:344)
        at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:173)

The ulimit -u output is 515042, which is far more than the recommended value [1] (10240), and I am reluctant to set it to unlimited as suggested here [2].

Any pointers as to what the issue could be and how to get the node back up?

[1] http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html?pagename=docs&version=1.2&file=install/recommended_settings#cassandra/install/installRecommendSettings.html
[2] http://mail-archives.apache.org/mod_mbox/cassandra-user/201303.mbox/%3CCAPqEvGE474Omea1BFLJ6U_pbAkOwWxk=Dwo35_pc-ATwB4_6iA@mail.gmail.com%3E

Thanks!
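For anyone repeating the limit check described in this message, here is a minimal sketch in Python (not part of the original post). It reads the max-user-processes limit the way `ulimit -u` reports it for the current shell, and it can also parse the "Max processes" row of a Linux `/proc/<pid>/limits` file, since a daemon started by another user or by an init script may run under different limits than an interactive shell. The helper names and the choice to compare against 10240 (the figure cited from [1]) are assumptions made for illustration only.

```python
import resource

# Assumption: 10240 is the recommended value cited from [1] in the message.
RECOMMENDED_NPROC = 10240


def parse_max_processes(limits_text):
    """Return (soft, hard) as strings from the 'Max processes' row
    of the text of a Linux /proc/<pid>/limits file, or None if absent."""
    for line in limits_text.splitlines():
        if line.startswith("Max processes"):
            # Remaining columns are: soft limit, hard limit, units.
            soft, hard = line[len("Max processes"):].split()[:2]
            return soft, hard
    return None


def meets_recommendation(soft, recommended=RECOMMENDED_NPROC):
    """True if the soft limit is 'unlimited' or at least the recommended value."""
    return soft == "unlimited" or int(soft) >= recommended


def current_process_limits():
    """Limits of this Python process, inherited from the launching shell
    (the same numbers `ulimit -u` would print there)."""
    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return fmt(soft), fmt(hard)


if __name__ == "__main__":
    soft, hard = current_process_limits()
    print("soft:", soft, "hard:", hard,
          "meets recommendation:", meets_recommendation(soft))
```

Passing the contents of `/proc/<pid>/limits` for the Cassandra JVM to `parse_max_processes` would show the limit in effect for that process rather than for the shell. A value that already exceeds the recommendation, as reported above, only rules out this one limit; it does not by itself identify the cause of the error.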