I hit this issue again today, and it looks like changing the -Xss option does not work :(
I am on 1.0.11 (I know it's old; we are upgrading to 1.2.9 right now) and have about 800-900GB of data. I can see Cassandra spending a lot of time reading the data files before it quits with a "java.lang.OutOfMemoryError: unable to create new native thread" error.

My hard and soft limits seem to be OK as well.
DataStax recommends [1]:
* soft nofile 32768
* hard nofile 32768

and I have
hard    nofile 65536
soft    nofile 65536

My ulimit -u output is 515038 (which again should be sufficient).

Complete output:

ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 515038
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 515038
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
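
One thing that may be worth checking: the ulimit -a output above shows open files (-n) as 1024, not the 65536 from the limits file, and the limits a running process actually has can differ from what a login shell reports. On Linux they are listed in /proc/<pid>/limits. Below is a minimal Python sketch that parses that file. The function names are my own (hypothetical), and it assumes the usual layout where columns are separated by two or more spaces.

```python
import re


def parse_proc_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard)}.

    Values are kept as strings because they may be "unlimited".
    """
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Limit names contain single spaces, so split on runs of 2+ spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 3:
            continue
        limits[fields[0]] = (fields[1], fields[2])
    return limits


def read_limits(pid):
    """Read and parse the limits of a running process (Linux only)."""
    with open("/proc/%d/limits" % pid) as fh:
        return parse_proc_limits(fh.read())
```

Calling read_limits() with the PID of the Cassandra JVM and looking at "Max processes" and "Max open files" should show what the process is really running with. On Linux the "Max processes" limit counts threads as well as processes for the user.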




Has anyone run into this?

[1] http://www.datastax.com/docs/1.1/troubleshooting/index

On Wed, Sep 11, 2013 at 8:47 AM, srmore <comomore@gmail.com> wrote:
Thanks Viktor,


- check (cassandra-env.sh) -Xss size, you may need to increase it for your JVM;

This seems to have done the trick!

Thanks!


On Tue, Sep 10, 2013 at 12:46 AM, Viktor Jevdokimov <Viktor.Jevdokimov@adform.com> wrote:

For a start:

- check (cassandra-env.sh) -Xss size, you may need to increase it for your JVM;

- check (cassandra-env.sh) -Xms and -Xmx sizes, you may need to increase them for your data load/bloom filter/index sizes.
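
To put rough numbers on the -Xss setting: each Java thread gets its own stack of about the -Xss size, and that memory sits outside the -Xmx heap, so the total reserved for stacks grows with the thread count. A small Python sketch of the arithmetic follows. The thread count and stack sizes are made-up illustration values, not measurements from any cluster, and the function name is hypothetical.

```python
def stack_reservation_mb(thread_count, xss_kb):
    """Approximate memory reserved for thread stacks, in MB."""
    return thread_count * xss_kb / 1024.0


# Hypothetical values, for illustration only: 2000 threads at
# three different -Xss settings (in KB).
for xss_kb in (128, 256, 512):
    print(xss_kb, "KB per thread ->", stack_reservation_mb(2000, xss_kb), "MB")
```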

Best regards / Pagarbiai
Viktor Jevdokimov
Senior Developer


From: srmore [mailto:comomore@gmail.com]
Sent: Tuesday, September 10, 2013 6:16 AM
To: user@cassandra.apache.org
Subject: Error during startup - java.lang.OutOfMemoryError: unable to create new native thread

 


I have a 5-node cluster with a load of around 300GB per node. One node went down and will not come back up. I can see the following exception in the logs.

ERROR [main] 2013-09-09 21:50:56,117 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[main,5,main]
java.lang.OutOfMemoryError: unable to create new native thread
        at java.lang.Thread.start0(Native Method)
        at java.lang.Thread.start(Thread.java:640)
        at java.util.concurrent.ThreadPoolExecutor.addIfUnderCorePoolSize(ThreadPoolExecutor.java:703)
        at java.util.concurrent.ThreadPoolExecutor.prestartAllCoreThreads(ThreadPoolExecutor.java:1392)
        at org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor.<init>(JMXEnabledThreadPoolExecutor.java:77)
        at org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor.<init>(JMXEnabledThreadPoolExecutor.java:65)
        at org.apache.cassandra.concurrent.JMXConfigurableThreadPoolExecutor.<init>(JMXConfigurableThreadPoolExecutor.java:34)
        at org.apache.cassandra.concurrent.StageManager.multiThreadedConfigurableStage(StageManager.java:68)
        at org.apache.cassandra.concurrent.StageManager.<clinit>(StageManager.java:42)
        at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:344)
        at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:173)

 

The ulimit -u output is
515042

That is far more than the recommended value [1] (10240), and I am hesitant to set it to unlimited as recommended here [2].

Any pointers as to what the issue could be and how to get the node up?