incubator-cassandra-user mailing list archives

From Jon Haddad <...@jonhaddad.com>
Subject Re: Too many open files (Cassandra 2.0.1)
Date Tue, 29 Oct 2013 16:21:11 GMT
In general, my understanding is that memory-mapped files use a lot of open file handles.
We raise the open-file limit to unlimited on all our DBs.
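
To see how close a node is to its limit, the soft limit can be compared with the number of descriptors the process actually holds. Below is a minimal Python sketch, assuming a Linux host where `/proc/<pid>/limits` and `/proc/<pid>/fd` are readable; the function names are made up for illustration and are not part of Cassandra or any tool mentioned in this thread.

```python
import os


def _to_limit(value):
    # /proc/<pid>/limits prints "unlimited" for a limit that has no cap.
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Row layout: name, soft limit, hard limit, units.
            fields = line[len(label):].split()
            return _to_limit(fields[0]), _to_limit(fields[1])
    raise ValueError("no 'Max open files' row found")


def fd_usage(pid):
    """Return (open_descriptor_count, soft_limit) for a running process."""
    with open(f"/proc/{pid}/limits") as f:
        soft, _hard = parse_max_open_files(f.read())
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, soft
```

Calling `fd_usage(18332)` as root or as the cassandra user would return the pair to compare. Note that `lsof` also lists entries that are not file descriptors (for example `cwd` and `mem`), so its line count can be somewhat higher than the count from `/proc/<pid>/fd`.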

On Oct 29, 2013, at 8:30 AM, Pieter Callewaert <pieter.callewaert@be-mobile.be> wrote:

> Investigated a bit more:
>  
> -        I can reproduce it; it has already happened on several nodes during stress testing (50000 SELECTs spread over multiple threads).
> -        The "Unexpected exception in the selector loop" does not seem related to the "Too many open files" error; it just happens.
> -        It’s not socket related.
> -        Using Oracle Java(TM) SE Runtime Environment (build 1.7.0_40-b43)
> -        Using multiple data directories (maybe related ?)
>  
> I'm stuck at the moment. I don't know whether I should try DEBUG logging, because it will produce too much information.
>  
> Kind regards,
> Pieter Callewaert
>  
>    Pieter Callewaert
>    Web & IT engineer
>  
>    Web:   www.be-mobile.be
>    Email: pieter.callewaert@be-mobile.be
>    Tel:  + 32 9 330 51 80
>  
>  
> From: Pieter Callewaert [mailto:pieter.callewaert@be-mobile.be] 
> Sent: Tuesday 29 October 2013 13:40
> To: user@cassandra.apache.org
> Subject: Too many open files (Cassandra 2.0.1)
>  
> Hi,
>  
> I’ve noticed some nodes in our cluster are dying after some period of time.
>  
> WARN [New I/O server boss #17] 2013-10-29 12:22:20,725 Slf4JLogger.java (line 76) Failed to accept a connection.
> java.io.IOException: Too many open files
>         at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>         at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
>         at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100)
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
>         at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
>  
> And other exceptions related to the same cause.
> Now, as we use the Cassandra package, the nofile limit is raised to 100000.
> To double-check that this is correct:
>  
> root@de-cass09 ~ # cat /proc/18332/limits
> Limit                     Soft Limit           Hard Limit           Units
> …
> Max open files            100000               100000               files
> …
>  
> Now I check how many files are open:
> root@de-cass09 ~ # lsof -n -p 18332 | wc -l
> 100038
>  
> This seems like an awful lot for size-tiered compaction…?
> When I checked the list, I noticed that one (deleted) file appears many times:
>  
> …
> java    18332 cassandra 4704r   REG                8,1  10911921661 2147483839 /data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)
> java    18332 cassandra 4705r   REG                8,1  10911921661 2147483839 /data1/mapdata040/hos/mapdata040-hos-jb-7648-Data.db (deleted)
> …
>  
> In fact, if I count the entries for this file specifically:
> root@de-cass09 ~ # lsof -n -p 18332 | grep mapdata040-hos-jb-7648-Data.db | wc -l
> 52707
>  
> The other nodes have around 350 open files in total… Any idea why the number of open files is so high on this node?
>  
> The first exception I see is this:
> WARN [New I/O worker #8] 2013-10-29 12:09:34,440 Slf4JLogger.java (line 76) Unexpected exception in the selector loop.
> java.lang.NullPointerException
>         at sun.nio.ch.EPollArrayWrapper.setUpdateEvents(EPollArrayWrapper.java:178)
>         at sun.nio.ch.EPollArrayWrapper.add(EPollArrayWrapper.java:227)
>         at sun.nio.ch.EPollSelectorImpl.implRegister(EPollSelectorImpl.java:164)
>         at sun.nio.ch.SelectorImpl.register(SelectorImpl.java:133)
>         at java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:209)
>         at org.jboss.netty.channel.socket.nio.NioWorker$RegisterTask.run(NioWorker.java:151)
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:366)
>         at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:290)
>         at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
>         at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
>  
> Several minutes later I get Too many open files.
>  
> Specs:
> 12-node cluster with Ubuntu 12.04 LTS, Cassandra 2.0.1 (datastax packages), using JBOD of 2 disks.
> JNA enabled.
>  
> Any suggestions?
>  
> Kind regards,
> Pieter Callewaert
>  
>  
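
The per-file count in the quoted message (`lsof ... | grep ... | wc -l`) can be generalized to rank every deleted file by the number of descriptors still pointing at it. This is a rough Python sketch, assuming Linux, where the symlinks in `/proc/<pid>/fd` resolve to the path followed by ` (deleted)` once the file has been unlinked. The helper names are hypothetical.

```python
import os
from collections import Counter


def read_fd_targets(pid):
    """Return the symlink target of every descriptor in /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets


def top_deleted(targets, n=10):
    """Return the n most frequent targets that refer to deleted files."""
    counts = Counter(t for t in targets if t.endswith(" (deleted)"))
    return counts.most_common(n)
```

For example, `top_deleted(read_fd_targets(18332))` would list the deleted SSTable files that are still held open, most frequent first, which makes a single leaking file such as the one above easy to spot.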

