cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Josep M. Blanquer (JIRA)" <j...@apache.org>
Subject [jira] Created: (CASSANDRA-1495) Server doesn't seem to close SSTables correctly, and ends up with too many file descriptors open
Date Sat, 11 Sep 2010 00:46:34 GMT
Server doesn't seem to close SSTables correctly, and ends up with too many file descriptors
open
------------------------------------------------------------------------------------------------

                 Key: CASSANDRA-1495
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1495
             Project: Cassandra
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.7 beta 1
         Environment: 0.7 beta1. Built from: http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.7.0/apache-cassandra-0.7.0-beta1-src.tar.gz
using dpkg-buildpackage from it. 
            Reporter: Josep M. Blanquer
             Fix For: 0.7 beta 2


Cassandra server accumulates many open file descriptors as the time progresses. Obviously
when it hits whatever limit the system allows, the service stops accepting messages.

The exact trace is:
WARN 05:41:47,929 Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:124)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
        at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
        at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:98)
        at org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:210)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:177)
Caused by: java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
        at java.net.ServerSocket.implAccept(ServerSocket.java:453)
        at java.net.ServerSocket.accept(ServerSocket.java:421)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:119)
        ... 9 more

I've increased the ulimit to 8K from the standard 1K ... but still gets hit. So it seems that
basically there are many fd's that never get closed.
lsof shows that there are many fd's hanging on to SSTables...albeit it's only a small subset
of unique files. In my case there were only about 4 distinct SSTable files...but kept opened
hundreds of time each. 
All of these files seem to be "Data" files....no Filter or Index ones (as far as I remember)

In case it matters:  DiskAccessMode 'auto' determined to be standard, indexAccessMode is standard

The service doesn't have a high rate of writes at the moment...I have 3 nodes, 1 being receiving
mostly the thrift entry point. I see the problem being more acute on the 2 nodes that instead
of being contacted by the clients, they just get the writes from the coordinator...(I have
RF=2). 


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message