incubator-cassandra-dev mailing list archives

From Thomas Downing <tdown...@proteus-technologies.com>
Subject Re: Too many open files [was Re: Minimizing the impact of compaction on latency and throughput]
Date Wed, 14 Jul 2010 15:23:39 GMT
On 7/14/2010 11:07 AM, Jonathan Ellis wrote:
> socketexception means this is coming from the network, not the sstables
>
> knowing the full error message would be nice, but just about any
> problem on that end should be fixed by adding connection pooling to
> your client.
>
> (moving to user@)
>
> On Wed, Jul 14, 2010 at 5:09 AM, Thomas Downing
> <tdowning@proteus-technologies.com>  wrote:
>    
>> On 7/13/2010 9:20 AM, Jonathan Ellis wrote:
>>      
>>> On Tue, Jul 13, 2010 at 4:19 AM, Thomas Downing
>>> <tdowning@proteus-technologies.com>    wrote:
>>>
>>>        
>>>> On a related note:  I am running some feasibility tests looking for
>>>> high ingest rate capabilities.  While testing Cassandra the problem
>>>> I've encountered is that it runs out of file handles during compaction.
>>>>
>>>>          
>>>
[snip]
I'm not sure that is the case.

When the server gets into the unrecoverable state, the repeating exceptions
are indeed "SocketException: Too many open files".

WARN [main] 2010-07-14 06:08:46,772 TThreadPoolServer.java (line 190) Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
     at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:124)
     at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
     at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
     at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:184)
     at org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:149)
     at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:190)
Caused by: java.net.SocketException: Too many open files
     at java.net.PlainSocketImpl.socketAccept(Native Method)
     at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
     at java.net.ServerSocket.implAccept(ServerSocket.java:453)
     at java.net.ServerSocket.accept(ServerSocket.java:421)
     at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:119)
     ... 5 more

Although this is unquestionably a network error, I don't think it is
actually a network problem per se, as the maximum number of sockets
held open by the Cassandra server at this point is about 8.  When I
kill the client, the only sockets still held are the listening sockets:
none in ESTABLISHED or TIME_WAIT.
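
One way to check where the descriptors are going is to count what the
server process holds open, by kind.  The following is only a minimal
sketch, assuming a Linux host where /proc/<pid>/fd is readable; the
class name FdCensus and its method names are hypothetical and are not
part of Cassandra or Thrift.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts a process's open descriptors by kind (Linux only).
public class FdCensus {

    // Classify a /proc/<pid>/fd symlink target, e.g. "socket:[12345]" or "/data/x-Data.db".
    public static String classify(String target) {
        if (target.startsWith("socket:")) return "socket";
        if (target.startsWith("pipe:")) return "pipe";
        if (target.startsWith("anon_inode:")) return "anon_inode";
        if (target.startsWith("/")) return "file";
        return "other";
    }

    // Walk /proc/<pid>/fd and tally descriptors by kind.
    public static Map<String, Integer> census(String pid) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        Path fdDir = Paths.get("/proc", pid, "fd");
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                String kind;
                try {
                    kind = classify(Files.readSymbolicLink(fd).toString());
                } catch (IOException e) {
                    kind = "vanished"; // descriptor was closed while we were scanning
                }
                counts.merge(kind, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Pass the Cassandra server's pid; defaults to this JVM's own descriptors.
        String pid = args.length > 0 ? args[0] : "self";
        census(pid).forEach((kind, n) -> System.out.println(kind + "\t" + n));
    }
}
```

If the "file" count dwarfs the "socket" count when the exceptions start,
that would support the view that the limit is being exhausted by files
rather than by client connections.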

I was originally using the client interface provided by Hector, but
went to the direct Thrift API to eliminate moving parts in the puzzle.
When using Hector, I was using the ClientConnectionPool.  Either way,
the behavior is the same.

Just a further note:  my client test jig acquires a single connection,
then uses that connection for successive batch_mutate operations
without closing it.  It only closes the connection on an exception, or
at the end of the run.  If it would be helpful, I can change that to
open/mutate/close and repeat to see what happens.
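
To make the two client patterns concrete, here is a rough sketch of
both.  It is not the actual jig: Connection and ConnectionFactory are
hypothetical stand-ins for a Thrift client connection, and batchMutate
stands in for the batch_mutate call.

```java
import java.util.List;

// Hypothetical sketch of the two connection-handling patterns described above.
public class IngestJig {

    // Stand-in for a client connection; not a real Thrift or Hector type.
    public interface Connection extends AutoCloseable {
        void batchMutate(String batch);
        @Override
        void close();
    }

    // Stand-in for whatever opens a connection to the server.
    public interface ConnectionFactory {
        Connection open();
    }

    // Current behavior: one connection for the whole run,
    // closed only on an exception or at the end of the run.
    public static void runSingleConnection(ConnectionFactory factory, List<String> batches) {
        try (Connection conn = factory.open()) {
            for (String batch : batches) {
                conn.batchMutate(batch);
            }
        }
    }

    // Proposed variant: open, mutate, close, and repeat for every batch.
    public static void runPerBatch(ConnectionFactory factory, List<String> batches) {
        for (String batch : batches) {
            try (Connection conn = factory.open()) {
                conn.batchMutate(batch);
            }
        }
    }
}
```

In the first form the server should never see more than one client
socket from the jig; in the second it sees one short-lived socket per
batch.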

Thanks
Thomas Downing

