hbase-user mailing list archives

From Michael Dagaev <michael.dag...@gmail.com>
Subject Too many open files
Date Tue, 20 Jan 2009 11:00:13 GMT
Hi all,

It looks like one of our region servers ran out of file descriptors,
could not open an epoll instance, and shut down (see the exception below).

I read the FAQ and will increase the number of file descriptors per process.
Unfortunately, I did not understand the FAQ's explanation.
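
Before raising the limit it can help to confirm what the running process actually has. The following is a minimal sketch, assuming a Linux kernel that exposes /proc/<pid>/limits; the class name OpenFileLimit and its methods are made up for illustration and are not part of HBase or Hadoop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: reads the "Max open files" row of /proc/<pid>/limits.
class OpenFileLimit {
    static final String PREFIX = "Max open files";

    // Returns {soft, hard} as strings (a value may be "unlimited"),
    // or null if the line is not the "Max open files" row.
    static String[] parse(String line) {
        if (!line.startsWith(PREFIX)) {
            return null;
        }
        String[] fields = line.substring(PREFIX.length()).trim().split("\\s+");
        return fields.length >= 2 ? new String[] { fields[0], fields[1] } : null;
    }

    // Scans the limits file of the given pid ("self" for the current JVM).
    static String[] read(String pid) throws IOException {
        Path limits = Paths.get("/proc", pid, "limits");
        for (String line : Files.readAllLines(limits)) {
            String[] parsed = parse(line);
            if (parsed != null) {
                return parsed;
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String pid = args.length > 0 ? args[0] : "self";
        if (!Files.isReadable(Paths.get("/proc", pid, "limits"))) {
            System.err.println("/proc/" + pid + "/limits is not readable (Linux only)");
            return;
        }
        String[] limit = read(pid);
        if (limit == null) {
            System.err.println("no \"Max open files\" row found");
            return;
        }
        System.out.println("soft=" + limit[0] + " hard=" + limit[1]);
    }
}
```

Passing the region server's pid as the first argument reports the limit of that process rather than of the shell that started it, which may differ.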

As the FAQ says, region servers open 3 files per column family on average.
However, we have only ~10 column families. Besides, in our case a region
server opens mostly IPC descriptors, i.e. epolls, pipes, and TCP connections, rather than files.

Can anybody explain that?

As far as I can see, a region server holds ~150 open epolls, ~300 open pipes,
and ~150 open TCP connections to itself (port 50010).
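
For anyone who wants to reproduce counts like these, here is a rough sketch that tallies a process's descriptors by type. It assumes Linux, where each entry in /proc/<pid>/fd is a symlink whose target looks like socket:[...], pipe:[...], an eventpoll entry, or a file path. The exact spelling of the epoll target varies between kernel versions, so the match on "eventpoll" is an assumption. The class name FdCensus is hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups the open descriptors of a process by kind.
class FdCensus {
    // Maps a /proc/<pid>/fd symlink target to a coarse category.
    static String classify(String target) {
        if (target.startsWith("/")) return "file";
        if (target.startsWith("socket:")) return "socket";
        if (target.startsWith("pipe:")) return "pipe";
        if (target.contains("eventpoll")) return "epoll"; // assumed spelling
        return "other";
    }

    // Counts how many targets fall into each category.
    static Map<String, Integer> census(Iterable<String> targets) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String target : targets) {
            counts.merge(classify(target), 1, Integer::sum);
        }
        return counts;
    }

    // Reads the symlink targets under /proc/<pid>/fd.
    static List<String> readTargets(String pid) throws IOException {
        List<String> targets = new ArrayList<>();
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(Paths.get("/proc", pid, "fd"))) {
            for (Path fd : fds) {
                try {
                    targets.add(Files.readSymbolicLink(fd).toString());
                } catch (IOException e) {
                    // The descriptor was closed while we were scanning; skip it.
                }
            }
        }
        return targets;
    }

    public static void main(String[] args) throws IOException {
        String pid = args.length > 0 ? args[0] : "self";
        if (!Files.isDirectory(Paths.get("/proc", pid, "fd"))) {
            System.err.println("/proc/" + pid + "/fd not found (Linux only)");
            return;
        }
        System.out.println(census(readTargets(pid)));
    }
}
```

This only separates sockets from pipes and epolls; telling loopback connections to port 50010 apart from other sockets would need the socket inode numbers to be matched against /proc/net/tcp, which the sketch does not attempt.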

Is that OK? Why does a region server need so many IPC descriptors?
Why does it use TCP connections for local IPC? Isn't that too expensive?

Now let's say that the region server runs out of file descriptors and cannot open
a new IPC. Can it continue working using the ~600 IPCs it opened before?

Thank you for your cooperation,

P.S. The exception stack trace:
< date and time ... > WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message (Retry: 0)
java.io.IOException: Call failed on local exception
        at org.apache.hadoop.ipc.Client.call(Client.java:718)
        at org.apache.hadoop.hbase.ipc.HbaseRPC$Invoker.invoke(HbaseRPC.java:245)
        at $Proxy0.regionServerReport(Unknown Source)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:311)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Too many open files
        at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)
        at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:68)
        at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:52)
        at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18)
        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.get(SocketIOWithTimeout.java:335)
        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:250)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:150)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:123)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:272)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:499)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:441)
