hadoop-hdfs-issues mailing list archives

From "Nandakumar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12367) Ozone: Too many open files error while running corona
Date Thu, 31 Aug 2017 20:32:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149554#comment-16149554 ]

Nandakumar commented on HDFS-12367:
-----------------------------------

If corona was run on one of the datanodes, HDFS-12382 could be what caused the "Too many open
files" error. I was getting the same error; after applying the patch for HDFS-12382, the issue
was resolved in my local setup.

I also noticed the following in the output of the {{lsof}} command for the corona process:

{code}
java      9876 nvadivelu  357u     IPv4 0xc4e9fc0d68262f8d        0t0      TCP 10.200.4.230:52234->10.200.4.230:50011 (ESTABLISHED)
java      9876 nvadivelu  358      PIPE 0xc4e9fc0d5e18843d      16384          ->0xc4e9fc0d5e18afbd
java      9876 nvadivelu  359      PIPE 0xc4e9fc0d5e18afbd      16384          ->0xc4e9fc0d5e18843d
java      9876 nvadivelu  360u   KQUEUE                                        count=0, state=0xa
java      9876 nvadivelu  361      PIPE 0xc4e9fc0d5e18837d      16384          ->0xc4e9fc0d5e18a3bd
java      9876 nvadivelu  362      PIPE 0xc4e9fc0d5e1882bd      16384          ->0xc4e9fc0d5e18837d
java      9876 nvadivelu  363u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  364      PIPE 0xc4e9fc0d5e1881fd      16384          ->0xc4e9fc0d5e18813d
java      9876 nvadivelu  365      PIPE 0xc4e9fc0d5e18813d      16384          ->0xc4e9fc0d5e1881fd
java      9876 nvadivelu  366u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  367      PIPE 0xc4e9fc0d5e18807d      16384          ->0xc4e9fc0d70ba59fd
java      9876 nvadivelu  368      PIPE 0xc4e9fc0d70ba59fd      16384          ->0xc4e9fc0d5e18807d
java      9876 nvadivelu  369u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  370      PIPE 0xc4e9fc0d70ba5ffd      16384          ->0xc4e9fc0d70ba56fd
java      9876 nvadivelu  371      PIPE 0xc4e9fc0d70ba56fd      16384          ->0xc4e9fc0d70ba5ffd
java      9876 nvadivelu  372u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  373      PIPE 0xc4e9fc0d70ba5f3d      16384          ->0xc4e9fc0d70ba563d
java      9876 nvadivelu  374      PIPE 0xc4e9fc0d70ba563d      16384          ->0xc4e9fc0d70ba5f3d
java      9876 nvadivelu  375u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  376      PIPE 0xc4e9fc0d70ba4c7d      16384          ->0xc4e9fc0d70ba69bd
java      9876 nvadivelu  377      PIPE 0xc4e9fc0d70ba69bd      16384          ->0xc4e9fc0d70ba4c7d
java      9876 nvadivelu  378u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  379      PIPE 0xc4e9fc0d70ba6e3d      16384          ->0xc4e9fc0d70ba497d
java      9876 nvadivelu  380      PIPE 0xc4e9fc0d70ba497d      16384          ->0xc4e9fc0d70ba6e3d
java      9876 nvadivelu  381u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  382      PIPE 0xc4e9fc0d70ba57bd      16384          ->0xc4e9fc0d6fdd9b3d
java      9876 nvadivelu  383      PIPE 0xc4e9fc0d6fdd9b3d      16384          ->0xc4e9fc0d70ba57bd
java      9876 nvadivelu  384u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  385      PIPE 0xc4e9fc0d6fdd9d7d      16384          ->0xc4e9fc0d6fdd9efd
java      9876 nvadivelu  386      PIPE 0xc4e9fc0d6fdd9efd      16384          ->0xc4e9fc0d6fdd9d7d
java      9876 nvadivelu  387u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  388      PIPE 0xc4e9fc0d6fdd9e3d      16384          ->0xc4e9fc0d658559fd
java      9876 nvadivelu  389      PIPE 0xc4e9fc0d658559fd      16384          ->0xc4e9fc0d6fdd9e3d
java      9876 nvadivelu  390u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  391      PIPE 0xc4e9fc0d65855abd      16384          ->0xc4e9fc0d6585593d
java      9876 nvadivelu  392      PIPE 0xc4e9fc0d6585593d      16384          ->0xc4e9fc0d65855abd
java      9876 nvadivelu  393u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  394      PIPE 0xc4e9fc0d6585587d      16384          ->0xc4e9fc0d65855b7d
java      9876 nvadivelu  395      PIPE 0xc4e9fc0d65855b7d      16384          ->0xc4e9fc0d6585587d
java      9876 nvadivelu  396u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  397      PIPE 0xc4e9fc0d658557bd      16384          ->0xc4e9fc0d65855c3d
java      9876 nvadivelu  398      PIPE 0xc4e9fc0d65855c3d      16384          ->0xc4e9fc0d658557bd
java      9876 nvadivelu  399u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  400      PIPE 0xc4e9fc0d65855cfd      16384          ->0xc4e9fc0d658556fd
java      9876 nvadivelu  401      PIPE 0xc4e9fc0d658556fd      16384          ->0xc4e9fc0d65855cfd
java      9876 nvadivelu  402u   KQUEUE                                        count=0, state=0x8
java      9876 nvadivelu  403      PIPE 0xc4e9fc0d6585563d      16384          ->0xc4e9fc0d65855dbd
java      9876 nvadivelu  404      PIPE 0xc4e9fc0d65855dbd      16384          ->0xc4e9fc0d6585563d
java      9876 nvadivelu  405u   KQUEUE                                        count=0, state=0x8
....... truncated
{code}
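
The listing above repeats a group of two PIPE entries followed by one KQUEUE entry. A quick way to confirm that pattern over a full {{lsof}} dump is to tally rows by the TYPE column. The following is only a rough sketch: the class and method names are hypothetical, and it assumes the default {{lsof}} column layout, where TYPE is the fifth whitespace-separated field.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LsofTypeTally {
    /** Counts lsof rows per TYPE column (assumed to be the fifth whitespace-separated field). */
    static Map<String, Integer> tally(List<String> lsofLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lsofLines) {
            String[] fields = line.trim().split("\\s+");
            // Skip the header row and short lines such as a wrapped "(ESTABLISHED)".
            if (fields.length < 5 || fields[0].equals("COMMAND")) {
                continue;
            }
            counts.merge(fields[4], 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Four rows copied from the lsof output above.
        List<String> sample = Arrays.asList(
            "java      9876 nvadivelu  357u     IPv4 0xc4e9fc0d68262f8d        0t0      TCP 10.200.4.230:52234->10.200.4.230:50011 (ESTABLISHED)",
            "java      9876 nvadivelu  358      PIPE 0xc4e9fc0d5e18843d      16384          ->0xc4e9fc0d5e18afbd",
            "java      9876 nvadivelu  359      PIPE 0xc4e9fc0d5e18afbd      16384          ->0xc4e9fc0d5e18843d",
            "java      9876 nvadivelu  360u   KQUEUE                                        count=0, state=0xa");
        System.out.println(tally(sample)); // prints {IPv4=1, KQUEUE=1, PIPE=2}
    }
}
```

If the PIPE count stays close to twice the KQUEUE count and both keep growing while corona runs, that would be consistent with selectors being opened and not closed.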


> Ozone: Too many open files error while running corona
> -----------------------------------------------------
>
>                 Key: HDFS-12367
>                 URL: https://issues.apache.org/jira/browse/HDFS-12367
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone, tools
>            Reporter: Weiwei Yang
>            Assignee: Mukul Kumar Singh
>
> Too many open files error keeps happening to me while using corona, I have simply setup
> a single node cluster and run corona to generate 1000 keys, but I keep getting following error
> {noformat}
> ./bin/hdfs corona -numOfThreads 1 -numOfVolumes 1 -numOfBuckets 1 -numOfKeys 1000
> 17/08/28 00:47:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 17/08/28 00:47:42 INFO tools.Corona: Number of Threads: 1
> 17/08/28 00:47:42 INFO tools.Corona: Mode: offline
> 17/08/28 00:47:42 INFO tools.Corona: Number of Volumes: 1.
> 17/08/28 00:47:42 INFO tools.Corona: Number of Buckets per Volume: 1.
> 17/08/28 00:47:42 INFO tools.Corona: Number of Keys per Bucket: 1000.
> 17/08/28 00:47:42 INFO rpc.OzoneRpcClient: Creating Volume: vol-0-05000, with wwei as owner and quota set to 1152921504606846976 bytes.
> 17/08/28 00:47:42 INFO tools.Corona: Starting progress bar Thread.
> ...
> ERROR tools.Corona: Exception while adding key: key-251-19293 in bucket: bucket-0-34960 of volume: vol-0-05000.
> java.io.IOException: Exception getting XceiverClient.
> 	at org.apache.hadoop.scm.XceiverClientManager.getClient(XceiverClientManager.java:156)
> 	at org.apache.hadoop.scm.XceiverClientManager.acquireClient(XceiverClientManager.java:122)
> 	at org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.getFromKsmKeyInfo(ChunkGroupOutputStream.java:289)
> 	at org.apache.hadoop.ozone.client.rpc.OzoneRpcClient.createKey(OzoneRpcClient.java:487)
> 	at org.apache.hadoop.ozone.tools.Corona$OfflineProcessor.run(Corona.java:352)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalStateException: failed to create a child event loop
> 	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2234)
> 	at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
> 	at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4764)
> 	at org.apache.hadoop.scm.XceiverClientManager.getClient(XceiverClientManager.java:144)
> 	... 9 more
> Caused by: java.lang.IllegalStateException: failed to create a child event loop
> 	at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68)
> 	at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49)
> 	at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:61)
> 	at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:52)
> 	at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:44)
> 	at io.netty.channel.nio.NioEventLoopGroup.<init>(NioEventLoopGroup.java:36)
> 	at org.apache.hadoop.scm.XceiverClient.connect(XceiverClient.java:76)
> 	at org.apache.hadoop.scm.XceiverClientManager$2.call(XceiverClientManager.java:151)
> 	at org.apache.hadoop.scm.XceiverClientManager$2.call(XceiverClientManager.java:145)
> 	at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4767)
> 	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
> 	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
> 	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
> 	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
> 	... 12 more
> Caused by: io.netty.channel.ChannelException: failed to open a new selector
> 	at io.netty.channel.nio.NioEventLoop.openSelector(NioEventLoop.java:128)
> 	at io.netty.channel.nio.NioEventLoop.<init>(NioEventLoop.java:120)
> 	at io.netty.channel.nio.NioEventLoopGroup.newChild(NioEventLoopGroup.java:87)
> 	at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64)
> 	... 25 more
> Caused by: java.io.IOException: Too many open files
> 	at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)
> 	at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:130)
> 	at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:69)
> 	at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:36)
> 	at io.netty.channel.nio.NioEventLoop.openSelector(NioEventLoop.java:126)
> 	... 28 more
> {noformat}
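
The innermost frames of the trace above show the failure happening while a new NIO selector is being opened for a Netty event loop. As a rough JDK-only illustration (this is not Ozone or Netty code, and the class and method names are hypothetical), the sketch below opens several selectors, reports the process descriptor count where the JVM exposes it, and then closes them. A selector keeps its OS descriptors until {{close()}} is called.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

public class SelectorLifecycle {
    /** Opens n selectors; each one holds OS descriptors until it is closed. */
    static List<Selector> openSelectors(int n) {
        List<Selector> selectors = new ArrayList<>();
        try {
            for (int i = 0; i < n; i++) {
                selectors.add(Selector.open());
            }
        } catch (IOException e) {
            // This is where "Too many open files" would surface once the limit is hit.
            throw new UncheckedIOException(e);
        }
        return selectors;
    }

    /** Closes every selector, releasing the descriptors it holds. */
    static void closeAll(List<Selector> selectors) {
        for (Selector s : selectors) {
            try {
                s.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Returns the open descriptor count of this JVM, or -1 if the platform does not expose it. */
    static long openFdCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("before open:  " + openFdCount());
        List<Selector> selectors = openSelectors(10);
        System.out.println("after open:   " + openFdCount());
        closeAll(selectors);
        System.out.println("after close:  " + openFdCount());
    }
}
```

The exact numbers depend on the platform and JDK version, so none are shown here. In Netty itself, an event loop group is normally released with {{shutdownGracefully()}}; whether that is the right fix for this issue is not established by the trace alone.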



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


