hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "He Xiaoqiao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8973) NameNode exit without any exception log
Date Thu, 27 Aug 2015 12:48:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14716600#comment-14716600
] 

He Xiaoqiao commented on HDFS-8973:
-----------------------------------

Thanks Kanaka for your comments. The follow is the main GC messages before process exit in
.out file. actually GC works well as usual via following info and monitor system. at 19:35:02.508
when 'log4j:ERROR Failed to flush writer' appears no more logs output to .log file but GC
info continue print about 5mins until 19:40:10 and namenode process exit. at that time, enough
Memory space for JVM working.
{code:borderStyle=solid}
2015-08-26T19:34:30.537+0800: [GC [ParNew: 8315771K->63022K(9292032K), 0.1909130 secs]
96423904K->88172502K(133185344K), 0.1910150 secs] [Times: user=3.37 sys=0.01, real=0.19
secs] 
2015-08-26T19:34:42.296+0800: [GC [ParNew: 8322670K->71664K(9292032K), 0.2214550 secs]
96432150K->88183374K(133185344K), 0.2215720 secs] [Times: user=3.92 sys=0.01, real=0.22
secs] 
2015-08-26T19:34:52.412+0800: [GC [ParNew: 8331312K->82431K(9292032K), 0.2173850 secs]
96443022K->88195492K(133185344K), 0.2174950 secs] [Times: user=3.86 sys=0.00, real=0.22
secs] 
2015-08-26T19:35:02.508+0800: [GC [ParNew: 8342079K->101837K(9292032K), 0.1873830 secs]
96455140K->88216487K(133185344K), 0.1874800 secs] [Times: user=3.26 sys=0.02, real=0.18
secs]
log4j:ERROR Failed to flush writer,
java.io.IOException: 错误的文件描述符
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:318)
        at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
        at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
        at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
        at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
        at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
        at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
        at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
        at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
        at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.log(Category.java:856)
        at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
        at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
2015-08-26T19:35:14.959+0800: [GC [ParNew: 8361485K->93419K(9292032K), 0.1904630 secs]
96476135K->88211796K(133185344K), 0.1905540 secs] [Times: user=3.38 sys=0.00, real=0.19
secs] 
2015-08-26T19:35:25.424+0800: [GC [ParNew: 8353067K->54117K(9292032K), 0.1892230 secs]
96471444K->88174133K(133185344K), 0.1893260 secs] [Times: user=3.31 sys=0.01, real=0.19
secs] 
2015-08-26T19:35:36.512+0800: [GC [ParNew: 8313765K->55946K(9292032K), 0.1901160 secs]
96433781K->88177578K(133185344K), 0.1902050 secs] [Times: user=3.37 sys=0.01, real=0.19
secs] 
2015-08-26T19:35:50.074+0800: [GC [ParNew: 8315594K->56541K(9292032K), 0.1893860 secs]
96437226K->88179797K(133185344K), 0.1895100 secs] [Times: user=3.32 sys=0.00, real=0.19
secs] 
2015-08-26T19:36:00.882+0800: [GC [ParNew: 8316189K->65465K(9292032K), 0.2305320 secs]
96439445K->88189640K(133185344K), 0.2306270 secs] [Times: user=4.07 sys=0.01, real=0.23
secs] 
2015-08-26T19:36:11.435+0800: [GC [ParNew: 8325113K->61243K(9292032K), 0.3141090 secs]
96449288K->88186270K(133185344K), 0.3141980 secs] [Times: user=5.58 sys=0.01, real=0.32
secs] 
2015-08-26T19:36:23.630+0800: [GC [ParNew: 8320891K->58345K(9292032K), 0.2225350 secs]
96445918K->88184280K(133185344K), 0.2226280 secs] [Times: user=3.95 sys=0.00, real=0.22
secs] 
2015-08-26T19:36:35.000+0800: [GC [ParNew: 8317993K->63208K(9292032K), 0.1879290 secs]
96443928K->88189787K(133185344K), 0.1880210 secs] [Times: user=3.32 sys=0.01, real=0.19
secs] 
2015-08-26T19:36:46.379+0800: [GC [ParNew: 8322856K->59213K(9292032K), 0.1976770 secs]
96449435K->88186862K(133185344K), 0.1977800 secs] [Times: user=3.48 sys=0.02, real=0.20
secs] 
2015-08-26T19:36:59.970+0800: [GC [ParNew: 8318861K->54815K(9292032K), 0.1806880 secs]
96446510K->88183754K(133185344K), 0.1807820 secs] [Times: user=3.17 sys=0.00, real=0.18
secs] 
2015-08-26T19:37:11.402+0800: [GC [ParNew: 8314463K->47310K(9292032K), 0.1848910 secs]
96443402K->88176824K(133185344K), 0.1849980 secs] [Times: user=3.28 sys=0.00, real=0.19
secs] 
2015-08-26T19:37:23.260+0800: [GC [ParNew: 8306958K->63716K(9292032K), 0.2553960 secs]
96436472K->88193857K(133185344K), 0.2554920 secs] [Times: user=4.54 sys=0.00, real=0.25
secs] 
2015-08-26T19:37:34.873+0800: [GC [ParNew: 8323364K->65742K(9292032K), 0.2601420 secs]
96453505K->88197127K(133185344K), 0.2602340 secs] [Times: user=4.62 sys=0.00, real=0.26
secs] 
2015-08-26T19:37:43.545+0800: [GC [ParNew: 8325390K->69612K(9292032K), 0.2041830 secs]
96456775K->88202278K(133185344K), 0.2043040 secs] [Times: user=3.61 sys=0.00, real=0.20
secs] 
2015-08-26T19:37:52.130+0800: [GC [ParNew: 8329260K->86022K(9292032K), 0.2420090 secs]
96461926K->88220152K(133185344K), 0.2421120 secs] [Times: user=4.29 sys=0.00, real=0.24
secs] 
2015-08-26T19:38:02.708+0800: [GC [ParNew: 8345670K->76517K(9292032K), 0.1884070 secs]
96479800K->88211611K(133185344K), 0.1885010 secs] [Times: user=3.32 sys=0.01, real=0.19
secs] 
2015-08-26T19:38:14.937+0800: [GC [ParNew: 8336165K->69821K(9292032K), 0.1947510 secs]
96471259K->88206310K(133185344K), 0.1948470 secs] [Times: user=3.43 sys=0.01, real=0.19
secs] 
2015-08-26T19:38:28.051+0800: [GC [ParNew: 8329469K->45438K(9292032K), 0.1856000 secs]
96465958K->88185155K(133185344K), 0.1856900 secs] [Times: user=3.27 sys=0.02, real=0.18
secs] 
2015-08-26T19:38:41.417+0800: [GC [ParNew: 8305086K->57656K(9292032K), 0.1942460 secs]
96444803K->88198252K(133185344K), 0.1943400 secs] [Times: user=3.44 sys=0.01, real=0.20
secs] 
2015-08-26T19:38:51.640+0800: [GC [ParNew: 8317304K->54933K(9292032K), 0.2845790 secs]
96457900K->88196188K(133185344K), 0.2846770 secs] [Times: user=5.05 sys=0.00, real=0.29
secs] 
2015-08-26T19:39:01.810+0800: [GC [ParNew: 8314581K->54957K(9292032K), 0.2662020 secs]
96455836K->88196714K(133185344K), 0.2662970 secs] [Times: user=4.73 sys=0.00, real=0.27
secs] 
2015-08-26T19:39:11.878+0800: [GC [ParNew: 8314605K->60066K(9292032K), 0.2682030 secs]
96456362K->88202581K(133185344K), 0.2683040 secs] [Times: user=4.75 sys=0.00, real=0.27
secs] 
2015-08-26T19:39:22.638+0800: [GC [ParNew: 8319714K->55948K(9292032K), 0.3573230 secs]
96462229K->88199459K(133185344K), 0.3574180 secs] [Times: user=6.35 sys=0.00, real=0.35
secs] 
2015-08-26T19:39:33.796+0800: [GC [ParNew: 8315596K->46115K(9292032K), 0.1786880 secs]
96459107K->88190471K(133185344K), 0.1787920 secs] [Times: user=3.15 sys=0.01, real=0.18
secs] 
2015-08-26T19:39:45.379+0800: [GC [ParNew: 8305763K->53258K(9292032K), 0.1913130 secs]
96450119K->88198565K(133185344K), 0.1914090 secs] [Times: user=3.37 sys=0.01, real=0.20
secs] 
2015-08-26T19:39:59.439+0800: [GC [ParNew: 8312906K->54312K(9292032K), 0.1754920 secs]
96458213K->88200194K(133185344K), 0.1756200 secs] [Times: user=3.12 sys=0.00, real=0.18
secs] 
2015-08-26T19:40:10.774+0800: [GC [ParNew: 8313960K->57648K(9292032K), 0.2059870 secs]
96459842K->88204453K(133185344K), 0.2060790 secs] [Times: user=3.64 sys=0.01, real=0.20
secs] 
Heap
 par new generation   total 9292032K, used 3291083K [0x00007ef165000000, 0x00007ef3db270000,
0x00007ef3db270000)
  eden space 8259648K,  39% used [0x00007ef165000000, 0x00007ef22a5a6c68, 0x00007ef35d210000)
  from space 1032384K,   5% used [0x00007ef39c240000, 0x00007ef39fa8c210, 0x00007ef3db270000)
  to   space 1032384K,   0% used [0x00007ef35d210000, 0x00007ef35d210000, 0x00007ef39c240000)
 concurrent mark-sweep generation total 123893312K, used 88146805K [0x00007ef3db270000, 0x00007f1165000000,
0x00007f1165000000)
 concurrent-mark-sweep perm gen total 82408K, used 50106K [0x00007f1165000000, 0x00007f116a07a000,
0x00007f116d000000)
{code}

> NameNode exit without any exception log
> ---------------------------------------
>
>                 Key: HDFS-8973
>                 URL: https://issues.apache.org/jira/browse/HDFS-8973
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: He Xiaoqiao
>            Priority: Critical
>
> namenode process exit without any useful WARN/ERROR log, and after .log file output interrupt
.out file continue show about 5 min GC log. when .log file intertupt .out file print the follow
ERROR, it may hint some info. it seems cause by log4j ERROR.
> {code:title=namenode.out|borderStyle=solid}
> log4j:ERROR Failed to flush writer,
> java.io.IOException: 错误的文件描述符
>         at java.io.FileOutputStream.writeBytes(Native Method)
>         at java.io.FileOutputStream.write(FileOutputStream.java:318)
>         at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
>         at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
>         at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
>         at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
>         at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
>         at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:59)
>         at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:324)
>         at org.apache.log4j.RollingFileAppender.subAppend(RollingFileAppender.java:276)
>         at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
>         at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
>         at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
>         at org.apache.log4j.Category.callAppenders(Category.java:206)
>         at org.apache.log4j.Category.forcedLog(Category.java:391)
>         at org.apache.log4j.Category.log(Category.java:856)
>         at org.apache.commons.logging.impl.Log4JLogger.info(Log4JLogger.java:176)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.logAddStoredBlock(BlockManager.java:2391)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addStoredBlock(BlockManager.java:2312)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processAndHandleReportedBlock(BlockManager.java:2919)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.addBlock(BlockManager.java:2894)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processIncrementalBlockReport(BlockManager.java:2976)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:5432)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1061)
>         at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
>         at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28065)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message