hadoop-hdfs-issues mailing list archives

From "Yiqun Lin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-11142) TestLargeBlockReport#testBlockReportSucceedsWithLargerLengthLimit fails in trunk
Date Tue, 15 Nov 2016 15:44:58 GMT

     [ https://issues.apache.org/jira/browse/HDFS-11142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Yiqun Lin updated HDFS-11142:
-----------------------------
    Description: 
The test {{TestLargeBlockReport#testBlockReportSucceedsWithLargerLengthLimit}} fails in trunk.
I looked into this; it seems a long GC pause caused the DataNode to be shut down unexpectedly
while it was sending the large block report, and an NPE was then thrown in the test. The related
output log:
{code}
2016-11-15 11:31:18,889 [DataNode: [[[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data1,
[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data2]]
 heartbeating to localhost/127.0.0.1:51450] INFO  datanode.DataNode (BPServiceActor.java:blockReport(415))
- Successfully sent block report 0x2ae5dd91bec02273,  containing 2 storage report(s), of which
we sent 2. The reports had 0 total blocks and used 1 RPC(s). This took 0 msec to generate
and 49 msecs for RPC and NN processing. Got back one command: FinalizeCommand/5.
2016-11-15 11:31:18,890 [DataNode: [[[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data1,
[DISK]file:/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/2/dfs/data/data2]]
 heartbeating to localhost/127.0.0.1:51450] INFO  datanode.DataNode (BPOfferService.java:processCommandFromActive(696))
- Got finalize command for block pool BP-814229154-172.17.0.3-1479209475497
2016-11-15 11:31:24,026 [org.apache.hadoop.util.JvmPauseMonitor$Monitor@97e93f1] INFO  util.JvmPauseMonitor
(JvmPauseMonitor.java:run(205)) - Detected pause in JVM or host machine (eg GC): pause of
approximately 4936ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=4194ms
GC pool 'PS Scavenge' had collection(s): count=1 time=765ms
2016-11-15 11:31:24,026 [org.apache.hadoop.util.JvmPauseMonitor$Monitor@5a4bef8] INFO  util.JvmPauseMonitor
(JvmPauseMonitor.java:run(205)) - Detected pause in JVM or host machine (eg GC): pause of
approximately 4898ms
GC pool 'PS MarkSweep' had collection(s): count=1 time=4194ms
GC pool 'PS Scavenge' had collection(s): count=1 time=765ms
2016-11-15 11:31:24,114 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1943))
- Shutting down the Mini HDFS Cluster
2016-11-15 11:31:24,114 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdownDataNodes(1983))
- Shutting down DataNode 0
{code}
The stack trace:
{code}
java.lang.NullPointerException: null
	at org.apache.hadoop.hdfs.server.datanode.TestLargeBlockReport.testBlockReportSucceedsWithLargerLengthLimit(TestLargeBlockReport.java:97)
{code}
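For context, the JvmPauseMonitor lines in the log above come from a monitor thread that repeatedly sleeps for a short interval and treats any extra elapsed time as a pause caused by GC or the host machine. Below is a minimal, self-contained sketch of that idea, not Hadoop's actual implementation; the class and constant names here are illustrative only:

```java
// Sketch of the pause-detection idea behind JvmPauseMonitor (illustrative,
// not Hadoop's real code): sleep a fixed interval, then treat any extra
// elapsed wall-clock time as a JVM/host pause (e.g. a stop-the-world GC).
public class PauseSketch {
    static final long SLEEP_MS = 500;           // monitoring interval
    static final long WARN_THRESHOLD_MS = 1000; // pause length worth reporting

    // Returns the detected "extra" pause time, in ms, for one monitoring cycle.
    static long detectPause() {
        long start = System.nanoTime();
        try {
            Thread.sleep(SLEEP_MS);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Anything beyond the requested sleep is time the JVM was stalled.
        return Math.max(0, elapsedMs - SLEEP_MS);
    }

    public static void main(String[] args) {
        long pause = detectPause();
        if (pause > WARN_THRESHOLD_MS) {
            System.out.println("Detected pause of approximately " + pause + "ms");
        } else {
            System.out.println("No significant pause");
        }
    }
}
```

In the failing run, two pauses of roughly 4.9s were detected this way, long enough for the MiniDFSCluster teardown to race with the test and leave the reference used at TestLargeBlockReport.java:97 null.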


> TestLargeBlockReport#testBlockReportSucceedsWithLargerLengthLimit fails in trunk
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-11142
>                 URL: https://issues.apache.org/jira/browse/HDFS-11142
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: test-fails-log.txt
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
