hadoop-common-issues mailing list archives

From "Jianfei Jiang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HADOOP-15108) Testcase TestBalancer#testBalancerWithPinnedBlocks always fails
Date Mon, 11 Dec 2017 11:51:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285797#comment-16285797 ]

Jianfei Jiang edited comment on HADOOP-15108 at 12/11/17 11:50 AM:
-------------------------------------------------------------------

Has anyone run this test case successfully? If so, could you please tell me how you did it?


Judging from my debugging and the error log, there may be something wrong with the code
below. Both favoredNodes point to the local host, and the file /tmp.txt seems to run into a
lease conflict.

    DFSTestUtil.createFile(cluster.getFileSystem(0), filePath, false, 1024,
        totalUsedSpace / numOfDatanodes, DEFAULT_BLOCK_SIZE,
        (short) numOfDatanodes, 0, false, favoredNodes);

In the preliminary patch, I reduce the number of datanodes in the cluster from two to one,
and the test case then runs successfully. Only one favoredNode remains, so there is no conflict.
In my opinion, the test case still reaches its goal when it starts with only one node.
However, I am not certain about it.
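
To make the effect of the change on the createFile arguments concrete, here is a minimal
sketch. It assumes that the fifth, sixth and seventh arguments of DFSTestUtil.createFile are
the file length, the block size and the replication factor. The constant values and the helper
names blockCount and replicaCount are hypothetical and chosen only for illustration; the real
test constants may differ.

```java
public class CreateFileArgs {
    // Hypothetical values for illustration only; not taken from TestBalancer.
    static final long DEFAULT_BLOCK_SIZE = 100L;
    static final long TOTAL_USED_SPACE = 8000L;

    /** Number of blocks needed for a file of fileLen bytes (ceiling division). */
    public static long blockCount(long fileLen, long blockSize) {
        return (fileLen + blockSize - 1) / blockSize;
    }

    /** Total block replicas written when each block is stored replFactor times. */
    public static long replicaCount(long fileLen, long blockSize, int replFactor) {
        return blockCount(fileLen, blockSize) * replFactor;
    }

    public static void main(String[] args) {
        for (int numOfDatanodes = 1; numOfDatanodes <= 2; numOfDatanodes++) {
            // Mirrors the call above: fileLen = totalUsedSpace / numOfDatanodes,
            // replication factor = numOfDatanodes.
            long fileLen = TOTAL_USED_SPACE / numOfDatanodes;
            System.out.printf("datanodes=%d fileLen=%d blocks=%d replicas=%d%n",
                numOfDatanodes, fileLen,
                blockCount(fileLen, DEFAULT_BLOCK_SIZE),
                replicaCount(fileLen, DEFAULT_BLOCK_SIZE, numOfDatanodes));
        }
    }
}
```

Under these assumed values the total number of block replicas is the same for one and for two
datanodes. What changes is the file length and the write pipeline, which has one datanode per
block instead of two.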

The following is the error log:

2017-12-11 18:45:54,063 [PacketResponder: BP-197616310-127.0.1.1-1512989063241:blk_1073741827_1003,
type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[127.0.0.1:37715]] INFO  datanode.DataNode
(BlockReceiver.java:run(1497)) - PacketResponder: BP-197616310-127.0.1.1-1512989063241:blk_1073741827_1003,
type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[127.0.0.1:37715] terminating
2017-12-11 18:46:02,292 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdown(1957))
- Shutting down the Mini HDFS Cluster
2017-12-11 18:46:02,293 [DataStreamer for file /tmp.txt] WARN  hdfs.DataStreamer (DataStreamer.java:run(843))
- DataStreamer Exception
java.io.InterruptedIOException: Call interrupted
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1484)
	at org.apache.hadoop.ipc.Client.call(Client.java:1436)
	at org.apache.hadoop.ipc.Client.call(Client.java:1346)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	at com.sun.proxy.$Proxy25.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:495)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy26.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1031)
	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1882)
	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1685)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:733)
2017-12-11 18:46:02,298 [main] ERROR hdfs.DFSClient (DFSClient.java:closeAllFilesBeingWritten(602))
- Failed to close file: /tmp.txt with inode: 16386
java.io.InterruptedIOException: Call interrupted
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1484)
	at org.apache.hadoop.ipc.Client.call(Client.java:1436)
	at org.apache.hadoop.ipc.Client.call(Client.java:1346)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	at com.sun.proxy.$Proxy25.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:495)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy26.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1031)
	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1882)
	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1685)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:733)
2017-12-11 18:46:02,299 [main] INFO  hdfs.MiniDFSCluster (MiniDFSCluster.java:shutdownDataNode(2005))
- Shutting down DataNode 1
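
One possible reading of this log, together with the timeout stack trace in the issue
description (which ends in Object.wait inside DataStreamer.waitForAckedSeqno), is that the
writer waits for an acknowledgement that never arrives until the test timeout fires. The
following is a minimal, self-contained sketch of that wait pattern and not the actual
DataStreamer code. The class AckWaitSketch and its timeout parameter are hypothetical; here
the timeout stands in for the JUnit test timeout.

```java
import java.util.concurrent.TimeUnit;

public class AckWaitSketch {
    private final Object lock = new Object();
    private long lastAckedSeqno = -1;

    /** Called by a (hypothetical) responder when a packet is acknowledged. */
    public void ack(long seqno) {
        synchronized (lock) {
            lastAckedSeqno = seqno;
            lock.notifyAll();
        }
    }

    /**
     * Blocks until seqno is acknowledged or timeoutMillis elapses.
     * Returns true if acknowledged, false on timeout or interruption.
     */
    public boolean waitForAckedSeqno(long seqno, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (lastAckedSeqno < seqno) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // the acknowledgement never arrived in time
                }
                try {
                    // wait(0) would block forever, so wait at least 1 ms.
                    lock.wait(Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return false;
                }
            }
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        AckWaitSketch healthy = new AckWaitSketch();
        Thread responder = new Thread(() -> healthy.ack(5));
        responder.start();
        System.out.println("ack arrives: " + healthy.waitForAckedSeqno(5, 1000));
        responder.join();

        // No responder ever acknowledges, so the wait ends only by timeout.
        AckWaitSketch stalled = new AckWaitSketch();
        System.out.println("ack never arrives: " + stalled.waitForAckedSeqno(5, 200));
    }
}
```

If this reading is right, the "Call interrupted" exceptions above would be a consequence of the
cluster shutdown after the timeout rather than the original cause, but I have not confirmed that.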


was (Author: jiangjianfei):
Judging from my debugging and the error log, there may be something wrong with the code
below. Both favoredNodes point to the local host, and the file /tmp.txt seems to run into a
lease conflict.

    DFSTestUtil.createFile(cluster.getFileSystem(0), filePath, false, 1024,
        totalUsedSpace / numOfDatanodes, DEFAULT_BLOCK_SIZE,
        (short) numOfDatanodes, 0, false, favoredNodes);

When I change the two datanodes in the cluster to only one, as shown in my patch, the test case
runs successfully. Only one favoredNode remains, so there is no conflict. In my opinion,
the test case still reaches its goal when it starts with only one node. However,
I am not certain about it.


> Testcase TestBalancer#testBalancerWithPinnedBlocks always fails
> ---------------------------------------------------------------
>
>                 Key: HADOOP-15108
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15108
>             Project: Hadoop Common
>          Issue Type: Test
>    Affects Versions: 3.0.0-beta1
>            Reporter: Jianfei Jiang
>         Attachments: HADOOP-15108.000.patch
>
>
> When running the test cases without any code changes, testBalancerWithPinnedBlocks in
> TestBalancer.java never succeeds. I tried Ubuntu 16.04 and Red Hat 7, so the failure may not
> be related to the particular Linux environment. I am not sure whether there is a bug in this
> test case or whether my environment and settings are wrong. Could anyone give some advice?
> -------------------------------------------------------------------------------
> Test set: org.apache.hadoop.hdfs.server.balancer.TestBalancer
> -------------------------------------------------------------------------------
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 100.389 sec <<< FAILURE! - in org.apache.hadoop.hdfs.server.balancer.TestBalancer
> testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer)  Time elapsed: 100.134 sec  <<< ERROR!
> java.lang.Exception: test timed out after 100000 milliseconds
> 	at java.lang.Object.wait(Native Method)
> 	at org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:903)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:773)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:870)
> 	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:842)
> 	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
> 	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
> 	at org.apache.hadoop.hdfs.DFSTestUtil.createFile(DFSTestUtil.java:441)
> 	at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:515)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
