hadoop-hdfs-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8160) Long delays when calling hdfsOpenFile()
Date Sun, 19 Apr 2015 17:47:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502020#comment-14502020 ]

Steve Loughran commented on HDFS-8160:
--------------------------------------

From the stack trace:
{code}
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
{code}
your server at 10.40.8.10 appears to be down or unreachable from the client.
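
Each unreachable datanode costs the client a full 60s connect timeout before it is added to deadNodes and the read moves on, so the ~121s open reported below is consistent with roughly two consecutive timeouts. As a stopgap while the connectivity problem is investigated, the client-side timeout can be shortened. A minimal sketch using the libhdfs builder API, assuming {{dfs.client.socket-timeout}} (default 60000 ms) is the key behind this timeout:
{code}
#include <stdio.h>
#include "hdfs.h"

int main(void) {
    /* Assumption: dfs.client.socket-timeout is the key behind the
       60000 ms ConnectTimeoutException above; lower it so dead
       datanodes are abandoned faster. */
    struct hdfsBuilder *bld = hdfsNewBuilder();
    hdfsBuilderSetNameNode(bld, "default");  /* use fs.defaultFS from the loaded config */
    hdfsBuilderConfSetStr(bld, "dfs.client.socket-timeout", "5000");  /* 5 s instead of 60 s */

    hdfsFS fs = hdfsBuilderConnect(bld);  /* the builder is freed by this call */
    if (fs == NULL) {
        fprintf(stderr, "hdfsBuilderConnect failed\n");
        return 1;
    }
    /* ... open and read files as usual ... */
    hdfsDisconnect(fs);
    return 0;
}
{code}
Note this only shortens the stall: the client still has to be able to reach every datanode on its advertised address (here 10.40.8.10:50010) for reads to succeed.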

> Long delays when calling hdfsOpenFile()
> ---------------------------------------
>
>                 Key: HDFS-8160
>                 URL: https://issues.apache.org/jira/browse/HDFS-8160
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: libhdfs
>    Affects Versions: 2.5.2
>         Environment: 3-node Apache Hadoop 2.5.2 cluster running on Ubuntu 14.04 
> dfshealth overview:
> Security is off.
> Safemode is off.
> 8 files and directories, 9 blocks = 17 total filesystem object(s).
> Heap Memory used 45.78 MB of 90.5 MB Heap Memory. Max Heap Memory is 889 MB.
> Non Heap Memory used 36.3 MB of 70.44 MB Committed Non Heap Memory. Max Non Heap Memory is 130 MB.
> Configured Capacity:	118.02 GB
> DFS Used:	2.77 GB
> Non DFS Used:	12.19 GB
> DFS Remaining:	103.06 GB
> DFS Used%:	2.35%
> DFS Remaining%:	87.32%
> Block Pool Used:	2.77 GB
> Block Pool Used%:	2.35%
> DataNodes usages% (Min/Median/Max/stdDev): 	2.35% / 2.35% / 2.35% / 0.00%
> Live Nodes	3 (Decommissioned: 0)
> Dead Nodes	0 (Decommissioned: 0)
> Decommissioning Nodes	0
> Number of Under-Replicated Blocks	0
> Number of Blocks Pending Deletion	0
> Datanode Information
> In operation
> Node	Last contact	Admin State	Capacity	Used	Non DFS Used	Remaining	Blocks	Block pool used	Failed Volumes	Version
> hadoop252-3 (x.x.x.10:50010)	1	In Service	39.34 GB	944.85 MB	3.63 GB	34.79 GB	9	944.85 MB (2.35%)	0	2.5.2
> hadoop252-1 (x.x.x.8:50010)	0	In Service	39.34 GB	944.85 MB	4.94 GB	33.48 GB	9	944.85 MB (2.35%)	0	2.5.2
> hadoop252-2 (x.x.x.9:50010)	1	In Service	39.34 GB	944.85 MB	3.63 GB	34.79 GB	9	944.85 MB (2.35%)	0	2.5.2
> java version "1.7.0_76"
> Java(TM) SE Runtime Environment (build 1.7.0_76-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode)
>            Reporter: Rod
>
> Calling hdfsOpenFile() on a file residing on the target 3-node Hadoop cluster (described in detail in the Environment section) blocks for a long time (several minutes). I've noticed that the delay is related to the size of the target file.
> For example, attempting to hdfsOpenFile() a file of 852483361 bytes took 121 seconds, but a file of 15458 bytes took less than a second.
> Also, during the long delay, the following stacktrace is routed to standard out:
> 2015-04-16 10:32:13,943 WARN  [main] hdfs.BlockReaderFactory (BlockReaderFactory.java:getRemoteBlockReaderFromTcp(693)) - I/O error constructing remote block reader.
> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
> 	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
> 	at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
> 	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
> 	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
> 	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:854)
> 	at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:143)
> 2015-04-16 10:32:13,946 WARN  [main] hdfs.DFSClient (DFSInputStream.java:blockSeekTo(612)) - Failed to connect to /10.40.8.10:50010 for block, add to deadNodes and continue. org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
> 	at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
> 	at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
> 	at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
> 	at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
> 	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
> 	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:854)
> 	at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:143)
> I have also seen similar delays and the same stack trace when executing HDFS CLI commands against those same files (hdfs dfs -cat, hdfs dfs -tail, etc.).
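
For reference, a minimal libhdfs sketch that reproduces the timing measurement described above; the namenode address and file path are placeholders, not values from this report. The timer covers both hdfsOpenFile() and the first hdfsRead(), since the datanode connections where the 60s timeouts occur may not be made until data is actually read:
{code}
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include "hdfs.h"

int main(void) {
    /* Placeholder namenode; substitute the cluster's fs.defaultFS host and port. */
    hdfsFS fs = hdfsConnect("namenode.example.com", 8020);
    if (fs == NULL) {
        fprintf(stderr, "hdfsConnect failed\n");
        return 1;
    }

    time_t start = time(NULL);
    /* The three 0s request the configured defaults for buffer size,
       replication, and block size. "/tmp/bigfile" is a placeholder path. */
    hdfsFile f = hdfsOpenFile(fs, "/tmp/bigfile", O_RDONLY, 0, 0, 0);
    char buf[4096];
    tSize n = (f != NULL) ? hdfsRead(fs, f, buf, sizeof(buf)) : -1;
    printf("open + first read returned %d bytes after %ld s\n",
           (int)n, (long)(time(NULL) - start));

    if (f != NULL) hdfsCloseFile(fs, f);
    hdfsDisconnect(fs);
    return 0;
}
{code}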



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
