hadoop-hdfs-issues mailing list archives

From "Cody Saunders (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-693) java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write exceptions were cast when trying to read file via StreamFile.
Date Mon, 02 Aug 2010 18:09:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894627#action_12894627 ]

Cody Saunders commented on HDFS-693:
------------------------------------

I just wanted to reiterate the statement above about the portions of code where the configured
value has the WRITE_TIMEOUT_EXTENSION constant * (number of nodes) added to it.

Whether using '0' to get around write-timeout problems is bad practice is probably the first
question. If it is, is that documented somewhere? If it is not, then logic like the code I've
pointed out above breaks the idea of an "infinite" wait, because adding the per-node extension
to a configured timeout of 0 yields a finite timeout again (see the worked example below).

I ran into timeout conditions this time, starting with this exception (the first log line is truncated at the start):

50010-1267539292546, infoPort=50075, ipcPort=50020):Exception writing block blk_3120944928137673159_2109400 to mirror 192.168.130.94:50010
java.net.SocketException: Broken pipe
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:401)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:524)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:357)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:103)
    at java.lang.Thread.run(Thread.java:619)


When I look at BlockReceiver.java:401, I see that the write goes to mirrorOut, which is defined
in DataXceiver before the call to receiveBlock, as:

(DataXceiver.java, line 285)
    mirrorOut = new DataOutputStream(
        new BufferedOutputStream(
            NetUtils.getOutputStream(mirrorSock, writeTimeout),
            SMALL_BUFFER_SIZE));

with writeTimeout from line 280:

    int writeTimeout = datanode.socketWriteTimeout +
                       (HdfsConstants.WRITE_TIMEOUT_EXTENSION * numTargets);
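
To make that interaction concrete, here is a small standalone sketch of the arithmetic (my own
illustration, not DataNode code; it assumes WRITE_TIMEOUT_EXTENSION is 5000 ms and the default
write timeout is the 480000 ms mentioned in the subject line):

    // Sketch: how the composed write timeout behaves when the configured
    // base timeout (dfs.datanode.socket.write.timeout) is 0, i.e. "wait forever".
    public class WriteTimeoutSketch {
        // Assumed value for HdfsConstants.WRITE_TIMEOUT_EXTENSION (5 seconds).
        static final int WRITE_TIMEOUT_EXTENSION = 5 * 1000;

        // Same shape as the DataXceiver line 280 expression quoted above.
        static int composedWriteTimeout(int socketWriteTimeout, int numTargets) {
            return socketWriteTimeout + (WRITE_TIMEOUT_EXTENSION * numTargets);
        }

        public static void main(String[] args) {
            // A configured 0 ("never time out") plus two downstream targets
            // becomes 10000 ms -- a finite timeout, not an infinite wait.
            System.out.println(composedWriteTimeout(0, 2));      // 10000
            // With the default of 480000 ms the extension is merely additive.
            System.out.println(composedWriteTimeout(480000, 2)); // 490000
        }
    }

Whether '0' then survives as "no timeout" depends entirely on how that composed, nonzero value is
treated downstream, which is exactly the concern above.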


In my very slow VM environment I have avoided almost every timeout (apart from occasional
read-side timeouts) by setting dfs.datanode.socket.write.timeout to something like 1000000, but
I had not yet tried this in production, where it is still zero, and that is where I received the
timeout complaint.
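
For reference, the workaround I used in the VM environment is just the standard hdfs-site.xml
property (the 1000000 ms figure is only what I happened to pick, not a recommendation):

    <!-- hdfs-site.xml: raise the DataNode socket write timeout (milliseconds)
         instead of relying on 0 meaning "infinite". -->
    <property>
        <name>dfs.datanode.socket.write.timeout</name>
        <value>1000000</value>
    </property>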

At the time, I was writing about 1M records per hour from roughly 8 different HBase clients.

> java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write exceptions were cast when trying to read file via StreamFile.
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-693
>                 URL: https://issues.apache.org/jira/browse/HDFS-693
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20.1
>            Reporter: Yajun Dong
>         Attachments: HDFS-693.log
>
>
> To exclude the possibility of a network problem, I found that the dataXceiver count is about 30. Also, I could see that the output of netstat -a | grep 50075 showed many connections in TIME_WAIT status when this happened.
> Partial log in attachment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

