hadoop-hdfs-issues mailing list archives

From "Yongjun Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3342) SocketTimeoutException in BlockSender.sendChunks could have a better error message
Date Tue, 21 Oct 2014 00:19:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14177704#comment-14177704 ]

Yongjun Zhang commented on HDFS-3342:
-------------------------------------

Hi [~andrew.wang],

Thanks a lot for the review and comments!

Good catch. Indeed, if a user sets the log level to WARN, then the new message I added
won't be seen.
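
To illustrate the point, here is a minimal sketch of how a WARN threshold hides a lower-level message. It uses java.util.logging as a stand-in for the datanode's real logging setup, and it assumes the new message sits at INFO, as the issue description suggests. The class name, helper name, and message texts are hypothetical, not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class LogLevelDemo {
    /** Collects the records that pass the logger's level check. */
    static final class CapturingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override
        public void publish(LogRecord record) {
            messages.add(record.getLevel() + ": " + record.getMessage());
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    /** Logs one INFO and one WARNING message and returns what got through. */
    static List<String> logAt(Level threshold) {
        Logger log = Logger.getAnonymousLogger();
        log.setUseParentHandlers(false); // keep the demo output out of the console handler
        log.setLevel(threshold);
        CapturingHandler handler = new CapturingHandler();
        log.addHandler(handler);

        // Hypothetical message texts, for illustration only.
        log.info("Likely the client has stopped reading, disconnecting it");
        log.warning("exception while serving block");
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println("threshold INFO:    " + logAt(Level.INFO));
        System.out.println("threshold WARNING: " + logAt(Level.WARNING));
    }
}
```

With the threshold at WARNING, only the second message is captured, which is the gap pointed out in the review.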

The "WARN" message was there before I made this change, and it's intended to report the stack
trace of all IOExceptions. The new message I added tries to say "Likely the client has stopped
reading..".  I guess there may be other causes of SocketTimeoutException than the one we are
dealing with here, and I was worried that taking out the WARN message would cause some of
those other cases to go unreported.  That's why I used the word "Likely".

To address your comment, I added a similar statement to the WARN message and uploaded a new
rev (002), so a similar message will be printed at the WARN log level. Please let me know
whether it looks good to you.
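
As a rough sketch of the idea (not the code in the 002 patch), the WARN text can carry the hint only when the IOException is a SocketTimeoutException, while every other IOException is still reported as before. The class name, helper name, block name, and exact wording below are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.SocketTimeoutException;

public class SendErrorMessages {
    // Hypothetical wording; the patch may phrase this differently.
    static final String HINT = "Likely the client has stopped reading, disconnecting it";

    /** Builds the WARN text; a hypothetical helper, not the actual BlockSender code. */
    static String warnMessage(String blockName, IOException e) {
        String base = "exception while serving " + blockName + ": " + e;
        if (e instanceof SocketTimeoutException) {
            // A timeout is only *likely* caused by a stalled client, so the
            // original exception text is kept alongside the hint.
            return HINT + " (" + base + ")";
        }
        return base;
    }

    public static void main(String[] args) {
        IOException timeout = new SocketTimeoutException(
            "480000 millis timeout while waiting for channel to be ready for write");
        System.out.println(warnMessage("blk_1", timeout));
        System.out.println(warnMessage("blk_1", new IOException("Broken pipe")));
    }
}
```

Keeping the exception text in both branches means other timeout causes are still reported, which is the concern described above.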

Thanks again.


> SocketTimeoutException in BlockSender.sendChunks could have a better error message
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-3342
>                 URL: https://issues.apache.org/jira/browse/HDFS-3342
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Todd Lipcon
>            Assignee: Yongjun Zhang
>            Priority: Minor
>              Labels: supportability
>         Attachments: HDFS-3342.001.patch, HDFS-3342.002.patch
>
>
> Currently, if a client connects to a DN and begins to read a block, but then stops calling
read() for a long period of time, the DN will log a SocketTimeoutException "480000 millis
timeout while waiting for channel to be ready for write." This is because there is no "keepalive"
functionality of any kind. At a minimum, we should improve this error message to be an INFO
level log which just says that the client likely stopped reading, so disconnecting it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
