Date: Thu, 3 May 2012 18:50:49 +0000 (UTC)
From: "Todd Lipcon (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <1546686765.23122.1336071049098.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <819749741.20485.1336022178187.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HDFS-3357) DataXceiver reads from client socket with incorrect/no timeout

     [ https://issues.apache.org/jira/browse/HDFS-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-3357:
------------------------------

    Attachment: hdfs-3357.txt

> DataXceiver reads from client socket with incorrect/no timeout
> --------------------------------------------------------------
>
>                 Key: HDFS-3357
>                 URL: https://issues.apache.org/jira/browse/HDFS-3357
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 1.0.2, 2.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: hdfs-3357.txt
>
>
> In DataXceiver, we currently use Socket.setSoTimeout to try to manage the read timeout when switching between reading the initial opcode, reading a keepalive opcode, and reading the status after a successfully sent block. However, since all of these reads use the same underlying DataInputStream, the change to the socket timeout isn't respected. Thus, they all occur with whatever timeout is set on the socket at the time of DataXceiver construction. In practice this turns out to be 0, which means no timeout at all and can leave xceivers hung indefinitely.
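The failure mode described in the issue can be sketched in isolation. The example below is only an illustration under stated assumptions, not the DataXceiver code and not the attached patch: `SnapshotStream` is a hypothetical stand-in for a stream wrapper that copies the socket timeout once when it is constructed, and `demo` is a hypothetical helper. It shows why a later `Socket.setSoTimeout` call would not reach reads made through such a wrapper.

```java
import java.io.IOException;
import java.net.Socket;

public class TimeoutSnapshotSketch {

    /**
     * Hypothetical stream wrapper (not a Hadoop class): it reads the socket
     * timeout once, at construction, and never consults the socket again.
     */
    static final class SnapshotStream {
        private final int timeoutMillis;

        SnapshotStream(Socket socket) throws IOException {
            this.timeoutMillis = socket.getSoTimeout();
        }

        /** Timeout this wrapper would apply to every read it performs. */
        int effectiveTimeoutMillis() {
            return timeoutMillis;
        }
    }

    /** Returns {timeout reported by the socket, timeout held by the wrapper}. */
    static int[] demo(int laterTimeoutMillis) throws IOException {
        // An unconnected socket is enough here; SO_TIMEOUT defaults to 0 (no timeout).
        try (Socket socket = new Socket()) {
            SnapshotStream in = new SnapshotStream(socket);

            // Later change, analogous to adjusting the timeout between read phases.
            socket.setSoTimeout(laterTimeoutMillis);

            return new int[] { socket.getSoTimeout(), in.effectiveTimeoutMillis() };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] result = demo(60_000);
        System.out.println("socket timeout (ms):  " + result[0]);
        System.out.println("wrapper timeout (ms): " + result[1]);
    }
}
```

In this sketch the socket reports the new value while the wrapper keeps the value it captured at construction. Under the same assumption, a fix would need either to build the stream after the desired timeout is known or to give the wrapper its own way to update the timeout.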