Message-ID: <1664404254.1207635324645.JavaMail.jira@brutus>
Date: Mon, 7 Apr 2008 23:15:24 -0700 (PDT)
From: "Runping Qi (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Subject: [jira] Commented: (HADOOP-3124) DFS data node should not use hard coded 10 minutes as write timeout.
In-Reply-To: <1506253783.1206736104348.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/HADOOP-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586669#action_12586669 ]

Runping Qi commented on HADOOP-3124:
------------------------------------

There seems to be at least one place where the constant is still used as the timeout value:

{code}
@@ -848,7 +859,7 @@
   /* utility function for sending a respose */
   private static void sendResponse(Socket s, short opStatus) throws IOException {
     DataOutputStream reply =
-      new DataOutputStream(new SocketOutputStream(s, WRITE_TIMEOUT));
+      new DataOutputStream(NetUtils.getOutputStream(s, WRITE_TIMEOUT));
{code}

Is this intended?

> DFS data node should not use hard coded 10 minutes as write timeout.
> --------------------------------------------------------------------
>
>              Key: HADOOP-3124
>              URL: https://issues.apache.org/jira/browse/HADOOP-3124
>          Project: Hadoop Core
>       Issue Type: Bug
>       Components: dfs
> Affects Versions: 0.17.0
>        Reporter: Runping Qi
>        Assignee: Raghu Angadi
>     Attachments: HADOOP-3124.patch
>
>
> This problem happens in the 0.17 trunk.
> I saw reducers wait 10 minutes for writing data to DFS and get a timeout.
> The client retried and timed out after another 19 minutes.
> After looking into the code, it seems that the DFS data node uses 10 minutes as the timeout for writing data into the data node pipeline.
> I think we have three issues:
> 1. The 10-minute timeout value is too big for writing a chunk of data (64K) through the data node pipeline.
> 2. The timeout value should not be hard coded.
> 3. Different datanodes in a pipeline should use different timeout values for writing to the downstream.
> A reasonable one may be (20 secs * numOfDataNodesInTheDownStreamPipe).
> For example, if the replication factor is 3, the client uses 60 secs, the first data node uses 40 secs, and the second datanode uses 20 secs.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
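The per-position timeout scheme proposed in the issue can be sketched in Java. This is a minimal illustration of the arithmetic only, assuming the suggested 20-second increment per downstream node; the class, method, and constant names here are invented for the sketch and are not the actual Hadoop API.

```java
// Sketch of the timeout scheme proposed in HADOOP-3124: each writer in the
// pipeline scales its write timeout by how many datanodes are downstream of it.
// All names below are hypothetical, not Hadoop's real identifiers.
public class WriteTimeoutSketch {

    // Suggested base increment: 20 seconds per downstream datanode.
    static final long WRITE_TIMEOUT_INCREMENT_MS = 20_000L;

    /**
     * Write timeout for a node (or the client) that has the given number of
     * datanodes downstream of it in the pipeline.
     */
    static long writeTimeout(int numNodesDownstream) {
        return WRITE_TIMEOUT_INCREMENT_MS * numNodesDownstream;
    }

    public static void main(String[] args) {
        // Replication factor 3: the client has 3 nodes downstream, the first
        // datanode has 2, and the second datanode has 1 (the example above).
        System.out.println("client:          " + writeTimeout(3) + " ms"); // 60000
        System.out.println("first datanode:  " + writeTimeout(2) + " ms"); // 40000
        System.out.println("second datanode: " + writeTimeout(1) + " ms"); // 20000
    }
}
```

With this scheme an upstream writer always waits longer than anything below it, so a downstream timeout fires (and is reported) before the upstream one, instead of every stage waiting the same hard-coded 10 minutes.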