From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3124) DFS data node should not use hard coded 10 minutes as write timeout.
Date Sat, 29 Mar 2008 14:24:24 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Runping Qi updated HADOOP-3124:
-------------------------------

    Component/s: dfs
    Description: 
This problem happens on the 0.17 trunk.

I saw reducers wait 10 minutes while writing data to DFS and then time out.
The client retried and timed out again after another 19 minutes.

After looking into the code, it seems that the DFS data node uses 10 minutes
as the timeout for writing data into the data node pipeline.
I think we have three issues:

1. The 10 minute timeout value is too big for writing a chunk of data (64K)
through the data node pipeline.
2. The timeout value should not be hard coded.
3. Different datanodes in a pipeline should use different timeout values for
writing downstream.
A reasonable value might be (20 secs * numOfDataNodesInTheDownStreamPipe);
see the sketch below.
For example, if the replication factor is 3, the client uses 60 secs, the
first datanode uses 40 secs, and the second datanode uses 20 secs.
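
To make point 3 concrete, here is a minimal, illustrative Java sketch of how
such staggered timeouts could be computed. The class and constant names
(WriteTimeoutSketch, WRITE_TIMEOUT_PER_NODE_MS) are hypothetical, not taken
from the Hadoop code base; this is a sketch of the proposal, not the actual
implementation.

    // Illustrative sketch of the staggered write timeouts proposed above.
    // All names here are hypothetical, not from the Hadoop source tree.
    public final class WriteTimeoutSketch {

        // Per-node increment: 20 secs per downstream datanode, as proposed.
        static final long WRITE_TIMEOUT_PER_NODE_MS = 20 * 1000L;

        // Timeout for a writer with the given number of datanodes still
        // downstream of it in the pipeline.
        static long writeTimeout(int numDownstreamNodes) {
            return WRITE_TIMEOUT_PER_NODE_MS * numDownstreamNodes;
        }

        public static void main(String[] args) {
            // Replication factor 3: the client has 3 downstream nodes,
            // the first datanode 2, and the second datanode 1.
            System.out.println("client:          " + writeTimeout(3) + " ms"); // 60 secs
            System.out.println("first datanode:  " + writeTimeout(2) + " ms"); // 40 secs
            System.out.println("second datanode: " + writeTimeout(1) + " ms"); // 20 secs
        }
    }

With this scheme an upstream writer always waits longer than every node below
it, so a timeout fires first at the node closest to the failure rather than
expiring at the client.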



> DFS data node should not use hard coded 10 minutes as write timeout.
> --------------------------------------------------------------------
>
>                 Key: HADOOP-3124
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3124
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Runping Qi
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

