hadoop-hdfs-issues mailing list archives

From "Soundararajan Velu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-183) MapReduce Streaming job hang when all replications of the input file has corrupted!
Date Tue, 22 Jun 2010 06:24:54 GMT

    [ https://issues.apache.org/jira/browse/HDFS-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881090#action_12881090 ]

Soundararajan Velu commented on HDFS-183:
-----------------------------------------

Zhu, I tried reproducing this issue in our cluster with no luck... The DFS client retries
5 times, then throws an IOException and terminates the operation. Please let me know if
you are still facing this issue.
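
For reference, here is a minimal sketch (not the actual DFSClient implementation) of the
bounded-retry behaviour described above: the client tries a fixed number of times to obtain
the block from a datanode and then gives up with an IOException instead of looping forever.
The helpers chooseDataNode/readFromNode and the retry limit are hypothetical stand-ins.

    // Not the actual DFSClient code; a sketch of bounded retries on block reads.
    // chooseDataNode/readFromNode and the retry limit are hypothetical stand-ins.
    import java.io.IOException;

    public class BoundedBlockRetry {
        private static final int MAX_BLOCK_ACQUIRE_FAILURES = 5;

        public byte[] readBlockWithRetries(long blockId) throws IOException {
            int failures = 0;
            while (true) {
                try {
                    String datanode = chooseDataNode(blockId);
                    return readFromNode(datanode, blockId);
                } catch (IOException e) {
                    failures++;
                    if (failures >= MAX_BLOCK_ACQUIRE_FAILURES) {
                        // Give up instead of retrying forever.
                        throw new IOException("Could not obtain block " + blockId
                                + " after " + failures + " attempts", e);
                    }
                    // Otherwise fall through and try another datanode.
                }
            }
        }

        // Hypothetical stand-ins for the real datanode selection and read logic.
        private String chooseDataNode(long blockId) throws IOException {
            throw new IOException("no live datanode holds block " + blockId);
        }

        private byte[] readFromNode(String datanode, long blockId) throws IOException {
            return new byte[0];
        }
    }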

> MapReduce Streaming job hang when all replications of the input file has corrupted!
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-183
>                 URL: https://issues.apache.org/jira/browse/HDFS-183
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: ZhuGuanyin
>            Priority: Critical
>
> In some special cases, all replicas of a given file have been truncated to zero, but the namenode still holds the original size (we don't know why). The MapReduce streaming job will hang if mapred.task.timeout is not specified and the input files contain such a corrupted file; even the dfs shell "cat" will hang when fetching data from the corrupted file.
> We found that the job hangs at DFSInputStream.blockSeekTo() while choosing a datanode. The following test demonstrates this:
> 1) Copy a small file to HDFS.
> 2) Get the file's block locations, log in to those datanodes, and truncate the block files to zero length.
> 3) Cat the file through the dfs shell "cat" command.
> 4) The cat command enters a dead loop (a sketch of these steps follows below).
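
A minimal sketch of the quoted reproduction steps using the Hadoop FileSystem API follows.
The paths and cluster configuration are illustrative, and step 2 (truncating the block files)
has to be done by hand on each datanode; it cannot be driven from the client API.

    // Sketch of steps 1 and 3 via the Hadoop FileSystem API; paths are hypothetical.
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class ReproduceTruncatedBlockRead {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Step 1: copy a small local file into HDFS.
            Path local = new Path("/tmp/small.txt");          // hypothetical local file
            Path remote = new Path("/user/test/small.txt");   // hypothetical HDFS path
            fs.copyFromLocalFile(local, remote);

            // Step 2 (manual): locate the block files, e.g. with
            //   hadoop fsck /user/test/small.txt -files -blocks -locations
            // then truncate them to zero length on each datanode.

            // Step 3: read the file back, as the dfs shell "cat" would.
            // With the bug present this read never returns; with the retry limit
            // described in the comment above it should fail with an IOException.
            FSDataInputStream in = fs.open(remote);
            try {
                IOUtils.copyBytes(in, System.out, conf, false);
            } finally {
                IOUtils.closeStream(in);
            }
        }
    }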

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

