hadoop-mapreduce-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2980) Fetch failures and other related issues in Jetty 6.1.26
Date Mon, 12 Sep 2011 18:47:09 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13102910#comment-13102910 ]

Todd Lipcon commented on MAPREDUCE-2980:
----------------------------------------

FWIW, I ran 150 sleep jobs over the weekend, each with 10,000 maps by 10,000 reduces. I didn't
see any fetch failures, and none of the TaskTrackers (TTs) got stuck in the "spinning" state.

The branch I tested is here: https://github.com/toddlipcon/jetty-hadoop-fix

> Fetch failures and other related issues in Jetty 6.1.26
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-2980
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2980
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.205.0, 0.23.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>
> Since upgrading Jetty from 6.1.14 to 6.1.26 we've had a ton of HTTP-related issues, including:
> - Much higher incidence of fetch failures
> - A few strange file-descriptor related bugs (eg MAPREDUCE-2389)
> - A few unexplained issues where long "fsck"s on the NameNode drop out halfway through with a ClosedChannelException
>
> Stress tests with 10000Map x 10000Reduce sleep jobs reliably reproduce fetch failures at a rate of about 1 per million on a 25 node test cluster. These problems are all new since the upgrade from 6.1.14.
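
As a rough sanity check on what these numbers imply, the sketch below estimates how many fetch failures the 150-job run would have been expected to hit at the rate reported in the issue description. It rests on two assumptions that are not stated in the issue: that each reduce fetches one output from each map, so a job performs maps x reduces fetches, and that "1 per million" means one failure per million fetches. The function name and the Poisson model are illustrative only.

```python
import math


def expected_fetch_failures(maps, reduces, jobs, failures_per_million):
    """Return (total_fetches, expected_failures).

    Assumes every reduce fetches from every map (maps * reduces fetches
    per job) and that the failure rate is expressed per million fetches.
    Both are assumptions, not statements from the issue.
    """
    total_fetches = maps * reduces * jobs
    expected = total_fetches * failures_per_million / 1_000_000
    return total_fetches, expected


total_fetches, expected = expected_fetch_failures(
    maps=10_000, reduces=10_000, jobs=150, failures_per_million=1
)

# Under a Poisson model with mean `expected`, P(zero failures) = exp(-expected).
# That value underflows a float, so report its base-10 logarithm instead.
log10_p_zero = -expected / math.log(10)

print(f"total fetches:         {total_fetches:,}")
print(f"expected failures:     {expected:,.0f}")
print(f"log10 P(no failures):  {log10_p_zero:.0f}")
```

If those assumptions are roughly right, a run of this size with zero observed fetch failures would be very unlikely at the old failure rate, which is consistent with the tested branch having addressed the problem.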


        
