hadoop-common-dev mailing list archives

From Allen Wittenauer ...@apache.org>
Subject Re: [VOTE] Should we release
Date Thu, 18 Aug 2011 17:37:35 GMT

On Aug 18, 2011, at 12:28 AM, Owen O'Malley wrote:
> This vote is still running with no votes other than mine. 
> I've tested with and without security on a 60 node cluster and I'm seeing some failures,
> but not that many. On a terasort with 15,000 maps and 200 reduces, I ran the following cases:
> security + linux task controller : 2 failures (both mr-2651)
> no security + default task controller : 6-7 failures (seems to be a race condition in
> clean up)
> Even in the no security case, it is only losing 0.05% of the time.

	We're seeing much, much higher failure rates, in the 5-10% range.  That may well be because
we have more cores and faster boxes.

> It isn't perfect, but this is the code that Yahoo is currently running. I think we should
> release it.

	Y! can afford the task failures.  The rest of us can't.  So -1.