hadoop-common-user mailing list archives

From "Bobby Dennett" <softw...@bobby.fastmail.us>
Subject Re: Preventing/Limiting NotReplicatedYetException exceptions
Date Fri, 30 Jul 2010 07:26:35 GMT
Thanks for the information, Alex.

I have mostly seen the NotReplicatedYetException issue with reduce
tasks. We disabled speculative execution for reduce tasks earlier this
evening so we'll see if there is an impact within the next day or so.


On Mon, 26 Jul 2010 11:37 -0700, "Alex Kozlov"
<alexvk@cloudera.com> wrote:

  Hi Bobby,

On Mon, Jul 26, 2010 at 10:32 AM, Bobby Dennett
<[1]software@bobby.fastmail.us> wrote:

  Just following up again as this issue is becoming a high priority for us
  since it is affecting a critical process...
  Can anyone provide some insight as to what you would look for in the
  logs to troubleshoot this issue?

  Up until the occurrences of the NotReplicatedYetException exception, we
  only have "java.io.IOException: Could not complete write to file" errors
  referring to files that correspond to killed tasks (e.g. tasks/attempts
  launched due to speculative execution).


Is it map or reduce speculative execution?  Which file was it
writing to?  I have noticed a few problems with reduce speculative
execution in the past (specifically, with cleanup).  Can you try
switching off reduce speculative execution and see if the problem
goes away (set mapred.reduce.tasks.speculative.execution to
false)?
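
For reference, a minimal sketch of that setting as a configuration
property. The property name is the one given above. Placing it in
mapred-site.xml (or in the job's own configuration) is an assumption
about how the cluster is managed, not something stated in this thread.

```xml
<!-- Assumed location: mapred-site.xml or the job's configuration file -->
<property>
  <!-- Disables speculative (duplicate) attempts for reduce tasks only -->
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```

If the job driver goes through ToolRunner/GenericOptionsParser, the same
property can usually be passed per job on the command line as
-D mapred.reduce.tasks.speculative.execution=false, which avoids changing
the cluster-wide default while testing.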



  Lastly, would it make sense to upgrade to the latest version of
  v0.20.1 Cloudera Hadoop?

I would say yes: the latest release, hadoop-0.20.2+320, is mostly
bug fixes, although I do not know of anything specific to the
problem you are having.


  Thanks in advance,

-Bobby


Thanks,
Alex K

References

1. mailto:software@bobby.fastmail.us
