hadoop-common-user mailing list archives

From Jason Venner <jason.had...@gmail.com>
Subject Re: Streaming ignoring stderr output
Date Tue, 27 Oct 2009 14:48:05 GMT
Most likely one stream gets block-buffered when its file descriptor is a
pipe, while the other is at most line buffered, which is the situation
when the code is run by the streaming mapper task.
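[As an illustration of the fix discussed in this thread, not code posted by
anyone here: a mapper can sidestep pipe buffering by flushing stderr
explicitly after each status line. The helper name and the reporting
interval below are made up.]

```python
import sys

def emit_status(message):
    # Hypothetical helper (not from the thread): write a Hadoop Streaming
    # status line to stderr and flush immediately, so it reaches the task
    # logs even when stderr is a block-buffered pipe.
    sys.stderr.write("reporter:status:%s\n" % message)
    sys.stderr.flush()

def mapper(lines, report_every=100000):
    # Emit each input line as a key with a count of 1, reporting progress
    # periodically so the task is not killed by mapred.task.timeout.
    for n, line in enumerate(lines, 1):
        if n % report_every == 0:
            emit_status("processed %d lines" % n)
        sys.stdout.write("%s\t1\n" % line.rstrip("\n"))
    sys.stdout.flush()

if __name__ == "__main__":
    mapper(sys.stdin)
```

(The thread's `print >> sys.stderr, ...` is Python 2 syntax; the sketch
above uses `write()` plus an explicit `flush()`, which behaves the same in
Python 2 and 3 regardless of whether stderr is a terminal or a pipe.)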

On Mon, Oct 26, 2009 at 11:06 AM, Ryan Rosario <uclamathguy@gmail.com> wrote:

> Thanks. I think that I may have tripped on some sort of bug.
> Unfortunately, I do not know how to reproduce it and am a bit scared
> to try to reproduce it.
>
> I got this to work. I changed the following things, and now my job
> completes successfully with stderr written to the logs as output
> occurs. What was happening before was that no output was being written
> to the stderr logs until a map task completely finished.
>
> -- used print >> sys.stderr, "blah blah" instead of
> sys.stderr.write("blah blah")
> -- used the reporter:  print >> sys.stderr, "reporter:status:My status
> message"
> -- used only one large input file, rather than splitting the file into
> n files. I thought that was required so I could force n mappers, but
> apparently not.
>
> I am not sure which one of the above solved the problem. Using
> sys.stderr.write() without the reporting format worked for some time.
> I don't know why.
>
> - Ryan
>
> On Mon, Oct 26, 2009 at 8:03 AM, Koji Noguchi <knoguchi@yahoo-inc.com>
> wrote:
> > This doesn't solve your stderr/stdout problem, but you can always set the
> > timeout to be a bigger value if necessary.
> >
> > -Dmapred.task.timeout=______ (in milliseconds)
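[For reference, that property is passed as a generic option (before the
streaming options) on the command line. The jar path, input/output paths,
and the 30-minute value below are illustrative, not from this thread.]

```shell
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-0.20.0-streaming.jar \
    -Dmapred.task.timeout=1800000 \
    -input /user/ryan/input \
    -output /user/ryan/output \
    -mapper mapper.py \
    -reducer NONE \
    -file mapper.py
```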
> >
> > Koji
> >
> >
> > On 10/25/09 12:00 PM, "Ryan Rosario" <uclamathguy@gmail.com> wrote:
> >
> >> I am using a Python script as a mapper for a Hadoop Streaming (hadoop
> >> 0.20.0) job, with reducer NONE. My jobs keep getting killed with "task
> >> failed to respond after 600 seconds." I tried sending a heartbeat
> >> every minute to stderr using sys.stderr.write in my mapper, but
> >> nothing is being output to stderr either on disk (in
> >> logs/userlogs/...) or in the web UI. stdout is not even recorded.
> >>
> >> This also means I have no way of knowing what my tasks are doing at
> >> any given moment except to look at the counts produced in syslog.
> >>
> >> I got it to work once, but have not had any luck since. Any
> >> suggestions of things to look at as to why I am not able to get any
> >> output? Help is greatly appreciated.
> >>
> >> - Ryan
> >
> >
>
>
>
> --
> RRR
>



-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals
