hadoop-common-user mailing list archives

From Andrey Pankov <apan...@iponweb.net>
Subject Re: Streaming and subprocess error code
Date Wed, 14 May 2008 15:19:20 GMT

I've tested the new option "-jobconf 
stream.non.zero.exit.status.is.failure=true". It seems to work, but it is 
still not enough for me. When a mapper/reducer program reads all of its 
input successfully and fails after that, streaming still finishes 
successfully, so there is no way to learn about post-processing errors 
in the subprocesses :(
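
A minimal sketch of how the option above could be combined with the retry settings suggested in the quoted thread below, using a small Python launcher. The jar path and the helper names (STREAMING_JAR, build_streaming_args, run_streaming_job) are hypothetical, and the sketch assumes the hadoop client itself exits non-zero when the job fails. It does not work around the problem described above: a subprocess that fails only after consuming all of its input may still go unnoticed.

```python
import subprocess

# Hypothetical location; the real path of the streaming jar depends on the install.
STREAMING_JAR = "contrib/streaming/hadoop-streaming.jar"


def build_streaming_args(mapper, reducer, input_path, output_path, jar=STREAMING_JAR):
    """Build the argv list for a streaming job that treats non-zero exits as failures."""
    jobconf = [
        # Treat a non-zero subprocess exit as a task failure.
        "stream.non.zero.exit.status.is.failure=true",
        # One attempt per task, so a single task failure fails the whole job.
        "mapred.map.max.attempts=1",
        "mapred.reduce.max.attempts=1",
    ]
    args = [
        "hadoop", "jar", jar,
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
    ]
    for setting in jobconf:
        args += ["-jobconf", setting]
    return args


def run_streaming_job(args):
    """Run the job and return the exit code of the hadoop client process."""
    return subprocess.call(args)
```

A caller would pass the result of build_streaming_args(...) to run_streaming_job(...) and treat any non-zero return value as a failed job.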

Andrey Pankov wrote:
> Hi Rick,
> Thank you for the quick response! I see this feature is in trunk and not 
> available in the last stable release. Anyway, I will try it from trunk to 
> see if it works for me, and will check whether it catches segmentation 
> faults too.
> Rick Cox wrote:
>> Try "-jobconf stream.non.zero.exit.status.is.failure=true".
>> That will tell streaming that a non-zero exit is a task failure. To
>> turn that into an immediate whole job failure, I think configuring 0
>> task retries (mapred.map.max.attempts=1 and
>> mapred.reduce.max.attempts=1) will be sufficient.
>> rick
>> On Tue, May 13, 2008 at 8:15 PM, Andrey Pankov <apankov@iponweb.net> wrote:
>>> Hi all,
>>>  I'm looking for a way to force Streaming to shut down the whole job
>>> when one of its subprocesses exits with a non-zero exit code.
>>>  Our situation is this. Sometimes either the mapper or the reducer can
>>> crash; as a rule it returns some exit code. In this case the entire
>>> streaming job finishes successfully, but that's wrong. Almost the same
>>> happens when a subprocess finishes with a segmentation fault.
>>>  It's possible to check automatically whether a subprocess crashed only
>>> via the logs, but that means you need to parse tons of
>>> outputs/logs/dirs/etc.
>>>  In order to find the logs of your job you have to know its jobid
>>> (something like job_200805130853_0016). I don't know an easy way to
>>> determine it - just scan stdout for the pattern. Then find the logs of
>>> each mapper and each reducer, find a way to parse them, etc, etc...
>>>  So, is there an easier way to get the correct status of the whole
>>> streaming job, or do I still have to build rather fragile parsing
>>> systems for such purposes?
>>>  Thanks in advance.
>>>  --
>>>  Andrey Pankov
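
The "scan stdout for the pattern" step mentioned in the quoted message could look roughly like the sketch below. The regular expression is inferred only from the single example ID job_200805130853_0016, and the name find_job_id and the sample log line in the usage note are hypothetical, not actual Hadoop output.

```python
import re

# Pattern inferred from the one example ID in the thread: "job_", digits, "_", digits.
JOB_ID_RE = re.compile(r"job_\d+_\d+")


def find_job_id(client_output):
    """Return the first job ID found in the captured client output, or None."""
    match = JOB_ID_RE.search(client_output)
    return match.group(0) if match else None
```

For example, find_job_id("Running job: job_200805130853_0016") picks out the ID from a made-up log line, and the function returns None when no ID is present. This remains exactly the kind of fragile parsing the original question hoped to avoid.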

Andrey Pankov
