pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-14) large key cause pig reduce jobs to die
Date Fri, 02 Nov 2007 03:09:50 GMT
large key cause pig reduce jobs to die

                 Key: PIG-14
                 URL: https://issues.apache.org/jira/browse/PIG-14
             Project: Pig
          Issue Type: Bug
          Components: impl
            Reporter: Olga Natkovich

The reducer sends a heartbeat to the task tracker every time it starts
processing a new key. The task tracker expects to get a message every 10
minutes. If processing an individual key takes longer than that, which could
be the case for your job, the task tracker does not get a heartbeat in time
and kills the task.

The current patch is to add <property>
	<description>timeout value</description>

to the cluster's hadoop-site.xml. This disables the heartbeat functionality,
which might not be what we want long term.

A more flexible approach is to periodically report from the map and reduce jobs via

As a workaround for a UDF, call: PigMapReduce.reporter.progress() every 1000th time
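
A minimal sketch of that workaround, for illustration only. It assumes the
"every 1000th time" above means every 1000th unit of work done by the UDF.
The Progressable interface and the ProgressEveryN class are hypothetical
stand-ins so the example compiles without Pig on the classpath; inside a real
UDF the callback would be the PigMapReduce.reporter.progress() call mentioned
above.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ProgressEveryN {
    /** Hypothetical stand-in for the reporter; not part of Pig or Hadoop. */
    public interface Progressable {
        void progress();
    }

    private final Progressable reporter;
    private final int interval;
    private long calls = 0;

    public ProgressEveryN(Progressable reporter, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.reporter = reporter;
        this.interval = interval;
    }

    /** Call once per unit of work; reports progress on every interval-th call. */
    public void tick() {
        calls++;
        if (calls % interval == 0) {
            reporter.progress();
        }
    }

    public static void main(String[] args) {
        // Count heartbeats instead of talking to a real task tracker.
        AtomicInteger heartbeats = new AtomicInteger();
        ProgressEveryN progress = new ProgressEveryN(heartbeats::incrementAndGet, 1000);

        // Simulate a UDF that handles 2500 items while processing one large key.
        for (int i = 0; i < 2500; i++) {
            progress.tick();
        }
        System.out.println("heartbeats sent: " + heartbeats.get()); // prints 2
    }
}
```

Counting calls keeps the overhead low: the reporter is only touched once per
1000 items, while a long-running key still produces regular heartbeats as
long as the UDF keeps being invoked.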

