hive-dev mailing list archives

From "S. Alex Smith (JIRA)" <>
Subject [jira] Commented: (HIVE-797) mappers should report life in ways other than emitting data
Date Wed, 26 Aug 2009 08:43:59 GMT


S. Alex Smith commented on HIVE-797:



seems to have no effect.  The job still does not succeed; beyond that, what effect should I
be able to measure to see whether this is doing anything?

> mappers should report life in ways other than emitting data
> -----------------------------------------------------------
>                 Key: HIVE-797
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: S. Alex Smith
> Mappers which are performing a great deal of aggregation can be killed by a timeout even
> if they are running successfully.  For example, in the following query the group by operator
> stops the mapper from returning any rows of data until the map is entirely finished.  If the
> data processing takes longer than the timeout limit, the job will fail.  The mapper should
> instead offer the tracker some indication that it is busy working.  Alternatively, the tracker
> could ping the mapper with an appropriate question / warning before it sends a kill signal.
> FROM (
>   FROM my_table
>   USING 'my_boolean_function'
>   AS boolean_output) a
> SELECT boolean_output, COUNT(1)
> GROUP BY boolean_output
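
To illustrate the behaviour requested in the description, here is a minimal sketch
of an aggregation loop that emits no rows until it finishes but still signals that
it is alive at a fixed interval. It is not Hive or Hadoop code: the names
count_with_heartbeat, report_progress, interval_seconds and clock are hypothetical,
chosen only for this example. In a real Hadoop mapper the callback would presumably
map onto something like Reporter.progress(), but that wiring is an assumption and
is not shown here.

```python
import time
from collections import Counter
from typing import Callable, Iterable


def count_with_heartbeat(
    values: Iterable[str],
    report_progress: Callable[[], None],
    interval_seconds: float = 60.0,
    clock: Callable[[], float] = time.monotonic,
) -> Counter:
    """Count occurrences of each value, like COUNT(1) ... GROUP BY.

    No result is returned until every input value has been consumed, so
    report_progress (a hypothetical liveness callback) is invoked whenever
    at least interval_seconds have passed since the last report.
    """
    counts: Counter = Counter()
    last_report = clock()
    for value in values:
        counts[value] += 1
        now = clock()
        if now - last_report >= interval_seconds:
            report_progress()  # tell the tracker we are still working
            last_report = now
    return counts


if __name__ == "__main__":
    rows = ["true", "false", "true", "true"]
    totals = count_with_heartbeat(
        rows,
        report_progress=lambda: print("still alive"),
        interval_seconds=0.0,  # report on every row, for demonstration only
    )
    for key, total in sorted(totals.items()):
        print(key, total)
```

The interval would need to stay well below whatever task timeout is configured, so
that at least one report arrives before the tracker decides the task is dead.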

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
