hive-user mailing list archives

From Tim Havens <timhav...@gmail.com>
Subject Re: Long running Join Query - Reduce task fails due to failing to report status
Date Fri, 24 Aug 2012 17:20:16 GMT
Just curious whether you've tried Hive's EXPLAIN statement to see what Hive
itself thinks of your query.


On Fri, Aug 24, 2012 at 9:36 AM, Himanish Kushary <himanish@gmail.com> wrote:

> Hi,
>
> We have a complex query that involves several left outer joins, resulting
> in 8 M/R jobs in Hive. During execution of one of the stages (after three
> M/R jobs have run), the M/R job fails because a few reduce tasks fail due
> to inactivity.
>
> Most of the reduce tasks finish fine (within 3 minutes), but the last one
> gets stuck for a long time (> 1 hour) and, after several attempts, is
> finally killed with "failed to report status for 600 seconds. Killing!"
>
> What may be causing this issue? Would hive.script.auto.progress help in
> this case? As we are not able to get much information from the log files,
> how might we approach resolving this? Will tweaking any specific M/R
> parameters help?
>
> The task attempt log shows several lines like this before exiting:
>
> 2012-08-23 19:17:23,848 INFO ExecReducer: ExecReducer: processing 219000000 rows: used memory = 408582240
> 2012-08-23 19:17:30,189 INFO ExecReducer: ExecReducer: processing 220000000 rows: used memory = 346110400
> 2012-08-23 19:17:37,510 INFO ExecReducer: ExecReducer: processing 221000000 rows: used memory = 583913576
> 2012-08-23 19:17:44,829 INFO ExecReducer: ExecReducer: processing 222000000 rows: used memory = 513071504
> 2012-08-23 19:17:47,923 INFO org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1
>
> Here are the reduce task counters:
>
> *Map-Reduce Framework*
>   Combine input records                             0
>   Combine output records                            0
>   Reduce input groups                     222,480,335
>   Reduce shuffle bytes                  7,726,141,897
>   Reduce input records                    222,480,335
>   Reduce output records                             0
>   Spilled Records                         355,827,191
>   CPU time spent (ms)                       2,152,160
>   Physical memory (bytes) snapshot      1,182,490,624
>   Virtual memory (bytes) snapshot       1,694,531,584
>   Total committed heap usage (bytes)      990,052,352
>
> The tasktracker log gives a thread dump at that time but no exception.
>
> *2012-08-23 20:05:49,319 INFO org.apache.hadoop.mapred.TaskTracker: Process Thread Dump: lost task*
> *69 active threads*
>
> ---------------------------
> Thanks & Regards
> Himanish
>
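
As a side note on the quoted ExecReducer lines: a rough way to see whether the
reducer was still making steady progress before it went quiet is to compute
rows per second from the progress messages. The following is only a minimal
sketch in Python, not part of Hive or Hadoop. The regular expression is
written against the four log lines quoted above, and the names PROGRESS,
parse_progress, rows_per_second and SAMPLE are made up for illustration.

```python
import re
from datetime import datetime

# Pattern for the ExecReducer progress lines quoted in this thread.
# \s+ between "used" and "memory" tolerates lines wrapped by a mail client.
PROGRESS = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) INFO ExecReducer: "
    r"ExecReducer: processing (\d+) rows: used\s+memory = (\d+)"
)


def parse_progress(log_text):
    """Return a list of (timestamp, rows, used_memory_bytes) tuples."""
    entries = []
    for m in PROGRESS.finditer(log_text):
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        ts = ts.replace(microsecond=int(m.group(2)) * 1000)
        entries.append((ts, int(m.group(3)), int(m.group(4))))
    return entries


def rows_per_second(entries):
    """Average rows/second between the first and last progress line."""
    if len(entries) < 2:
        raise ValueError("need at least two progress lines")
    (t0, r0, _), (t1, r1, _) = entries[0], entries[-1]
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        raise ValueError("timestamps are not increasing")
    return (r1 - r0) / elapsed


# The lines from the quoted task attempt log, unwrapped.
SAMPLE = """\
2012-08-23 19:17:23,848 INFO ExecReducer: ExecReducer: processing 219000000 rows: used memory = 408582240
2012-08-23 19:17:30,189 INFO ExecReducer: ExecReducer: processing 220000000 rows: used memory = 346110400
2012-08-23 19:17:37,510 INFO ExecReducer: ExecReducer: processing 221000000 rows: used memory = 583913576
2012-08-23 19:17:44,829 INFO ExecReducer: ExecReducer: processing 222000000 rows: used memory = 513071504
2012-08-23 19:17:47,923 INFO org.apache.hadoop.mapred.FileInputFormat: Total input paths to process : 1
"""

if __name__ == "__main__":
    entries = parse_progress(SAMPLE)
    print(f"{len(entries)} progress lines, "
          f"{rows_per_second(entries):,.0f} rows/s on average")
```

On the four quoted lines this works out to roughly 143,000 rows per second
(3,000,000 rows in about 21 seconds), so the reducer was still reading input
at a steady pace at that point. These few lines alone cannot show where the
stall itself happens.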



-- 
"The whole world is you. Yet you keep thinking there is something else." -
Xuefeng Yicun 822-902 A.D.

Tim R. Havens
Google Phone: 573.454.1232
ICQ: 495992798
ICBM:  37°51'34.79"N   90°35'24.35"W
ham radio callsign: NW0W
