hadoop-mapreduce-issues mailing list archives

From "Peter Bacsko (JIRA)" <j...@apache.org>
Subject [jira] [Created] (MAPREDUCE-7046) Enhance logging related to retrieving Job
Date Fri, 02 Feb 2018 11:10:02 GMT
Peter Bacsko created MAPREDUCE-7046:
---------------------------------------

             Summary: Enhance logging related to retrieving Job
                 Key: MAPREDUCE-7046
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7046
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: client
            Reporter: Peter Bacsko
            Assignee: Peter Bacsko


We recently encountered an interesting problem. In one case, Hive Driver was unable to retrieve
the status of a MapReduce job. The following stack trace was printed:

{noformat}
[main] INFO  org.apache.hadoop.hive.ql.exec.Task  - 2018-01-15 00:18:09,324 Stage-2 map = 0%,  reduce = 0%, Cumulative CPU 1679.31 sec
[main] ERROR org.apache.hadoop.hive.ql.exec.Task  - Ended Job = job_1511036412170_1322169 with exception 'java.io.IOException(Could not find status of job:job_1511036412170_1322169)'
java.io.IOException: Could not find status of job:job_1511036412170_1322169
	at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:295)
	at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:549)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:435)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1115)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:318)
	at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:416)
	at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:432)
	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:726)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628)
	at org.apache.oozie.action.hadoop.HiveMain.runHive(HiveMain.java:325)
	at org.apache.oozie.action.hadoop.HiveMain.run(HiveMain.java:302)
	at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:49)
{noformat}

We examined the JHS and AM logs but did not see anything suspicious. For some reason
a {{null}} was returned, but it is not obvious why. The MR job was still running at this point.

Some ideas:
1. We already have logging for the JobClient->AM and JobClient->JHS communication,
but it is at TRACE level, which could be too low. It might make more sense to raise the
level to DEBUG.

2. We need new {{LOG.debug()}} calls at some crucial points.
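
To illustrate the second idea, here is a minimal sketch of what debug logging around a job status lookup could look like. It is not Hadoop code: the class {{JobLookupLoggingSketch}}, the {{Source}} type and {{findStatus()}} are hypothetical names invented for this example, and {{java.util.logging}} at level FINE stands in for {{LOG.debug()}} so that the snippet runs without extra dependencies. The point is only that every step that can yield {{null}} leaves a trace.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class JobLookupLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(JobLookupLoggingSketch.class.getName());

    /** Hypothetical stand-in for one place a job status can come from (e.g. AM or JHS). */
    public static final class Source {
        final String name;
        final Function<String, String> lookup;

        public Source(String name, Function<String, String> lookup) {
            this.name = name;
            this.lookup = lookup;
        }
    }

    /** Asks each source in order; logs every attempt and every null result. */
    public static String findStatus(String jobId, List<Source> sources) {
        for (Source source : sources) {
            LOG.log(Level.FINE, "Asking {0} for status of {1}",
                new Object[] {source.name, jobId});
            String status = source.lookup.apply(jobId);
            if (status != null) {
                LOG.log(Level.FINE, "{0} returned status {1} for {2}",
                    new Object[] {source.name, status, jobId});
                return status;
            }
            LOG.log(Level.FINE, "{0} returned null for {1}",
                new Object[] {source.name, jobId});
        }
        // Without this line the caller only sees a null and cannot tell why.
        LOG.log(Level.FINE, "No source returned a status for {0}", jobId);
        return null;
    }

    public static void main(String[] args) {
        // Make FINE (our stand-in for DEBUG) visible on the console.
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINE);
        LOG.addHandler(handler);
        LOG.setLevel(Level.FINE);
        LOG.setUseParentHandlers(false);

        // Both hypothetical sources return null, mimicking the reported case.
        List<Source> sources = Arrays.asList(
            new Source("AM", id -> null),
            new Source("JHS", id -> null));
        String status = findStatus("job_1511036412170_1322169", sources);
        System.out.println("status = " + status);
    }
}
```

With messages like these enabled, a {{null}} result could be traced back to the specific source that failed to provide a status, instead of surfacing only as the {{IOException}} seen on the Hive side.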




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-help@hadoop.apache.org

