hadoop-mapreduce-user mailing list archives

From Benjamin Ross <br...@Lattice-Engines.com>
Subject RE: Issue with Hadoop Job History Server
Date Fri, 19 Aug 2016 01:24:04 GMT
Turns out we made a stupid mistake: our system was mixing configuration between an old cluster and a new one. So, things are working now.

Thanks,
Ben
________________________________
From: Benjamin Ross
Sent: Thursday, August 18, 2016 10:05 AM
To: Rohith Sharma K S; Gao, Yunlong
Cc: user@hadoop.apache.org
Subject: RE: Issue with Hadoop Job History Server

Rohith,
Thanks - we're still having issues.  Can you help out with this?

How do you specify the done directory for an MR job? The job history server's done dir is mapreduce.jobhistory.done-dir. I set the job-side one via mapreduce.jobtracker.jobhistory.location, per the documentation here:
https://hadoop.apache.org/docs/r2.7.1/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

They're both set to the same thing. I did a recursive ls on HDFS, and there don't seem to be any directories called "done" with recent data in them. All of the data in /mr-history is old. Here's a summary of that ls:

drwx------   - yarn          hadoop          0 2016-07-14 16:39 /ats/done
drwxr-xr-x   - yarn          hadoop          0 2016-07-14 16:39 /ats/done/1468528507723
drwxr-xr-x   - yarn          hadoop          0 2016-07-14 16:39 /ats/done/1468528507723/0000
drwxr-xr-x   - yarn          hadoop          0 2016-07-25 20:10 /ats/done/1468528507723/0000/000
drwxrwxrwx   - mapred        hadoop          0 2016-07-19 14:47 /mr-history/done
drwxrwx---   - mapred        hadoop          0 2016-07-19 14:47 /mr-history/done/2016
drwxrwx---   - mapred        hadoop          0 2016-07-19 14:47 /mr-history/done/2016/07
drwxrwx---   - mapred        hadoop          0 2016-07-27 13:49 /mr-history/done/2016/07/19
drwxrwxrwt   - bross         hdfs            0 2016-08-15 22:39 /tmp/hadoop-yarn/staging/history/done_intermediate
       =========> lots of recent data in /tmp/hadoop-yarn/staging/history/done_intermediate
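[Editor's note: the recent data under /tmp/hadoop-yarn/staging/history/done_intermediate matches the stock defaults documented in mapred-default.xml for r2.7.1. A sketch of the relevant default entries (these are the documented defaults, not this cluster's configured values), which would explain history files landing there if jobs ran without the custom config below:]

```xml
<!-- Documented defaults from mapred-default.xml (r2.7.1); shown for
     illustration only. These are NOT this cluster's configured values. -->
<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>/tmp/hadoop-yarn/staging</value>
</property>
<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
</property>
```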

Here's our mapred-site.xml:

  <configuration>

    <property>
      <name>mapreduce.admin.map.child.java.opts</name>
      <value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value>
    </property>

    <property>
      <name>mapreduce.admin.reduce.child.java.opts</name>
      <value>-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.6.0-3796</value>
    </property>

    <property>
      <name>mapreduce.admin.user.env</name>
      <value>LD_LIBRARY_PATH=/usr/hdp/2.3.6.0-3796/hadoop/lib/native:/usr/hdp/2.3.6.0-3796/hadoop/lib/native/Linux-amd64-64</value>
    </property>

    <property>
      <name>mapreduce.am.max-attempts</name>
      <value>2</value>
    </property>

    <property>
      <name>mapreduce.application.classpath</name>
      <value>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/2.3.6.0-3796/hadoop/lib/hadoop-lzo-0.6.0.2.3.6.0-3796.jar:/etc/hadoop/conf/secure</value>
    </property>

    <property>
      <name>mapreduce.application.framework.path</name>
      <value>/hdp/apps/2.3.6.0-3796/mapreduce/mapreduce.tar.gz#mr-framework</value>
    </property>

    <property>
      <name>mapreduce.cluster.administrators</name>
      <value> hadoop</value>
    </property>

    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>

    <property>
      <name>mapreduce.job.counters.max</name>
      <value>130</value>
    </property>

    <property>
      <name>mapreduce.job.emit-timeline-data</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.job.reduce.slowstart.completedmaps</name>
      <value>0.05</value>
    </property>

    <property>
      <name>mapreduce.job.user.classpath.first</name>
      <value>true</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>bodcdevhdp6.dev.lattice.local:10020</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.bind-host</name>
      <value>0.0.0.0</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.done-dir</name>
      <value>/mr-history/done</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.intermediate-done-dir</name>
      <value>/mr-history/tmp</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.recovery.enable</name>
      <value>true</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.recovery.store.class</name>
      <value>org.apache.hadoop.mapreduce.v2.hs.HistoryServerLeveldbStateStoreService</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.recovery.store.leveldb.path</name>
      <value>/hadoop/mapreduce/jhs</value>
    </property>

    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>bodcdevhdp6.dev.lattice.local:19888</value>
    </property>

    <property>
      <name>mapreduce.jobtracker.jobhistory.completed.location</name>
      <value>/mr-history/done</value>
    </property>

    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx4915m</value>
    </property>

    <property>
      <name>mapreduce.map.log.level</name>
      <value>INFO</value>
    </property>

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>6144</value>
    </property>

    <property>
      <name>mapreduce.map.output.compress</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.map.sort.spill.percent</name>
      <value>0.7</value>
    </property>

    <property>
      <name>mapreduce.map.speculative</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.output.fileoutputformat.compress</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.output.fileoutputformat.compress.type</name>
      <value>BLOCK</value>
    </property>

    <property>
      <name>mapreduce.reduce.input.buffer.percent</name>
      <value>0.0</value>
    </property>

    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx9830m</value>
    </property>

    <property>
      <name>mapreduce.reduce.log.level</name>
      <value>INFO</value>
    </property>

    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>12288</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.fetch.retry.enabled</name>
      <value>1</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.fetch.retry.interval-ms</name>
      <value>1000</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.fetch.retry.timeout-ms</name>
      <value>30000</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.input.buffer.percent</name>
      <value>0.7</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.merge.percent</name>
      <value>0.66</value>
    </property>

    <property>
      <name>mapreduce.reduce.shuffle.parallelcopies</name>
      <value>30</value>
    </property>

    <property>
      <name>mapreduce.reduce.speculative</name>
      <value>false</value>
    </property>

    <property>
      <name>mapreduce.shuffle.port</name>
      <value>13562</value>
    </property>

    <property>
      <name>mapreduce.task.io.sort.factor</name>
      <value>100</value>
    </property>

    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>2047</value>
    </property>

    <property>
      <name>mapreduce.task.timeout</name>
      <value>300000</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.admin-command-opts</name>
      <value>-Dhdp.version=2.3.6.0-3796</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.command-opts</name>
      <value>-Xmx4915m -Dhdp.version=${hdp.version}</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.log.level</name>
      <value>INFO</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.resource.mb</name>
      <value>6144</value>
    </property>

    <property>
      <name>yarn.app.mapreduce.am.staging-dir</name>
      <value>/user</value>
    </property>

  </configuration>

Thanks,
Ben

________________________________
From: Rohith Sharma K S [ksrohithsharma@gmail.com]
Sent: Thursday, August 18, 2016 3:17 AM
To: Gao, Yunlong
Cc: user@hadoop.apache.org; Benjamin Ross
Subject: Re: Issue with Hadoop Job History Server

MR jobs and the JHS must have the same done-dir configuration if it is configured; otherwise, the staging dir must be the same for both. Make sure the job and the JHS have the same configuration values.

What usually happens is that the MRApp writes the job history file to one location while the History Server tries to read from a different location. This causes the JHS to display empty jobs.
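[Editor's note: in config terms, the point above is that these two properties must be identical in the mapred-site.xml seen by job submitters and in the one used by the history server. A minimal sketch; the paths shown are illustrative, not prescribed values:]

```xml
<!-- Must match in the config used by MR jobs AND by the JHS.
     Paths are illustrative. -->
<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>/mr-history/tmp</value>
</property>
<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>/mr-history/done</value>
</property>
```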

Thanks & Regards
Rohith Sharma K S

On Aug 18, 2016, at 12:35 PM, Gao, Yunlong <dg.gaoyunlong@gmail.com> wrote:

To whom it may concern,

I am using Hadoop 2.7.1.2.3.6.0-3796, with the Hortonworks distribution HDP-2.3.6.0-3796. I have a question about the Hadoop Job History Server.

After I set everything up, the resource manager, name nodes, and data nodes all seem to be running fine, but the job history server is not working correctly. Its UI does not show any jobs, and REST calls to the job history server do not work either. I also notice that there are no logs in HDFS under the "mapreduce.jobhistory.done-dir" directory.

I have tried different things, including restarting the job history server and monitoring its log -- no errors or exceptions were observed. I also renamed /hadoop/mapreduce/jhs/mr-jhs-state (used for the job history server's state recovery) and restarted it again, but no particular error occurred. I tried some other things borrowed from online blogs and documents, but had no luck.


Any help would be very much appreciated.

Thanks,
Yunlong





