accumulo-user mailing list archives

From Marc Reichman <mreich...@pixelforensics.com>
Subject submission w/classpath without tool.sh?
Date Fri, 23 Jan 2015 21:56:29 GMT
My apologies if this is covered somewhere; I've done a lot of searching and
come up dry.

I am migrating a set of applications from Hadoop 1.0.3/Accumulo 1.4.1 to
Hadoop 2.6.0/Accumulo 1.6.1. The applications are launched by my own Java
programs, which use the usual Hadoop Tool/Configured setup, nothing unusual.

To run MR jobs with AccumuloInputFormat/OutputFormat on Hadoop 1.0, I could
use tool.sh to launch the programs, which worked great for launching locally
on the cluster. However, I also needed to launch from remote hosts (maybe
even Windows ones), so I would bundle a large lib dir with everything I
needed on the client side and fill out HADOOP_CLASSPATH in hadoop-env.sh
with the same jars (basically a copy of the output of accumulo classpath).
This worked for remote submissions, and for local ones too, using only my
Java mains to launch them, without any Accumulo or Hadoop wrapper scripts.
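
For illustration only, the lib-dir part of that setup amounts to joining
every jar in one directory into a single classpath string. The sketch below
uses plain Java; the class name LibDirClasspath, the method buildClasspath,
and the default directory "lib" are hypothetical names chosen for this
example, not anything from Accumulo or Hadoop.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: not part of Accumulo or Hadoop.
class LibDirClasspath {

    // Joins every entry named *.jar directly under libDir into one string,
    // using absolute paths and sorting them so the result is stable.
    static String buildClasspath(Path libDir, String separator) throws IOException {
        List<String> jars = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : stream) {
                jars.add(jar.toAbsolutePath().toString());
            }
        }
        Collections.sort(jars);
        return String.join(separator, jars);
    }

    public static void main(String[] args) throws IOException {
        // "lib" is an assumed default; pass the real lib dir as the first argument.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        System.out.println(buildClasspath(libDir, File.pathSeparator));
    }
}
```

The printed string is the kind of value that would go into HADOOP_CLASSPATH;
File.pathSeparator keeps it correct on both Linux and Windows clients.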

With MR on YARN in 2.6, this no longer seems to work. No matter what I do, I
can't get a job submitted from a plain Java app to have the 2.x MR
Application Master pick up the Accumulo jars on its classpath, and my jobs
fail with ClassNotFound exceptions. tool.sh works just fine, but again, I
need to be able to submit without that environment.

I have tried (on the cluster):
HADOOP_CLASSPATH in hadoop-env.sh
HADOOP_CLASSPATH from .bashrc
yarn.application.classpath in yarn-site.xml

I don't mind using tool.sh locally (it's quite nice), but I need a strategy
for setting up the cluster so that I can just launch Java, set the
appropriate Hadoop configs for the remote filesystem and YARN hosts, set up
my Accumulo connections and input/output formats for MapReduce, and launch
jobs that are Accumulo-aware.
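
As a rough sketch of the client-side settings meant here: the three keys
below are standard Hadoop 2 property names, while the class name
RemoteClientSettings, the host names, and the ports (8020 for the NameNode,
8032 for the ResourceManager) are assumptions for illustration. In a real
client each pair would be passed to Hadoop's Configuration.set(key, value);
a plain map is used so the example runs without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the settings a remote client would need.
class RemoteClientSettings {

    // Returns the key/value pairs in insertion order.
    static Map<String, String> clientSettings(String nameNodeHost, String resourceManagerHost) {
        Map<String, String> settings = new LinkedHashMap<>();
        // Remote filesystem; port 8020 is an assumed NameNode RPC port.
        settings.put("fs.defaultFS", "hdfs://" + nameNodeHost + ":8020");
        // Submit to YARN rather than running with the local job runner.
        settings.put("mapreduce.framework.name", "yarn");
        // ResourceManager endpoint; port 8032 is an assumed value.
        settings.put("yarn.resourcemanager.address", resourceManagerHost + ":8032");
        return settings;
    }

    public static void main(String[] args) {
        // Host names are placeholders, not real machines.
        Map<String, String> settings = clientSettings("namenode.example.com", "rm.example.com");
        for (Map.Entry<String, String> e : settings.entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

These settings only tell the client where the cluster is; they do not by
themselves put the Accumulo jars on the classpath of the submitted job.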

Any ideas?

Thanks,
Marc
