accumulo-user mailing list archives

From Marc Reichman <>
Subject Re: submission w/classpath without
Date Mon, 26 Jan 2015 14:19:46 GMT
So, mapreduce.application.classpath was the winner. It's possible that
yarn.application.classpath would have worked as well. My main issue was
that I was neglecting to include a copy of the XML files in the classpath,
so my settings weren't being picked up (a late-night epiphany). Passing the
value as -Dmapreduce.application.classpath=... on the command line allowed
it to take effect, and I was fine.
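
For anyone assembling that value by hand, here is a minimal sketch of building the -D argument from a list of entries. The class and method names are hypothetical, and the entries are the placeholder paths from Billie's earlier message, not real locations. It assumes the property takes a comma-separated list, as in that example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: joins classpath entries into the comma-separated
// form shown earlier in the thread (/etc/hadoop/conf,/some/lib/dir/*,...).
public class ClasspathArg {
    static final String PROPERTY = "mapreduce.application.classpath";

    // Trim each entry, drop blank ones, and join the rest with commas.
    static String joinEntries(List<String> entries) {
        List<String> kept = new ArrayList<>();
        for (String entry : entries) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                kept.add(trimmed);
            }
        }
        return String.join(",", kept);
    }

    // Build the full command-line argument for the job launcher.
    static String toDashD(List<String> entries) {
        return "-D" + PROPERTY + "=" + joinEntries(entries);
    }

    public static void main(String[] args) {
        // Placeholder paths taken from the thread, not real directories.
        List<String> entries = Arrays.asList(
                "/etc/hadoop/conf", "/some/lib/dir/*", "/some/other/lib/dir/*");
        // Prints: -Dmapreduce.application.classpath=/etc/hadoop/conf,/some/lib/dir/*,/some/other/lib/dir/*
        System.out.println(toDashD(entries));
    }
}
```

If the job is launched through ToolRunner, Hadoop's GenericOptionsParser should pick up a -D argument like this one and place it in the job configuration. When typing it in a shell, quote the argument so the wildcards are not expanded locally.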

For remote clients, I have copied what I need to launch into a local
classpath lib directory: the jars listed in the output of accumulo
classpath, and a set of the XML files needed to set the appropriate
client-side mapreduce options to launch properly, including the classpath
mentioned above as well as the various memory-related settings in YARN/MR2.
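
Since the missing XML files were the real problem, a quick check that they are visible on the client classpath can save a late night. The sketch below uses only the JDK; the class name is hypothetical, and the four file names are an assumption (the usual Hadoop 2 client config files), so adjust the list to whatever your setup actually uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-flight check: reports which config files cannot be
// found as classpath resources by the given class loader.
public class ConfigOnClasspathCheck {

    // Return the subset of names that the loader cannot resolve.
    static List<String> missingResources(ClassLoader loader, List<String> names) {
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            if (loader.getResource(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed file names; the thread only says "the XML files".
        List<String> expected = Arrays.asList(
                "core-site.xml", "hdfs-site.xml", "mapred-site.xml", "yarn-site.xml");
        List<String> missing = missingResources(
                ConfigOnClasspathCheck.class.getClassLoader(), expected);
        if (missing.isEmpty()) {
            System.out.println("All expected config files are on the classpath.");
        } else {
            System.out.println("Not found on the classpath: " + missing);
        }
    }
}
```

Run it with the same classpath as the job launcher. Anything reported as missing is a file whose settings would not be picked up on the client side.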

Thanks for the help, Billie!

On Sat, Jan 24, 2015 at 7:51 AM, Billie Rinaldi <> wrote:

> You might have to set yarn.application.classpath in both the client and
> the server conf. At least that's what Slider does.
> On Jan 23, 2015 10:00 PM, "Marc Reichman" <>
> wrote:
>> That's correct, I don't really want to have the client have to package up
>> every accumulo and zookeeper jar I need in dcache or a fat jar or whatever
>> just to run stuff from a remote client when the jars are all there.
>> I did try yarn.application.classpath, but I didn't spell out the whole
>> thing. Next try I will take all those jars and put them in explicitly
>> instead of the dir wildcards. I will update how it goes.
>> On Fri, Jan 23, 2015 at 5:19 PM, Billie Rinaldi <>
>> wrote:
>>> You have all the jars your app needs on both the servers and the client
>>> (as opposed to wanting Yarn to distribute them)?  Then
>>> yarn.application.classpath should be what you need.  It looks like
>>> /etc/hadoop/conf,/some/lib/dir/*,/some/other/lib/dir/* etc.  Is that what
>>> you're trying?
>>> On Fri, Jan 23, 2015 at 1:56 PM, Marc Reichman <> wrote:
>>>> My apologies if this is covered somewhere, I've done a lot of searching
>>>> and come up dry.
>>>> I am migrating a set of applications from Hadoop 1.0.3/Accumulo 1.4.1
>>>> to Hadoop 2.6.0/Accumulo 1.6.1. The applications are launched by my custom
>>>> java apps, using the Hadoop Tool/Configured interface setup, not a big deal.
>>>> To run MR jobs with AccumuloInputFormat/OutputFormat, in 1.0 I could
>>>> use to launch the programs, which worked great for local on-cluster
>>>> launching. I however needed to launch from remote hosts (maybe even Windows
>>>> ones), and I would bundle a large lib dir with everything I needed on the
>>>> client-side, and fill out HADOOP_CLASSPATH in with everything
>>>> I needed (basically copied the output of accumulo classpath). This would
>>>> work for remote submissions, or even local ones, but specifically using my
>>>> java mains to launch them without any accumulo or hadoop wrapper scripts.
>>>> In YARN MR 2.6 this doesn't seem to work. No matter what I do, I can't
>>>> seem to get a normal java app to have the 2.x MR Application Master pick
>>>> up the accumulo items in the classpath, and my jobs fail with ClassNotFound
>>>> exceptions. works just fine, but again, I need to be able to submit
>>>> without that environment.
>>>> I have tried (on the cluster):
>>>> HADOOP_CLASSPATH from .bashrc
>>>> yarn.application.classpath in yarn-site.xml
>>>> I don't mind using locally, it's quite nice, but I need a
>>>> strategy to have the cluster "setup" so I can just launch java, set my
>>>> appropriate hadoop configs for remote fs and yarn hosts, get my accumulo
>>>> connections and in/out setup for mapreduce and launch jobs which have
>>>> accumulo awareness.
>>>> Any ideas?
>>>> Thanks,
>>>> Marc
