accumulo-user mailing list archives

From Billie Rinaldi <bil...@apache.org>
Subject Re: submission w/classpath without tool.sh?
Date Sat, 24 Jan 2015 13:51:03 GMT
You might have to set yarn.application.classpath in both the client and the
server conf. At least that's what Slider does.
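For illustration only, the same property can also be set programmatically on
the client side from a plain Java launcher. This is just a rough, untested
sketch; the lib paths below are placeholders, and the jars still have to
exist at those paths on the cluster nodes themselves:

import org.apache.hadoop.conf.Configuration;

public class ClasspathConf {
  public static Configuration clientConf() {
    Configuration conf = new Configuration();
    // Programmatic equivalent of putting yarn.application.classpath in
    // yarn-site.xml. The directories listed are placeholders; they must
    // point at jars that already exist on every node.
    conf.set("yarn.application.classpath",
        "/etc/hadoop/conf,"
        + "/usr/lib/hadoop/*,/usr/lib/hadoop/lib/*,"
        + "/usr/lib/hadoop-hdfs/*,/usr/lib/hadoop-yarn/*,"
        + "/usr/lib/hadoop-mapreduce/*,"
        + "/usr/lib/accumulo/lib/*,/usr/lib/zookeeper/*");
    return conf;
  }
}

Either way, the same list has to be valid on the nodes themselves, which is
why it likely needs to appear in the server-side conf as well.
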
On Jan 23, 2015 10:00 PM, "Marc Reichman" <mreichman@pixelforensics.com>
wrote:

> That's correct. I don't really want the client to have to package up every
> Accumulo and ZooKeeper jar I need into the distributed cache or a fat jar
> just to run jobs from a remote client, when the jars are already all there
> on the cluster.
>
> I did try yarn.application.classpath, but I didn't spell out the whole
> classpath. On the next try I will list all of those jars explicitly instead
> of using the directory wildcards, and I will report back on how it goes.
>
> On Fri, Jan 23, 2015 at 5:19 PM, Billie Rinaldi <billie@apache.org> wrote:
>
>> You have all the jars your app needs on both the servers and the client
>> (as opposed to wanting YARN to distribute them)?  Then
>> yarn.application.classpath should be what you need.  It looks like
>> /etc/hadoop/conf,/some/lib/dir/*,/some/other/lib/dir/* etc.  Is that what
>> you're trying?
>>
>> On Fri, Jan 23, 2015 at 1:56 PM, Marc Reichman <
>> mreichman@pixelforensics.com> wrote:
>>
>>> My apologies if this is covered somewhere; I've done a lot of searching
>>> and come up dry.
>>>
>>> I am migrating a set of applications from Hadoop 1.0.3/Accumulo 1.4.1 to
>>> Hadoop 2.6.0/Accumulo 1.6.1. The jobs are launched from my custom Java
>>> apps using the Hadoop Tool/Configured interface setup; nothing unusual
>>> there.
>>>
>>> To run MR jobs with AccumuloInputFormat/OutputFormat in 1.0, I could use
>>> tool.sh to launch the programs, which worked great for local on-cluster
>>> launching. However, I needed to launch from remote hosts (maybe even
>>> Windows ones), so I would bundle a large lib dir with everything I needed
>>> on the client side and fill out HADOOP_CLASSPATH in hadoop-env.sh with
>>> everything I needed (basically a copy of the output of accumulo
>>> classpath). This worked for remote submissions, and even local ones, but
>>> specifically using my Java mains to launch the jobs without any Accumulo
>>> or Hadoop wrapper scripts.
>>>
>>> In YARN MR 2.6 this doesn't seem to work. No matter what I do, I can't
>>> get a plain Java app to make the 2.x MR ApplicationMaster pick up the
>>> Accumulo entries on the classpath, and my jobs fail with
>>> ClassNotFoundException. tool.sh works just fine, but again, I need to be
>>> able to submit without that environment.
>>>
>>> I have tried (on the cluster):
>>> HADOOP_CLASSPATH in hadoop-env.sh
>>> HADOOP_CLASSPATH from .bashrc
>>> yarn.application.classpath in yarn-site.xml
>>>
>>> I don't mind using tool.sh locally (it's quite nice), but I need a
>>> strategy to have the cluster "set up" so that I can just launch Java, set
>>> the appropriate Hadoop configs for the remote FS and YARN hosts, get my
>>> Accumulo connections and MapReduce input/output configured, and launch
>>> jobs that are Accumulo-aware (a rough sketch of such a driver appears at
>>> the end of this message).
>>>
>>> Any ideas?
>>>
>>> Thanks,
>>> Marc
>>>
>>
>>
>
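
For what it's worth, here is a rough, untested sketch of the kind of driver
described above, using the Accumulo 1.6 mapreduce bindings. The hostnames,
instance/table names, credentials, and MyMapper are all placeholders rather
than values from this thread, and the cluster-side yarn-site.xml may still
need the classpath entries discussed above:

import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
import org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class RemoteAccumuloJob extends Configured implements Tool {

  // Placeholder mapper; real per-entry logic would go here.
  public static class MyMapper extends Mapper<Key, Value, Text, Mutation> {
  }

  @Override
  public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    // Point the plain-Java client at the remote cluster instead of relying
    // on a wrapper script environment. Hostnames are placeholders.
    conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");
    conf.set("mapreduce.framework.name", "yarn");
    conf.set("yarn.resourcemanager.hostname", "rm.example.com");
    // Plus the yarn.application.classpath setting discussed earlier.

    Job job = Job.getInstance(conf, "accumulo-aware job");
    job.setJarByClass(RemoteAccumuloJob.class);
    job.setMapperClass(MyMapper.class);
    job.setNumReduceTasks(0);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Mutation.class);

    // Accumulo 1.6 mapreduce setup; instance, ZK hosts, user, password and
    // table names are placeholders.
    ClientConfiguration zk = ClientConfiguration.loadDefault()
        .withInstance("myInstance")
        .withZkHosts("zk1.example.com:2181");
    AccumuloInputFormat.setZooKeeperInstance(job, zk);
    AccumuloInputFormat.setConnectorInfo(job, "user", new PasswordToken("secret"));
    AccumuloInputFormat.setInputTableName(job, "input_table");
    AccumuloInputFormat.setScanAuthorizations(job, new Authorizations());
    job.setInputFormatClass(AccumuloInputFormat.class);

    AccumuloOutputFormat.setZooKeeperInstance(job, zk);
    AccumuloOutputFormat.setConnectorInfo(job, "user", new PasswordToken("secret"));
    AccumuloOutputFormat.setDefaultTableName(job, "output_table");
    AccumuloOutputFormat.setCreateTables(job, true);
    job.setOutputFormatClass(AccumuloOutputFormat.class);

    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new RemoteAccumuloJob(), args));
  }
}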
