hadoop-mapreduce-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: -libjars?
Date Wed, 14 Sep 2011 12:04:42 GMT
When are you getting the exception? Is it during the setup of your
job, or after it's running on the cluster?

-Joey

On Wed, Sep 14, 2011 at 4:50 AM, Marco Didonna <m.didonna86@gmail.com> wrote:
> Hello everyone,
> sorry to bring this up again, but I need some clarification. I wrote a
> map-reduce application that needs the Cloud9 library
> (https://github.com/lintool/Cloud9). This library is packaged in a jar
> file and I want to make it available to the whole cluster. So far I
> have been working in standalone mode and I have unsuccessfully tried
> to use the -libjars option. I always get ClassNotDefException: the
> only way I made everything work was by copying cloud9.jar into the
> hadoop/lib folder.
> I suppose I cannot do that on a cluster of N machines, since I
> would have to copy it to all N machines, and that approach isn't
> feasible.
>
> Here's how I perform the job "hadoop jar myjob.jar
> myjob.driver.PreprocessANC -libjars ../umd-hadoop-core/cloud9.jar
> home/my/pyworkspace/openAnc.xml index/ 10 1"
>
> Is there some code that needs to be written in the driver in order to
> have the darn library added to the "global" classpath? This -libjars
> option is really poorly documented IMHO.
>
> Any help would be very much appreciated ;)
>
> Marco Didonna
>
> On 17 August 2011 03:57, Anty <anty.rao@gmail.com> wrote:
>> Thanks very much, Todd. I get it.
>>
>>
>> On Wed, Aug 17, 2011 at 6:23 AM, Todd Lipcon <todd@cloudera.com> wrote:
>>> Putting files on the classpath doesn't make them accessible to the JVM's
>>> resource loader. If you have dir/foo.properties, then "dir" needs to
>>> be on the classpath, not "dir/foo.properties". Since the working dir
>>> of the task is on the classpath, -files works because it gets the
>>> properties file into a directory on the classpath.
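
The directory-versus-file point above can be reproduced outside Hadoop. The following is a minimal sketch, not code from this thread: the class name ResourceLookupDemo and the helper isVisible are made up for illustration, and a plain URLClassLoader stands in for the task's class loader. It builds a loader with a single classpath entry and asks it for foo.properties, once with the directory as the entry and once with the file itself.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ResourceLookupDemo {

    /** Returns true if a loader whose only classpath entry is `entry` can find `resourceName`. */
    public static boolean isVisible(Path entry, String resourceName) throws IOException {
        // For an existing directory, toUri() yields a URL ending in "/",
        // which URLClassLoader treats as a directory entry. Any other URL
        // is treated as a jar file.
        URL[] classpath = { entry.toUri().toURL() };
        // Null parent: the application classpath is not consulted,
        // so only the single entry above is searched.
        try (URLClassLoader loader = new URLClassLoader(classpath, null)) {
            return loader.getResource(resourceName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Illustrative temporary layout: <tmp>/foo.properties
        Path dir = Files.createTempDirectory("dir");
        Path props = Files.write(dir.resolve("foo.properties"), "key=value\n".getBytes());

        System.out.println("directory as classpath entry: " + isVisible(dir, "foo.properties"));
        System.out.println("file as classpath entry:      " + isVisible(props, "foo.properties"));
    }
}
```

With the directory as the entry the resource should be found; with the properties file itself as the entry it should not, because the loader tries to open that entry as a jar.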
>>>
>>> -Todd
>>>
>>> On Mon, Aug 15, 2011 at 8:09 PM, Anty <anty.rao@gmail.com> wrote:
>>>> Thanks very much for your reply, Todd.
>>>> I am at a complete loss. I want to ship a configuration file to the
>>>> cluster to run my mapreduce job.
>>>>
>>>> If I use the -libjars option to ship the configuration file, the child
>>>> JVM launched by the task tracker can't find the configuration file.
>>>> Curiously, the configuration file is already on the classpath of the
>>>> child JVM.
>>>>
>>>> If I use the -files option to ship the configuration file, the child JVM
>>>> can find the file.
>>>> IMO, the difference between -libjars and -files is that -files
>>>> creates a symbolic link to the configuration file in the current
>>>> working directory of the child JVM.
>>>>
>>>> I dug into the source code, but it's so complicated that I can't figure
>>>> out the root cause of this.
>>>> So my question is:
>>>> with the -libjars option, the configuration file is already on the
>>>> classpath, so why can't the class loader find the configuration file,
>>>> while the JVM class loader CAN find a jar shipped with the -libjars option?
>>>>
>>>> Any help will be appreciated.
>>>>
>>>> On Tue, Aug 16, 2011 at 1:06 AM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>> Your "driver" is the program that submits the job. The task is the
>>>>> thing that runs on the cluster. They have separate classpaths.
>>>>>
>>>>> Better to ask on the public lists if you want a more in-depth explanation.
>>>>>
>>>>> -Todd
>>>>>
>>>>> On Mon, Aug 15, 2011 at 9:02 AM, Anty <anty.rao@gmail.com> wrote:
>>>>>> Hi Todd,
>>>>>> Would you please explain a little more?
>>>>>>
>>>>>> On Sat, Dec 11, 2010 at 2:08 AM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>>>>
>>>>>>> You need to put the library jar on your classpath (e.g. using
>>>>>>> HADOOP_CLASSPATH) as well. The -libjars option will ship it to the
>>>>>>> cluster and put it on the classpath of your task, but not the
>>>>>>> classpath of your "driver" code.
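
One way to tell on which side a class is missing is to test for it in the driver before the job is submitted. The following is a minimal sketch, not code from this thread: the class name DriverClasspathCheck and the helper isLoadable are hypothetical, and it uses only the standard Class.forName call. It reports whether a class can be loaded in the JVM that runs the check.

```java
public class DriverClasspathCheck {

    /** Returns true if `className` can be loaded by this class's own class loader. */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, DriverClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not visible (or not linkable) from this JVM's classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical usage: pass the fully qualified name of a class from the library jar.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(className + " loadable in this JVM: " + isLoadable(className));
    }
}
```

If a check like this fails in the driver for a class that lives in the library jar, then by the explanation above the jar is missing from the driver's classpath, and passing -libjars alone would not be expected to change that.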
>>>>>>>
>>>>>> I still can't understand what you mean by "but not the classpath of
>>>>>> your "driver" code."
>>>>>>
>>>>>> Thanks in advance.
>>>>>>
>>>>>>
>>>>>>> -Todd
>>>>>>>
>>>>>>> On Thu, Dec 9, 2010 at 10:29 PM, Vipul Pandey <vipandey@gmail.com> wrote:
>>>>>>> > disclaimer: a newbie!!!
>>>>>>> > Howdy?
>>>>>>> > Got a quick question. The -libjars option doesn't seem to work for me
>>>>>>> > in pretty much my first (or maybe second) mapreduce job.
>>>>>>> > Here's what I'm doing:
>>>>>>> > $bin/hadoop jar  sherlock.jar somepkg.FindSchoolsJob -libjars
>>>>>>> >  HStats-1A18.jar input output
>>>>>>> >
>>>>>>> > sherlock.jar has my main class (of course), FindSchoolsJob, which runs
>>>>>>> > just fine by itself until I add a dependency on a class in
>>>>>>> > HStats-1A18.jar.
>>>>>>> > When I run the above command with -libjars specified, it fails to find
>>>>>>> > my classes that 'are' inside the HStats jar file.
>>>>>>> > Exception in thread "main" java.lang.NoClassDefFoundError: com/*****/HAgent
>>>>>>> >     at com.*****.FindSchoolsJob.run(FindSchoolsJob.java:46)
>>>>>>> >     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>>> >     at com.******.FindSchoolsJob.main(FindSchoolsJob.java:101)
>>>>>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>> >     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>> > Caused by: java.lang.ClassNotFoundException: com/*****/HAgent
>>>>>>> >     at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>>>>> >     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>> >     at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>>>>> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>>>>> >     at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>>>>> >     ... 8 more
>>>>>>> >
>>>>>>> > My main class is defined as below:
>>>>>>> > public class FindSchoolsJob extends Configured implements Tool {
>>>>>>> >   :
>>>>>>> >   public int run(String[] args) throws Exception {
>>>>>>> >     :
>>>>>>> >     :
>>>>>>> >   }
>>>>>>> >   :
>>>>>>> >   public static void main(String[] args) throws Exception {
>>>>>>> >     int res = ToolRunner.run(new Configuration(), new FindSchoolsJob(), args);
>>>>>>> >     System.exit(res);
>>>>>>> >   }
>>>>>>> > }
>>>>>>> > Any hint would be highly appreciated.
>>>>>>> > Thank You!
>>>>>>> > ~V
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Todd Lipcon
>>>>>>> Software Engineer, Cloudera
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards
>>>>>> Anty Rao
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Todd Lipcon
>>>>> Software Engineer, Cloudera
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards
>>>> Anty Rao
>>>>
>>>
>>>
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>>
>>
>>
>>
>> --
>> Best Regards
>> Anty Rao
>>
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
