hadoop-mapreduce-user mailing list archives

From Marco Didonna <m.didonn...@gmail.com>
Subject Re: -libjars?
Date Thu, 15 Sep 2011 08:24:05 GMT
Right now I am still in standalone mode ... I'd like to fix this issue
before starting a cluster on EC2. :)

Thanks for your time

Marco

On 14 September 2011 14:04, Joey Echeverria <joey@cloudera.com> wrote:
> When are you getting the exception? Is it during the setup of your
> job, or after it's running on the cluster?
>
> -Joey
>
> On Wed, Sep 14, 2011 at 4:50 AM, Marco Didonna <m.didonna86@gmail.com> wrote:
>> Hello everyone,
>> sorry to bring this up again but I need some clarification. I wrote a
>> map-reduce application that needs the cloud9 library
>> (https://github.com/lintool/Cloud9). This library is packaged in a jar
>> file and I want to make it available to the whole cluster. So far I
>> have been working in standalone mode and I have unsuccessfully tried
>> to use the -libjars option. I always get ClassNotDefException: the
>> only way I made everything work fine is by copying the cloud9.jar into
>> the hadoop/lib folder.
>> I suppose I cannot do it when using a cluster of N machines since I
>> would have to copy it on the N machines and this approach isn't
>> feasible.
>>
>> Here's how I run the job: "hadoop jar myjob.jar
>> myjob.driver.PreprocessANC -libjars ../umd-hadoop-core/cloud9.jar
>> home/my/pyworkspace/openAnc.xml index/ 10 1"
>>
>> Is there some code that needs to be written in the driver in order to
>> have the darn library added to the "global" classpath? This -libjars
>> option is really poorly documented IMHO.
>>
>> Any help would be very much appreciated ;)
>>
>> Marco Didonna
>>
>> On 17 August 2011 03:57, Anty <anty.rao@gmail.com> wrote:
>>> Thanks very much, Todd. I get it.
>>>
>>>
>>> On Wed, Aug 17, 2011 at 6:23 AM, Todd Lipcon <todd@cloudera.com> wrote:
>>>> Putting files on the classpath doesn't make them accessible to JVM's
>>>> resource loader. If you have dir/foo.properties, then "dir" needs to
>>>> be on the classpath, not "dir/foo.properties". Since the working dir
>>>> of the task is on the classpath, then -files works since it gets the
>>>> properties file into a directory on the classpath.
>>>>
>>>> -Todd
>>>>
>>>> On Mon, Aug 15, 2011 at 8:09 PM, Anty <anty.rao@gmail.com> wrote:
>>>>> Thanks very much for your reply, Todd.
>>>>> I am at a complete loss. I want to ship a configuration file to the
>>>>> cluster to run my mapreduce job.
>>>>>
>>>>> If I use the -libjars option to ship the configuration file, the child
>>>>> JVM launched by the task tracker can't find the configuration file.
>>>>> Curiously, the configuration file is already on the classpath of the
>>>>> child JVM.
>>>>>
>>>>> If I use the -files option to ship the configuration file, the child JVM
>>>>> can find the file.
>>>>> IMO, the difference between -libjars and -files is that -files
>>>>> creates a symbolic link to the configuration file
>>>>> in the current working directory of the child JVM.
>>>>>
>>>>> I dug into the source code, but it's so complicated that I can't figure
>>>>> out the root cause of this.
>>>>> So my question is:
>>>>> with the -libjars option, the configuration file is already on the
>>>>> classpath, so why can't the class loader find the configuration file,
>>>>> while it CAN find a jar shipped with the -libjars option?
>>>>>
>>>>> Any help will be appreciated.
>>>>>
>>>>> On Tue, Aug 16, 2011 at 1:06 AM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>>> Your "driver" is the program that submits the job. The task is the
>>>>>> thing that runs on the cluster. They have separate classpaths.
>>>>>>
>>>>>> Better to ask on the public lists if you want a more in-depth explanation.
>>>>>>
>>>>>> -Todd
>>>>>>
>>>>>> On Mon, Aug 15, 2011 at 9:02 AM, Anty <anty.rao@gmail.com> wrote:
>>>>>>> Hi Todd,
>>>>>>> Would you please explain a little more?
>>>>>>>
>>>>>>> On Sat, Dec 11, 2010 at 2:08 AM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>>>>>
>>>>>>>> You need to put the library jar on your classpath (eg using
>>>>>>>> HADOOP_CLASSPATH) as well. The -libjars will ship it to the cluster
>>>>>>>> and put it on the classpath of your task, but not the classpath of
>>>>>>>> your "driver" code.
>>>>>>>>
>>>>>>> I still can't understand what you mean by "but not the classpath of
>>>>>>> your "driver" code."
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>>
>>>>>>>> -Todd
>>>>>>>>
>>>>>>>> On Thu, Dec 9, 2010 at 10:29 PM, Vipul Pandey <vipandey@gmail.com> wrote:
>>>>>>>> > disclaimer : a newbie!!!
>>>>>>>> > Howdy?
>>>>>>>> > Got a quick question. The -libjars option doesn't seem to work for me
>>>>>>>> > in - pretty much - my first (or maybe second) mapreduce job.
>>>>>>>> > Here's what I'm doing :
>>>>>>>> > $ bin/hadoop jar sherlock.jar somepkg.FindSchoolsJob -libjars
>>>>>>>> >  HStats-1A18.jar input output
>>>>>>>> >
>>>>>>>> > sherlock.jar has my main class (of course) FindSchoolsJob, which runs
>>>>>>>> > just fine by itself till I add a dependency on a class in HStats-1A18.jar.
>>>>>>>> > When I run the above command with -libjars specified - it fails to find
>>>>>>>> > my classes that 'are' inside the HStats jar file.
>>>>>>>> > Exception in thread "main" java.lang.NoClassDefFoundError: com/*****/HAgent
>>>>>>>> > at com.*****.FindSchoolsJob.run(FindSchoolsJob.java:46)
>>>>>>>> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>>>> > at com.******.FindSchoolsJob.main(FindSchoolsJob.java:101)
>>>>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>> > at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>> > at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>>> > Caused by: java.lang.ClassNotFoundException:com/*****/HAgent
>>>>>>>> > at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>>>>>> > at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> > at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>>>>>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>>>>>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>>>>>> > ... 8 more
>>>>>>>> >
>>>>>>>> > My main class is defined as below :
>>>>>>>> > public class FindSchoolsJob extends Configured implements Tool {
>>>>>>>> >   :
>>>>>>>> >   public int run(String[] args) throws Exception {
>>>>>>>> >     :
>>>>>>>> >     :
>>>>>>>> >   }
>>>>>>>> >   :
>>>>>>>> >   public static void main(String[] args) throws Exception {
>>>>>>>> >     int res = ToolRunner.run(new Configuration(), new FindSchoolsJob(), args);
>>>>>>>> >     System.exit(res);
>>>>>>>> >   }
>>>>>>>> > }
>>>>>>>> > Any hint would be highly appreciated.
>>>>>>>> > Thank You!
>>>>>>>> > ~V
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Todd Lipcon
>>>>>>>> Software Engineer, Cloudera
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>> Anty Rao
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>> Anty Rao
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Todd Lipcon
>>>> Software Engineer, Cloudera
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards
>>> Anty Rao
>>>
>>
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
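
Todd's point in the thread above, that the driver and the tasks have separate classpaths, can be checked from inside the driver JVM. The following is only a minimal diagnostic sketch using plain JDK calls: `ClasspathCheck` and `isVisible` are hypothetical names, not part of Hadoop, and it assumes the thread context class loader is the one the launcher hands to the driver.

```java
public class ClasspathCheck {

    /**
     * Returns true if the class with the given binary name (dots, not slashes,
     * e.g. "java.lang.String") can be loaded in the current JVM.
     */
    static boolean isVisible(String className) {
        // Assumption: the context class loader is the one the driver code runs under.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Show what this JVM was started with, then probe each class named on the command line.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        for (String name : args) {
            System.out.println(name + " visible: " + isVisible(name));
        }
    }
}
```

Calling `isVisible` with the name of a library class at the start of the driver's `run` method would show whether the jar reached the driver-side classpath (for example via HADOOP_CLASSPATH, as suggested above); it says nothing about the task-side classpath.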

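Todd's explanation above, that the directory containing foo.properties has to be on the classpath rather than the file itself, can be reproduced outside Hadoop with a plain `URLClassLoader`. This is a sketch under stated assumptions: `ResourceLookupDemo` and `canFind` are hypothetical names, it needs Java 11 or later for `Files.writeString`, and it only imitates the lookup, not how the task tracker builds the child classpath.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ResourceLookupDemo {

    /** Returns true if a loader whose only classpath entry is `entry` can resolve `name`. */
    static boolean canFind(Path entry, String name) throws IOException {
        URL[] urls = { entry.toUri().toURL() };
        // Parent is null so only `entry` (plus the JDK's own classes) is searched.
        try (URLClassLoader loader = new URLClassLoader(urls, null)) {
            return loader.getResource(name) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("conf");
        Path file = Files.writeString(dir.resolve("foo.properties"), "key=value\n");

        // The directory as a classpath entry: the resource is looked up inside it.
        System.out.println("directory entry: " + canFind(dir, "foo.properties"));
        // The file itself as a classpath entry: it is treated like a jar, not searched by name.
        System.out.println("file entry:      " + canFind(file, "foo.properties"));
    }
}
```

The first lookup should succeed and the second should not, which matches the behaviour Anty saw: a non-jar file listed directly on the classpath is not found by the resource loader, while a file placed in a directory that is on the classpath (such as the task's working directory with -files) is.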