Date: Thu, 15 Sep 2011 07:44:42 -0400
Subject: Re: -libjars?
From: Joey Echeverria
To: mapreduce-user@hadoop.apache.org

Ok, but does the job even start the maps, or does it fail during initial
setup?

The reason I ask is that -libjars only adds the jar to the classpath of the
mappers and reducers. If you need the class before the job is submitted
to the cluster, you should do something like this:

HADOOP_CLASSPATH=../umd-hadoop-core/cloud9.jar hadoop jar myjob.jar \
    myjob.driver.PreprocessANC -libjars ../umd-hadoop-core/cloud9.jar \
    home/my/pyworkspace/openAnc.xml index/ 10 1

-Joey

On Thu, Sep 15, 2011 at 4:24 AM, Marco Didonna wrote:
> Right now I am still in standalone mode ... I'd like to fix this issue
> before starting a cluster on EC2. :)
>
> Thanks for your time
>
> Marco
>
> On 14 September 2011 14:04, Joey Echeverria wrote:
>> When are you getting the exception? Is it during the setup of your
>> job, or after it's running on the cluster?
>>
>> -Joey
>>
>> On Wed, Sep 14, 2011 at 4:50 AM, Marco Didonna wrote:
>>> Hello everyone,
>>> sorry to bring this up again but I need some clarification. I wrote a
>>> map-reduce application that needs the Cloud9 library
>>> (https://github.com/lintool/Cloud9). This library is packaged in a jar
>>> file and I want to make it available to the whole cluster. So far I
>>> have been working in standalone mode and I have unsuccessfully tried
>>> to use the -libjars option. I always get ClassNotDefException: the
>>> only way I made everything work is by copying cloud9.jar into the
>>> hadoop/lib folder.
>>> I suppose I cannot do that when using a cluster of N machines, since I
>>> would have to copy it onto all N machines and this approach isn't
>>> feasible.
>>>
>>> Here's how I run the job: "hadoop jar myjob.jar
>>> myjob.driver.PreprocessANC -libjars ../umd-hadoop-core/cloud9.jar
>>> home/my/pyworkspace/openAnc.xml index/ 10 1"
>>>
>>> Is there some code that needs to be written in the driver in order to
>>> have the darn library added to the "global" classpath? This -libjars
>>> option is really poorly documented IMHO.
>>>
>>> Any help would be very much appreciated ;)
>>>
>>> Marco Didonna
>>>
>>> On 17 August 2011 03:57, Anty wrote:
>>>> Thanks very much, Todd. I get it.
>>>>
>>>>
>>>> On Wed, Aug 17, 2011 at 6:23 AM, Todd Lipcon wrote:
>>>>> Putting files on the classpath doesn't make them accessible to the JVM's
>>>>> resource loader. If you have dir/foo.properties, then "dir" needs to
>>>>> be on the classpath, not "dir/foo.properties". Since the working dir
>>>>> of the task is on the classpath, -files works because it gets the
>>>>> properties file into a directory on the classpath.
>>>>>
>>>>> -Todd
>>>>>
>>>>> On Mon, Aug 15, 2011 at 8:09 PM, Anty wrote:
>>>>>> Thanks very much for your reply, Todd.
>>>>>> I am at a complete loss. I want to ship a configuration file to the
>>>>>> cluster to run my mapreduce job.
>>>>>>
>>>>>> If I use the -libjars option to ship the configuration file, the
>>>>>> child JVM launched by the task tracker can't find the configuration
>>>>>> file. Curiously, the configuration file is already on the classpath
>>>>>> of the child JVM.
>>>>>>
>>>>>> If I use the -files option to ship the configuration file, the child
>>>>>> JVM can find the file.
>>>>>> IMO, the difference between -libjars and -files is that -files
>>>>>> creates a symbolic link to the configuration file in the current
>>>>>> working directory of the child JVM.
>>>>>>
>>>>>> I dug into the source code, but it's so complicated that I can't
>>>>>> figure out the root cause of this.
>>>>>> So my question is:
>>>>>> with the -libjars option, the configuration file is already on the
>>>>>> classpath, so why can't the class loader find the configuration file,
>>>>>> when the JVM class loader CAN find a jar shipped with -libjars?
>>>>>>
>>>>>> Any help will be appreciated.
>>>>>>
>>>>>> On Tue, Aug 16, 2011 at 1:06 AM, Todd Lipcon wrote:
>>>>>>> Your "driver" is the program that submits the job. The task is the
>>>>>>> thing that runs on the cluster.
>>>>>>> They have separate classpaths.
>>>>>>>
>>>>>>> Better to ask on the public lists if you want a more in-depth
>>>>>>> explanation.
>>>>>>>
>>>>>>> -Todd
>>>>>>>
>>>>>>> On Mon, Aug 15, 2011 at 9:02 AM, Anty wrote:
>>>>>>>> Hi Todd,
>>>>>>>> Would you please explain a little more?
>>>>>>>>
>>>>>>>> On Sat, Dec 11, 2010 at 2:08 AM, Todd Lipcon wrote:
>>>>>>>>>
>>>>>>>>> You need to put the library jar on your classpath (e.g. using
>>>>>>>>> HADOOP_CLASSPATH) as well. The -libjars will ship it to the cluster
>>>>>>>>> and put it on the classpath of your task, but not the classpath of
>>>>>>>>> your "driver" code.
>>>>>>>>>
>>>>>>>> I still can't understand what you mean by "but not the classpath of
>>>>>>>> your "driver" code."
>>>>>>>>
>>>>>>>> Thanks in advance.
>>>>>>>>
>>>>>>>>
>>>>>>>>> -Todd
>>>>>>>>>
>>>>>>>>> On Thu, Dec 9, 2010 at 10:29 PM, Vipul Pandey wrote:
>>>>>>>>> > disclaimer : a newbie!!!
>>>>>>>>> > Howdy?
>>>>>>>>> > Got a quick question. The -libjars option doesn't seem to work for
>>>>>>>>> > me in pretty much my first (or maybe second) mapreduce job.
>>>>>>>>> > Here's what I'm doing:
>>>>>>>>> > $bin/hadoop jar sherlock.jar somepkg.FindSchoolsJob -libjars
>>>>>>>>> > HStats-1A18.jar input output
>>>>>>>>> >
>>>>>>>>> > sherlock.jar has my main class (of course) FindSchoolsJob, which
>>>>>>>>> > runs just fine by itself until I add a dependency on a class in
>>>>>>>>> > HStats-1A18.jar.
>>>>>>>>> > When I run the above command with -libjars specified, it fails to
>>>>>>>>> > find my classes that 'are' inside the HStats jar file.
>>>>>>>>> > Exception in thread "main" java.lang.NoClassDefFoundError:
>>>>>>>>> > com/*****/HAgent
>>>>>>>>> > at com.*****.FindSchoolsJob.run(FindSchoolsJob.java:46)
>>>>>>>>> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>>>>> > at com.******.FindSchoolsJob.main(FindSchoolsJob.java:101)
>>>>>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>>> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>> > at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>>> > at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>>>>> > Caused by: java.lang.ClassNotFoundException:com/*****/HAgent
>>>>>>>>> > at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>>>>>>>>> > at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>> > at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>>>>>>>>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>>>>>>>>> > at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>>>>>>>>> > ... 8 more
>>>>>>>>> >
>>>>>>>>> > My main class is defined as below :
>>>>>>>>> > public class FindSchoolsJob extends Configured implements Tool {
>>>>>>>>> >     :
>>>>>>>>> >     public int run(String[] args) throws Exception {
>>>>>>>>> >         :
>>>>>>>>> >         :
>>>>>>>>> >     }
>>>>>>>>> >     :
>>>>>>>>> >     public static void main(String[] args) throws Exception {
>>>>>>>>> >         int res = ToolRunner.run(new Configuration(), new FindSchoolsJob(), args);
>>>>>>>>> >         System.exit(res);
>>>>>>>>> >     }
>>>>>>>>> > }
>>>>>>>>> > Any hint would be highly appreciated.
>>>>>>>>> > Thank You!
>>>>>>>>> > ~V
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Todd Lipcon
>>>>>>>>> Software Engineer, Cloudera
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best Regards
>>>>>>>> Anty Rao
>>>>>>>>

-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
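
A small illustration of Todd's point quoted above, that "dir" needs to be on the classpath rather than "dir/foo.properties". This is only a sketch using the plain JDK, not Hadoop's task launcher: the class name ResourceLookupDemo and the helper canFind are made up for the example, and it assumes the task's class loader behaves like a standard URLClassLoader. It builds a loader whose single classpath entry is first the properties file itself and then its parent directory, and asks each loader for the resource by name.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; nothing here is part of Hadoop.
public class ResourceLookupDemo {

    /** Returns true if a loader whose only classpath entry is `entry` can find `name`. */
    static boolean canFind(Path entry, String name) throws IOException {
        // toUri() ends with "/" for an existing directory, so URLClassLoader
        // treats it as a directory entry; any other URL is treated as a jar.
        URL[] urls = { entry.toUri().toURL() };
        // Parent is null so the JVM's own classpath cannot satisfy the lookup.
        try (URLClassLoader loader = new URLClassLoader(urls, null)) {
            return loader.getResource(name) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("conf");
        Path file = Files.write(dir.resolve("foo.properties"), "key=value\n".getBytes());

        // Entry is the file itself: the loader tries to read it as a jar.
        System.out.println("file as classpath entry: " + canFind(file, "foo.properties"));
        // Entry is the containing directory: the file is a resource inside it.
        System.out.println("dir as classpath entry:  " + canFind(dir, "foo.properties"));
    }
}
```

If the assumption holds, only the second lookup should succeed, which matches what Anty saw: a file shipped with -libjars sits on the classpath as an entry of its own, while -files places it in the task's working directory, which is itself on the classpath.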