hadoop-common-user mailing list archives

From Dennis Kubes <ku...@apache.org>
Subject Re: External jars revisited.
Date Wed, 10 Oct 2007 12:59:16 GMT
Ok, patch is submitted.  You can find it attached to HADOOP-1622.

Just download it and apply it to the current 0.15 Hadoop source code.  Then, 
in your hadoop-site.xml file, add the following configuration variable, 
changing it according to your resources:

   Any resources to be included in the mapreduce job.  Resources can be an
   absolute path to a jar file, the name of a jar file on the classpath, 
   the name of a class contained in a jar file on the classpath, or an
   absolute path to a directory.  Resource values are comma separated.  All
   resources will be merged into the job.jar before the job is submitted.
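
For illustration, the entry in hadoop-site.xml might look like the sketch 
below.  This message doesn't give the actual property name, so the name 
"job.resources" and the paths here are placeholders, not what the patch 
really uses:

   <!-- Hypothetical property name; see the HADOOP-1622 patch for the
        real one. Paths are illustrative. -->
   <property>
     <name>job.resources</name>
     <value>/opt/libs/parser.jar,commons-lang.jar,com.example.Util,/opt/res</value>
     <description>Resources to merge into the job.jar before the job
       is submitted.</description>
   </property>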

I already have this running on a development cluster so if you need help 
getting it up and running, shoot me an email and we can figure it out 
through email or IM.

Dennis Kubes

Daniel Wressle wrote:
> Sorry to spam you people to death, but going through the logs I noticed 
> that I can instantiate the external API class without any problem. The 
> NullPointerException is thrown when I try to call a *function* of that 
> instance. So, the classes are recognized, in some way, by Hadoop.
> /D
> Daniel Wressle wrote:
>> Update: desperate to find a solution, I've placed my dependencies on both 
>> computers (in the hadoopxx-x/lib directory), everywhere in my job jar 
>> file, and pretty much everywhere else I can think of that might rectify 
>> my problem :).
>> I am still plagued by the NullPointerException (unknown source) on my 
>> slave machine. The same jar file maps and reduces without any errors 
>> on my main machine (which is running the jobtracker and dfs-master).
>> Somehow I feel I can't be the only person in the world who has used 
>> external APIs (as jar files) with Hadoop on more than one computer? :)
>> Regards,
>> Daniel
>> Daniel Wressle wrote:
>>> Hello Dennis, Ted and Christophe.
>>> I had, as a precaution, built my jar so that the lib/ directory 
>>> contained both the actual jars AND the jars unzipped, i.e.:
>>> lib/:
>>> foo.jar
>>> foo2.jar
>>> foo3.jar
>>> foo1/api/bar/gazonk/foo.class
>>> foo2/....
>>> Just to cover as much ground as possible.
>>> I do include a class in the constructor call to the JobConf, but it is 
>>> the name of *my* class (in this case, LogParser.class). Glancing at 
>>> the constructor in the API, I cannot see how to use it to tell the 
>>> JobConf which additional classes to expect.
>>> Dennis, your patch sounds _very_ interesting. How would I go about 
>>> acquiring it when it is done? :)
>>> Thank you for your time and responses.
>>> /Daniel
>>> Dennis Kubes wrote:
>>>> This won't solve your current error, but I should have a revised 
>>>> patch for HADOOP-1622, which allows multiple resources, including 
>>>> jars, to be added to Hadoop jobs, finished and posted this afternoon.
>>>> Dennis Kubes
>>>> Ted Dunning wrote:
>>>>> Can you show us the lines in your code where you construct the 
>>>>> JobConf?
>>>>> If you don't include a class in that constructor call, then Hadoop 
>>>>> doesn't
>>>>> have enough of a hint to find your jar files.
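>>>>> For reference, a minimal sketch of that pattern, assuming the
>>>>> LogParser class mentioned earlier in the thread (the driver class
>>>>> name here is illustrative):
>>>>>
>>>>>     import org.apache.hadoop.mapred.JobClient;
>>>>>     import org.apache.hadoop.mapred.JobConf;
>>>>>
>>>>>     public class LogParserDriver {
>>>>>       public static void main(String[] args) throws Exception {
>>>>>         // Passing the job's own class lets Hadoop locate the jar
>>>>>         // that contains it and ship that jar to the tasktrackers.
>>>>>         JobConf conf = new JobConf(LogParser.class);
>>>>>         conf.setJobName("log-parse");
>>>>>         JobClient.runJob(conf);
>>>>>       }
>>>>>     }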
>>>>> On 10/8/07 12:03 PM, "Christophe Taton" 
>>>>> <christophe.taton@gmail.com> wrote:
>>>>>> Hi Daniel,
>>>>>> Can you try to build and run a single jar file which contains all
>>>>>> required class files directly (i.e. without including jar files 
>>>>>> inside
>>>>>> the job jar file)?
>>>>>> This should prevent classloading problems. If the error still 
>>>>>> persists,
>>>>>> then you might suspect other problems.
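>>>>>> In an ant build, that might look like the sketch below (the target
>>>>>> and jar names are illustrative, not from this thread); each
>>>>>> zipfileset unpacks a dependency's class files straight into the
>>>>>> job jar instead of nesting the jar under lib/:
>>>>>>
>>>>>>     <target name="job-jar" depends="compile">
>>>>>>       <jar destfile="build/job.jar">
>>>>>>         <fileset dir="build/classes"/>
>>>>>>         <!-- Unpack dependencies to avoid nested-jar classloading. -->
>>>>>>         <zipfileset src="lib/foo.jar"/>
>>>>>>         <zipfileset src="lib/foo2.jar"/>
>>>>>>       </jar>
>>>>>>     </target>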
>>>>>> Chris
>>>>>> Daniel Wressle wrote:
>>>>>>> Hello Hadoopers!
>>>>>>> I have just recently started using Hadoop and I have a question that
>>>>>>> has puzzled me for a couple of days now.
>>>>>>> I have already browsed the mailing list and found some relevant posts,
>>>>>>> especially
>>>>>>> http://mail-archives.apache.org/mod_mbox/lucene-hadoop-user/200708.mbox/%3c84ad79bb0708131649x3b94cc18x7a0910090f06a1e7@mail.gmail.com%3e,
>>>>>>> but the solution eludes me.
>>>>>>> My Map/Reduce job relies on external jars and I had to modify my ant
>>>>>>> script to include them in the lib/ directory of my jar file. So far,
>>>>>>> so good. The job runs without any issues when I issue the job on my
>>>>>>> local machine only.
>>>>>>> However, adding a second machine to the mini-cluster presents the
>>>>>>> following problem: a NullPointerException being thrown as soon as I
>>>>>>> call any function within a class I have imported from the external
>>>>>>> jars. Please note that this will only happen on the other machine;
>>>>>>> the maps on my main machine, which I submit the job on, will proceed
>>>>>>> without any warnings.
>>>>>>> java.lang.NullPointerException at xxx.xxx.xxx (Unknown Source) is the
>>>>>>> actual log output from Hadoop.
>>>>>>> My jar file contains all the necessary jars in the lib/ directory. Do
>>>>>>> I need to place them somewhere else on the slaves in order for the
>>>>>>> submitted job to be able to use them?
>>>>>>> Any pointers would be much appreciated.
