hadoop-common-user mailing list archives

From Dennis Kubes <ku...@apache.org>
Subject Re: External jars revisited.
Date Wed, 10 Oct 2007 13:04:11 GMT

Daniel Wressle wrote:
> Hello Dennis, Ted and Christophe.
> I had, as a precaution, built my jar so that the lib/ directory 
> contained both the actual jars AND the jars unzipped, i.e:
> lib/:
> foo.jar
> foo2.jar
> foo3.jar
> foo1/api/bar/gazonk/foo.class
> foo2/....
> Just to cover as much ground as possible.
> I do include a class in the constructor call to the JobConf, but it is the 
> name of *my* class (in this case, LogParser.class). Glancing at the 
> constructor in the API, I cannot see how to use this constructor to tell 
> the JobConf which additional classes to expect.

By using a class in the JobConf you are essentially telling Hadoop that 
the jar which contains your class is the Hadoop job jar.  But if I 
understand you correctly, your job jar doesn't contain the other jars; 
instead you put them in the lib directory of Hadoop?
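
To make the point about the class argument concrete: given a class, the JVM can report where that class was loaded from, which is the kind of lookup that lets a framework infer a job jar from a single class. The sketch below uses only the Java standard library. `JarLocator` and its methods are hypothetical names for illustration, and this is not Hadoop's actual implementation.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper (not part of Hadoop): maps a class to the place it was loaded from.
final class JarLocator {
    private JarLocator() {}

    /**
     * Returns the jar or directory the class was loaded from,
     * or null when the JVM does not record one (for example, core JDK classes).
     */
    static URL locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    /** Returns the resource path of the class's bytecode, e.g. "java/lang/String.class". */
    static String classResourcePath(Class<?> cls) {
        return cls.getName().replace('.', '/') + ".class";
    }
}
```

Calling `JarLocator.locationOf(LogParser.class)` from code packaged in a jar would return that jar's URL. Note that it says nothing about any other jars the class depends on, which is why the class argument alone cannot tell the JobConf about additional libraries.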

I haven't done it this way, but from what I remember you would need to 
unzip your jar, create a lib directory in your jar, put any referenced 
third-party jars in that lib directory so they are included in your 
jar, then zip it back up and deploy it as a single jar.
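
The repackaging steps above could also be scripted rather than done by hand. The following is a minimal sketch using only `java.util.jar` from the standard library; `JobJarPacker`, `pack`, and the parameter names are hypothetical, and it assumes the layout described above (compiled classes at the root of the jar, third-party jars under `lib/`).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: builds a single job jar with dependencies under lib/.
final class JobJarPacker {
    private JobJarPacker() {}

    static void pack(Path classesDir, List<Path> dependencyJars, Path jobJar) throws IOException {
        // Collect the compiled class files first so the directory walk is closed early.
        List<Path> classFiles;
        try (Stream<Path> walk = Files.walk(classesDir)) {
            classFiles = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(jobJar);
             JarOutputStream jar = new JarOutputStream(out)) {
            // Classes keep their package layout relative to classesDir.
            for (Path file : classFiles) {
                String name = classesDir.relativize(file).toString().replace('\\', '/');
                addEntry(jar, name, file);
            }
            // Each third-party jar is stored unmodified under lib/.
            for (Path dep : dependencyJars) {
                addEntry(jar, "lib/" + dep.getFileName(), dep);
            }
        }
    }

    private static void addEntry(JarOutputStream jar, String name, Path source) throws IOException {
        jar.putNextEntry(new JarEntry(name));
        Files.copy(source, jar);
        jar.closeEntry();
    }
}
```

The dependency jars are copied in as whole files rather than unpacked, so the result is one deployable jar containing a `lib/` directory, matching the description above.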

Of course the new patch does away with the need to do this :)

Dennis Kubes
> Dennis, your patch sounds _very_ interesting. How would I go about 
> acquiring it when it is done? :)
> Thank you for your time and responses.
> /Daniel
> Dennis Kubes wrote:
>> This won't solve your current error, but I should have a revised patch 
>> for HADOOP-1622, which allows multiple resources, including jars, for 
>> hadoop jobs, finished and posted this afternoon.
>> Dennis Kubes
>> Ted Dunning wrote:
>>> Can you show us the lines in your code where you construct the JobConf?
>>> If you don't include a class in that constructor call, then Hadoop doesn't
>>> have enough of a hint to find your jar files.
>>> On 10/8/07 12:03 PM, "Christophe Taton" <christophe.taton@gmail.com> wrote:
>>>> Hi Daniel,
>>>> Can you try to build and run a single jar file which contains all
>>>> required class files directly (i.e. without including jar files inside
>>>> the job jar file)?
>>>> This should prevent classloading problems. If the error still persists,
>>>> then you might suspect other problems.
>>>> Chris
>>>> Daniel Wressle wrote:
>>>>> Hello Hadoopers!
>>>>> I have just recently started using Hadoop and I have a question that
>>>>> has puzzled me for a couple of days now.
>>>>> I have already browsed the mailing list and found some relevant posts,
>>>>> especially
>>>>> http://mail-archives.apache.org/mod_mbox/lucene-hadoop-user/200708.mbox/%3c84ad79bb0708131649x3b94cc18x7a0910090f06a1e7@mail.gmail.com%3e,
>>>>> but the solution eludes me.
>>>>> My Map/Reduce job relies on external jars and I had to modify my ant
>>>>> script to include them in the lib/ directory of my jar file. So far,
>>>>> so good. The job runs without any issues when I run it on my
>>>>> local machine only.
>>>>> However, adding a second machine to the mini-cluster presents the
>>>>> following problem: a NullPointerException is thrown as soon as I
>>>>> call any function within a class I have imported from the external
>>>>> jars. Please note that this only happens on the other machine; the
>>>>> maps on my main machine, which I submit the job from, proceed
>>>>> without any warnings.
>>>>> java.lang.NullPointerException at xxx.xxx.xxx (Unknown Source) is the
>>>>> actual log output from hadoop.
>>>>> My jar file contains all the necessary jars in the lib/ directory. Do
>>>>> I need to place them somewhere else on the slaves in order for my
>>>>> submitted job to be able to use them?
>>>>> Any pointers would be much appreciated.
