hadoop-mapreduce-user mailing list archives

From Eyal Golan <egola...@gmail.com>
Subject Re: Is it possible to user hadoop archive to specify third party libs
Date Sat, 07 Jan 2012 15:53:50 GMT
yes :)

thanks.


Eyal Golan
egolan74@gmail.com

Visit: http://jvdrums.sourceforge.net/
LinkedIn: http://www.linkedin.com/in/egolan74
Skype: egolan74

Save a tree. Please don't print this e-mail unless it's really necessary.



On Sat, Jan 7, 2012 at 2:58 PM, Bejoy Ks <bejoy.hadoop@gmail.com> wrote:

> Eyal,
> I hope this is the link you are looking for:
>
> http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
>
> Regards
> Bejoy.K.S
>
>
>
> On Sat, Jan 7, 2012 at 12:25 PM, Eyal Golan <egolan74@gmail.com> wrote:
>
>> Hi,
>> can you please point me to the link to Cloudera's article?
>>
>> thanks,
>>
>> Eyal
>>
>>
>> Eyal Golan
>> egolan74@gmail.com
>>
>> Visit: http://jvdrums.sourceforge.net/
>> LinkedIn: http://www.linkedin.com/in/egolan74
>> Skype: egolan74
>>
>> Save a tree. Please don't print this e-mail unless it's really
>> necessary.
>>
>>
>>
>> On Tue, Jan 3, 2012 at 5:28 PM, Samir Eljazovic <
>> samir.eljazovic@gmail.com> wrote:
>>
>>> Hi,
>>> yes, I'm trying to get option 1 from Cloudera's article (using the
>>> distributed cache) to work. If I specify all libraries individually
>>> when running the job, it works, but I'm trying to make it work using a
>>> single archive file containing all the native libraries I need, and
>>> that seems to be the problem.
>>>
>>> When I use a tar file, the libraries are extracted but they are not
>>> added to the classpath.
>>>
>>> Here's the TaskTracker (TT) log:
>>>
>>> 2012-01-03 15:04:43,611 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (Thread-447): Creating openCV.tar in /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/distcache/8087259939901130551_1003999143_605667452/10.190.207.247/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar-work--7133799918421346652 with rwxr-xr-x
>>> 2012-01-03 15:04:44,209 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (Thread-447): Extracting /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/distcache/8087259939901130551_1003999143_605667452/10.190.207.247/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar-work--7133799918421346652/openCV.tar to /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/distcache/8087259939901130551_1003999143_605667452/10.190.207.247/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar-work--7133799918421346652
>>> 2012-01-03 15:04:44,363 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager (Thread-447): Cached hdfs://10.190.207.247:9000/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar#openCV.tar as /mnt/var/lib/hadoop/mapred/taskTracker/hadoop/distcache/8087259939901130551_1003999143_605667452/10.190.207.247/mnt/var/lib/hadoop/tmp/mapred/staging/hadoop/.staging/job_201201031358_0008/archives/openCV.tar
>>>
>>> What should I do to get these libs available to my job?
>>>
>>> Thanks
>>>
>>>
>>> On 3 January 2012 15:57, Praveen Sripati <praveensripati@gmail.com> wrote:
>>>
>>>> Check this article from Cloudera for different options.
>>>>
>>>>
>>>> http://www.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/
>>>>
>>>> Praveen
>>>>
>>>> On Tue, Jan 3, 2012 at 7:41 AM, Harsh J <harsh@cloudera.com> wrote:
>>>>
>>>>> Samir,
>>>>>
>>>>> I believe HARs won't work there. But you can use a regular tar
>>>>> instead, and that should be unpacked properly.
>>>>>
>>>>> On 03-Jan-2012, at 5:38 AM, Samir Eljazovic wrote:
>>>>>
>>>>> > Hi,
>>>>> > I need to provide a lot of third-party libraries (both Java and
>>>>> > native), and doing that using the generic option parser (-libjars
>>>>> > and -files arguments) is a little bit messy. I was wondering if it
>>>>> > is possible to wrap all libraries into a single HAR archive and use
>>>>> > that when submitting the job?
>>>>> >
>>>>> > Just to mention that I want to avoid putting all libraries into
>>>>> > the job jar for two reasons:
>>>>> > 1. it does not work for native libs
>>>>> > 2. it takes time to upload the jar
>>>>> >
>>>>> > Thanks,
>>>>> > Samir
>>>>>
>>>>>
>>>>
>>>
>>
>
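
The original question calls passing many libraries through -libjars and -files "a little bit messy". One way to tame that, sketched below, is to generate the two comma-separated lists from a directory instead of typing them by hand. This is only an illustration and not something proposed in the thread: the function name build_generic_options and the rule "*.jar goes to -libjars, *.so (and versioned *.so.N) goes to -files" are assumptions, and the sketch only builds the arguments; it does not address whether native libraries end up on the task's library path.

```python
import os


def build_generic_options(lib_dir):
    """Build -libjars / -files arguments from the files in lib_dir.

    Hypothetical helper: assumes Java libraries end in .jar and native
    libraries are shared objects (.so, or versioned names like .so.2.3).
    Both options take a comma-separated list of paths.
    """
    jars, natives = [], []
    # Sort for a stable, reproducible argument order.
    for name in sorted(os.listdir(lib_dir)):
        path = os.path.join(lib_dir, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories and other non-files
        if name.endswith(".jar"):
            jars.append(path)
        elif name.endswith(".so") or ".so." in name:
            natives.append(path)

    args = []
    if jars:
        args += ["-libjars", ",".join(jars)]
    if natives:
        args += ["-files", ",".join(natives)]
    return args
```

The returned list can be spliced into the submit command after the main class name, for example by a small wrapper script. This only has an effect if the job actually parses generic options (for instance by running through ToolRunner), which is the same precondition as passing -libjars and -files by hand.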
