hadoop-common-user mailing list archives

From Farhan Husain <farhan.hus...@csebuet.org>
Subject Re: Using external library in MapReduce jobs
Date Fri, 23 Apr 2010 17:01:05 GMT
Hello Mike,

I completely agree with you. I think bundling the libraries in the job jar
file is the correct way to go.

Thanks,
Farhan
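[Editor's note: a minimal sketch of the bundling approach discussed above. Hadoop unpacks the job jar on each task node and adds any jars found under its lib/ directory to the task classpath, so external dependencies can travel inside the job jar itself. All file and class names below are hypothetical; zipfile stands in for the jar tool, since a jar is a zip archive.]

```python
import zipfile

def build_job_jar(jar_path, class_files, dependency_jars):
    """Assemble a job jar whose lib/ directory carries external dependencies."""
    with zipfile.ZipFile(jar_path, "w") as jar:
        for name, data in class_files.items():
            jar.writestr(name, data)           # compiled job classes at the root
        for name, data in dependency_jars.items():
            jar.writestr("lib/" + name, data)  # bundled third-party jars

# Hypothetical contents: stand-ins for real .class files and dependency jars.
build_job_jar(
    "myjob.jar",
    {"org/example/MyMapper.class": b"<bytecode>"},
    {"external-lib-1.0.jar": b"<jar bytes>"},
)

with zipfile.ZipFile("myjob.jar") as jar:
    print(sorted(jar.namelist()))
# -> ['lib/external-lib-1.0.jar', 'org/example/MyMapper.class']
```

The resulting jar is submitted as usual; nothing needs to be copied into $HADOOP_HOME/lib or restarted on the cluster.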

On Thu, Apr 22, 2010 at 9:12 PM, Michael Segel <michael_segel@hotmail.com> wrote:

>
>
>
> > Date: Thu, 22 Apr 2010 17:30:13 -0700
> > Subject: Re: Using external library in MapReduce jobs
> > From: alexvk@cloudera.com
> > To: common-user@hadoop.apache.org
> >
> > Sure, you need to place them into $HADOOP_HOME/lib directory on each
> server
> > in the cluster and they will be picked up on the next restart.
> >
> > -- Alex K
> >
>
> While this works, I wouldn't recommend it.
>
> You have to look at it this way... Your external m/r java libs are job
> centric. So every time you want to add jobs that require new external
> libraries, you have to 'bounce' your cloud after pushing the jars. Then
> you also have the issue of java class collisions if the cloud ships a
> different version of the same jar you're using. (We've had this happen to us
> already.)
>
> If you're just testing for a proof of concept, it's one thing, but after the
> proof, you'll need to determine how to correctly push the jars out to each
> node.
>
> In a production environment, constantly bouncing clouds for each new job
> isn't really a good idea.
>
> HTH
>
> -Mike
>
