hadoop-general mailing list archives

From Caspar MacRae <ear...@gmail.com>
Subject Re: Custom Class Loader for Hadoop M/R jobs?
Date Thu, 15 Apr 2010 15:16:49 GMT
Hi all,

I think the problem(s) lie deeper and should be solved at a more fundamental
level: OSGi (http://www.osgi.org)

This has the *classloader*, *modularity* and *distribution* management
maturity that IMHO Hadoop clearly needs (from what I know, albeit circa
1.9).

It's 10 years old, not headline app-server-tastic, nor flavour of the month
http://java.dzone.com/articles/osgi-feast-or-famine - but that's the point,
this is proven tech.  And it'll eventually be present at the lowest levels of
Java, e.g. Project Jigsaw: http://openjdk.java.net/projects/jigsaw/ which is
a precursor to completing JSR-291

<http://markmail.org/message/5fjx7pzq6kmwagch> A number of people have tried
to introduce OSGi to Hadoop but it seems their efforts *may* have been
ignored by those in the meritocratic circle of power - this is a real shame,
perhaps Jira voting is the way to draw attention to this?  You could vote on
this ticket that's over 2 years old...
https://issues.apache.org/jira/browse/MAPREDUCE-243

OSGi can easily solve the classloading and lifecycle management issues in
Hadoop, and brings a lot more besides.  Can someone please explain to me the
rationale for continuing to ignore such an obvious and elegant solution?


Best regards,
Caspar
<http://techdistrict.kirkk.com/2010/02/26/osgi-devcon-slides/>


On 14 April 2010 19:19, Cooper, Chris <chris.cooper@navteq.com> wrote:

> Scott,
>
> I think the direction of your comments in
> https://issues.apache.org/jira/browse/MAPREDUCE-1700 is spot on.  You
> should be looking at J2EE container class loader hierarchies.  I've attached
> a couple of good links that cover this approach.
>
>
>
> http://www.ibm.com/developerworks/websphere/library/techarticles/0112_deboer/deboer.html
> http://www.objectsource.com/j2eechapters/Ch21-ClassLoaders_and_J2EE.htm
>
> I'm sure Mike and I would both be willing to work with you to contribute a
> solution if you're interested.
>
> Best regards,
>
> CC
>
> -----Original Message-----
> From: Scott Carey [mailto:scott@richrelevance.com]
> Sent: Wednesday, April 14, 2010 1:08 PM
> To: general@hadoop.apache.org
> Subject: Re: Custom Class Loader for Hadoop M/R jobs?
>
> My long term suggestions are in
> https://issues.apache.org/jira/browse/MAPREDUCE-1700.  The framework
> definitely needs to handle this and not place the burden on users, IMO. But
> that won't help you in the short term.
>
> Whether removing or replacing a Hadoop jar is an acceptable option for you
> (or others) in the short term is up to you.  Obviously, it's not a great
> long-term solution, but if you (or someone else) have to make it work ASAP, it
> might be the only option.  In our case, we package our own rpm and have a
> few custom patches to Hadoop so removing one jar is a trivial thing to do in
> the short / medium term.
>
> -Scott
>
> On Apr 14, 2010, at 10:33 AM, Segel, Mike wrote:
>
> > Scott,
> >
> > While that may work as a quick fix, it's not a good long-term solution:
> you run into a problem when you upgrade your Hadoop release and the removed
> jar is restored or, if you replaced the jar, your replacement gets
> overwritten.
> >
> > In this specific instance, the Jackson libraries are not that important
> and they can be replaced.
> > But that doesn't mean this issue won't come up again with a library
> that you can't easily pop out and replace.
> >
> > This is why I'm looking at custom class loading and trying to understand
> what can be accomplished with the methods in the Configuration class.
> >
> > Thx
> >
> > -Mike
> >
> >
> > -----Original Message-----
> > From: Scott Carey [mailto:scott@richrelevance.com]
> > Sent: Wednesday, April 14, 2010 12:02 PM
> > To: general@hadoop.apache.org
> > Subject: Re: Custom Class Loader for Hadoop M/R jobs?
> >
> > Depending on what the dependency is, you might be able to just remove it
> from hadoop's lib directory on your cluster.
> >
> > For me, later versions of Hadoop have jackson-1.0.1 in the lib directory
> and that breaks usage of Avro in an M/R job among other things.  However, the
> feature that uses this library is unimportant to me (configuration dump in
> JSON format) so I just removed the jar.
> >
> > -Scott
> >
> > On Apr 14, 2010, at 6:39 AM, Segel, Mike wrote:
> >
> >> Hi,
> >>
> >> Ok, here's a bit of a bizarre  issue...
> >>
> >> How do you handle class collisions between Hadoop and your M/R job when
> it calls other 3rd party classes?
> >>
> >> An example: Hadoop has an older version of an open source jar in its
> /lib directory. You're interfacing with a 3rd party OS tool that uses a
> later release of the same jar.
> >>
> >> You can modify the classpath, and that might work. But the better way is
> to create a Custom Class Loader. (Non-trivial)
> >>
> >> Looking at the Configuration class, it looks like there are a couple of
> methods that deal with loading a class in to the configuration so that the
> m/r jobs can have access to them on each node.
> >>
> >> Is this the correct intended use, or am I missing something?
> >> Has anyone done something like this?
> >>
> >> Thx
> >>
> >> -Mike
> >>
> >> Michael Segel
> >> Architect,  R&D
> >> NAVTEQ
> >> 425 West Randolph Street
> >> Chicago, IL 60606
> >> (T)  +1 312-780-3432
> >> (C)  +1 312-952-8175
> >> www.navteq.com<http://www.navteq.com/>
> >>
> >>
> >>
> >> The information contained in this communication may be CONFIDENTIAL and
> is intended only for the use of the recipient(s) named above.  If you are
> not the intended recipient, you are hereby notified that any dissemination,
> distribution, or copying of this communication, or any of its contents, is
> strictly prohibited.  If you have received this communication in error,
> please notify the sender and delete/destroy the original message and any
> copy of it from your computer or paper files.
>
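
For readers following the thread: the "Custom Class Loader" Mike asks about,
and the J2EE-style loader hierarchy Chris points to, usually come down to a
child-first (parent-last) loader that consults the job's own jars before
delegating to the framework's classpath. The following is only a minimal
sketch of that idea using plain JDK classes. The names ChildFirstClassLoader
and ChildFirstDemo and the list of "framework" package prefixes are
assumptions made for illustration; none of this is Hadoop code or the
approach proposed in MAPREDUCE-1700.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Child-first ("parent-last") class loader sketch.
 * Hypothetical class for illustration; not part of Hadoop.
 */
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] jobJars, ClassLoader parent) {
        super(jobJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already loaded.
            Class<?> c = findLoadedClass(name);
            if (c == null && !isFrameworkClass(name)) {
                try {
                    // Look in the job's own jars before asking the parent.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the job jars; fall through to the parent.
                }
            }
            if (c == null) {
                // Standard delegation: parent first, then this loader's URLs.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Classes that must always come from the parent (assumed prefix list). */
    private static boolean isFrameworkClass(String name) {
        return name.startsWith("java.")
                || name.startsWith("javax.")
                || name.startsWith("org.apache.hadoop.");
    }
}

class ChildFirstDemo {
    public static void main(String[] args) throws Exception {
        // In practice these would be the URLs of the job's dependency jars.
        URL[] jobJars = new URL[0];
        ClassLoader parent = ChildFirstDemo.class.getClassLoader();
        try (ChildFirstClassLoader loader =
                     new ChildFirstClassLoader(jobJars, parent)) {
            Class<?> c = loader.loadClass("java.lang.String");
            System.out.println(c.getName() + " loaded by " + c.getClassLoader());
        }
    }
}
```

Hadoop's Configuration class does expose setClassLoader(ClassLoader) and
getClassByName(String), which is presumably what Mike is referring to.
Whether handing such a loader to the Configuration is enough to isolate a
job's dependencies inside the task JVMs is not settled anywhere in this
thread, so treat that as an open question rather than a recipe.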
