hadoop-mapreduce-user mailing list archives

From Henning Blohm <henning.bl...@zfabrik.de>
Subject Re: Too large class path for map reduce jobs
Date Fri, 17 Sep 2010 18:53:40 GMT
Not really. "Anything in Hadoop" was really meant to say just that. 

We want to run tasks with integrated provisioning of everything they
need (using www.z2-environment.eu). Effectively, a Hadoop task loads
the provisioning capability in process and then runs the actual task
implementation as provisioned (from another repository, actually), so
that we do not need a special build process for Hadoop jobs.

However, everything on the class path of the Hadoop task is visible to
the code of the z2 system and to the task implementation, and may
conflict with other code. Specifically, the Java compiler implementation
that is on the Hadoop class path (due to the use of Jasper) conflicts
with the one we use. That's why we would like to run Hadoop tasks
without unnecessary stuff (e.g. Jasper) on the class path.
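
To make that kind of conflict easier to pin down, here is a minimal diagnostic sketch (not part of Hadoop or z2; the class name ClassOrigin and its methods are made up for illustration). Given fully qualified class names, it reports which class path entry each one is loaded from, using only the standard java.security.CodeSource API. It assumes it runs with the same class path as the task in question.

```java
import java.net.URL;
import java.security.CodeSource;
import java.security.ProtectionDomain;

// Hypothetical helper: reports which jar or directory a class is loaded from.
public class ClassOrigin {

    // Returns the code source location of the class, or a note when the JVM
    // does not report one (typically for classes from the bootstrap loader).
    static String originOf(Class<?> type) {
        ProtectionDomain domain = type.getProtectionDomain();
        CodeSource source = (domain == null) ? null : domain.getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        return (location == null) ? "(bootstrap or unknown)" : location.toString();
    }

    // Looks the class up without initializing it and describes its origin.
    static String describe(String className) {
        try {
            Class<?> type = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            return className + " <- " + originOf(type);
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " <- not visible";
        }
    }

    public static void main(String[] args) {
        // Each argument is a fully qualified class name to look up.
        for (String name : args) {
            System.out.println(describe(name));
        }
    }
}
```

Run inside the task JVM (or with an identical class path) and passed, for example, the name of a compiler class you suspect of being duplicated, it shows whether that class comes from a jar under HADOOP_HOME/lib or from your own provisioning.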


On Friday, 17.09.2010, at 16:01 +0000, Allen Wittenauer wrote:

> On Sep 17, 2010, at 4:56 AM, Henning Blohm wrote:
> > When running map reduce tasks in Hadoop I run into classpath issues. Contrary to
> > previous posts, my problem is not that I am missing classes on the Task's class path (we have
> > a perfect solution for that) but rather find too many (e.g. ECJ classes or jetty).
> The fact that you mention:
> > The libs in HADOOP_HOME/lib seem to contain everything needed to run anything in
> > Hadoop which is, I assume, much more than is needed to run a map reduce task.
> hints that your perfect solution is to throw all your custom stuff in lib.  If so, that's
> a huge mistake.  Use distributed cache instead.
