Subject: Re: Too large class path for map reduce jobs
From: Henning Blohm
To: mapreduce-user@hadoop.apache.org
Date: Fri, 08 Oct 2010 09:52:53 +0200

Ahh... that could indeed be the case. Yes, my issue was about "large" rather than "long".

Thanks for clarifying!

Henning

On Thu, 2010-10-07 at 13:27 -0700, Tom White wrote:
I wonder if there is a misunderstanding here - the problem is that the
classpath has too many classes on it (and clashes with user classes),
rather than it being a text string which is too long.

I would suggest that the technical discussion of how to fix this goes
onto the JIRA.

Cheers,
Tom

On Thu, Oct 7, 2010 at 1:23 AM, Alejandro Abdelnur <tucu@cloudera.com> wrote:
> Well, if the issue is a too-long classpath, the softlink thingy will give
> some room to breathe, as the total CP length will be much smaller.
>
> A
> On Thu, Oct 7, 2010 at 3:43 PM, Henning Blohm <henning.blohm@zfabrik.de>
> wrote:
>>
>> So that's actually another issue, right? Besides splitting the classpath
>> into those three groups, you want the TT to create soft-links on demand to
>> simplify the computation of the classpath string. Is that right?
>>
>> But it's the TT that actually starts the job VM. Why does it matter what
>> the string actually looks like, as long as it has the right content?
>>
>> Thanks,
>>   Henning
>>
>> On Thu, 2010-10-07 at 13:22 +0800, Alejandro Abdelnur wrote:
>>
>> [sent too soon]
>>
>> The first CP shown is how the CP of a task looks today. If we change it to
>> pick up all the job JARs from the current dir, then the classpath will be
>> much shorter (second CP shown). We can easily achieve this by soft-linking
>> the job JARs into the work dir of the task.
>>
>> Alejandro
>>
>> On Thu, Oct 7, 2010 at 1:02 PM, Alejandro Abdelnur <tucu@cloudera.com>
>> wrote:
>>
>> Fragmentation of Hadoop classpaths is another issue: Hadoop should
>> differentiate the CP into 3:
>>
>> 1. client CP: what is needed to submit a job (only the nachos)
>>
>> 2. server CP (JT/NN/TT/DN): what is needed to run the cluster (the whole
>> enchilada)
>>
>> 3. job CP: what is needed to run a job (some of the enchilada)
>>
>>
>> But I'm not trying to get into that here. What I'm suggesting is:
>>
>>
>>
>> -----
>>
>> # Hadoop JARs:
>>
>> /Users/tucu/dev-apps/hadoop/conf
>>
>>
>> /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home/lib/tools.jar
>>
>> /Users/tucu/dev-apps/hadoop/bin/..
>>
>> /Users/tucu/dev-apps/hadoop/bin/../hadoop-core-0.20.3-CDH3-SNAPSHOT.jar
>>
>> /Users/tucu/dev-apps/hadoop/bin/../lib/aspectjrt-1.6.5.jar
>>
>> ..... (about 30 jars from hadoop lib/ )
>>
>> /Users/tucu/dev-apps/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
>>
>> # Job JARs (for a job with only 2 JARs):
>>
>>
>> /Users/tucu/dev-apps/hadoop/dirs/mapred/taskTracker/distcache/-2707763075630339038_639898034_1993697040/localhost/user/tucu/oozie-tucu/0000003-101004184132247-oozie-tucu-W/java-node--java/java-launcher.jar
>>
>>
>> /Users/tucu/dev-apps/hadoop/dirs/mapred/taskTracker/distcache/3613772770922728555_-588832047_1993624983/localhost/user/tucu/examples/apps/java-main/lib/oozie-examples-2.2.1-CDH3B3-SNAPSHOT.jar
>>
>>
>> /Users/tucu/dev-apps/hadoop/dirs/mapred/taskTracker/tucu/jobcache/job_201010041326_0058/attempt_201010041326_0058_m_000000_0/work
>>
>> -----
>>
>>
>>
>> What I'm suggesting is that the latter group, the job JARs, be soft-linked
>> (by the TT) into the working directory; then their classpath is just:
>>
>> -----
>>
>> java-launcher.jar
>>
>> oozie-examples-2.2.1-CDH3B3-SNAPSHOT.jar
>>
>> .
>>
>> -----
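
As a rough sketch of the TT-side change being proposed (illustrative only; the
method and the paths are made up, not actual TaskTracker code), soft-linking the
job JARs into the task work dir and emitting bare JAR names could look like this:

-----
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class LinkJobJars {

  // Soft-link each job JAR into the task work dir and return a classpath
  // made of bare JAR names plus "." for the work dir itself.
  static String linkAndBuildClasspath(Path workDir, List<Path> jobJars)
      throws Exception {
    List<String> entries = new ArrayList<String>();
    for (Path jar : jobJars) {
      Path link = workDir.resolve(jar.getFileName());
      if (!Files.exists(link)) {
        // e.g. work/java-launcher.jar -> .../distcache/.../java-launcher.jar
        Files.createSymbolicLink(link, jar);
      }
      entries.add(jar.getFileName().toString());
    }
    entries.add(".");  // keep the work dir itself on the CP
    return String.join(File.pathSeparator, entries);
  }
}
-----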
>>
>>
>>
>>
>> Alejandro
>>
>> On Wed, Oct 6, 2010 at 7:57 PM, Henning Blohm <henning.blohm@zfabrik.de>
>> wrote:
>>
>> Hi Alejandro,
>>
>>    yes, it can of course be done right (sorry if my wording seemed to
>> imply otherwise). Just saying that I think that Hadoop M/R should not go
>> into that class loader / module separation business. It's one Job, one VM,
>> right? So the problem is to assign just the stuff needed to let the Job do
>> its business without becoming an obstacle.
>>
>>   Must admit I didn't understand your proposal 2. How would that remove
>> (e.g.) jetty libs from the job's classpath?
>>
>> Thanks,
>>   Henning
>>
>> Am Mittwoch, den 06.10.2010, 18:28 +0800 schrieb Alejandro Abdelnur:
>>
>> 1. Classloader business can be done right. Actually it could be done as
>> spec-ed for servlet web-apps.
>>
>>
>> 2. If the issue is strictly 'too large a classpath', then a simpler solution
>> would be to soft-link all JARs into the current directory and create the
>> classpath with the JAR names only (no path). Note that the soft-linking
>> business is already supported by the DistributedCache. So the changes would
>> be mostly in the TT, to create the JAR-names-only classpath before starting
>> the child.
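
The soft-linking support mentioned here can be illustrated with a minimal sketch
(the HDFS path is hypothetical): the fragment after '#' names the soft-link that
the framework creates in the task's working directory.

-----
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;

public class CacheSymlinkExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Ask the framework to soft-link cached files into the task's work dir.
    DistributedCache.createSymlink(conf);
    // "mylib.jar" (the part after '#') becomes a link in the work dir, so the
    // short name alone is enough as a classpath entry.
    DistributedCache.addCacheFile(
        new URI("hdfs://namenode/user/tucu/lib/mylib.jar#mylib.jar"), conf);
  }
}
-----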
>>
>>
>> Alejandro
>>
>>
>> On Wed, Oct 6, 2010 at 5:57 PM, Henning Blohm <henning.blohm@zfabrik.de>
>> wrote:
>>
>> Hi Tom,
>>
>>   that's exactly it. Thanks! I don't think that I can comment on the
>> issues in Jira so I will do it here.
>>
>>   Tricks with class paths and deviating from the default class loading
>> delegation have never been anything but short-term relief. Fixing things
>> by imposing a "better" order of stuff on the class path will not work when
>> people actually do use child loaders (as the parent wins) - like we do. Also
>> it may easily lead to very confusing situations, because an earlier part of
>> the class path is incomplete and picks up other stuff from a later part, etc.
>> etc.... no good.
>>
>>   Child loaders are good for module separation but should not be used to
>> "hide" type visibility from the parent. That almost certainly leads to class
>> loader constraint violations - once you lose control (which is usually earlier
>> than expected).
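
For illustration, a minimal child loader of the kind meant here might look like
the sketch below (the JAR path and class name are hypothetical). With a parent
supplied, the standard URLClassLoader delegation asks the parent first ("the
parent wins"), so types already on the Hadoop classpath cannot be hidden from
the job this way.

-----
import java.net.URL;
import java.net.URLClassLoader;

public class JobChildLoader {
  public static void main(String[] args) throws Exception {
    // JARs visible only to the job's module (hypothetical path).
    URL[] jobJars = { new URL("file:///path/to/job/mylib.jar") };
    ClassLoader parent = JobChildLoader.class.getClassLoader();
    // Default delegation: the parent is consulted before the child's JARs.
    ClassLoader child = new URLClassLoader(jobJars, parent);
    Class<?> main = Class.forName("com.example.JobMain", true, child);
    System.out.println("Loaded " + main.getName() + " via " + main.getClassLoader());
  }
}
-----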
>>
>>   The suggestion to reduce the Job class path to the required minimum is
>> the most practical approach. There is some gray area there of course and it
>> will not be feasible to reach the absolute minimal set of types there - but
>> something reasonable, i.e. the hadoop core that suffices to run the job.
>> Certainly jetty & co are not required for job execution (btw. I "hacked"
>> 0.20.2 to remove anything in "server/" from the classpath before setting the
>> job class path).
>>
>>   I would suggest to
>>
>>   a) introduce some HADOOP_JOB_CLASSPATH var that, if set, is the
>> additional classpath, added to the "core" classpath (as described above). If
>> not set, for compatibility, preserve today's behavior (see the sketch after
>> this list).
>>   b) not get into custom child loaders for jobs as part of Hadoop M/R.
>> It's non-trivial to get right and feels to be beyond scope.
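
A rough sketch of what (a) could amount to in the code that launches the child
VM (not actual TaskRunner code; HADOOP_JOB_CLASSPATH is the variable proposed
above, while the trimmed coreClasspath value is an assumption):

-----
import java.io.File;

public class JobClasspathSketch {
  // Build the classpath for the child task VM: a trimmed "core" classpath
  // plus the optional, user-controlled HADOOP_JOB_CLASSPATH. When the
  // variable is not set, fall back to the parent VM's full classpath,
  // preserving today's behavior.
  static String childClasspath(String coreClasspath) {
    String jobCp = System.getenv("HADOOP_JOB_CLASSPATH");
    if (jobCp == null || jobCp.isEmpty()) {
      return System.getProperty("java.class.path");
    }
    return coreClasspath + File.pathSeparator + jobCp;
  }
}
-----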
>>
>>   I wouldn't mind helping btw.
>>
>> Thanks,
>>   Henning
>>
>>
>>
>>
>> On Tue, 2010-10-05 at 15:59 -0700, Tom White wrote:
>>
>> Hi Henning,
>>
>> I don't know if you've seen
>> https://issues.apache.org/jira/browse/MAPREDUCE-1938 and
>> https://issues.apache.org/jira/browse/MAPREDUCE-1700 which have
>> discussion about this issue.
>>
>> Cheers
>> Tom
>>
>> On Fri, Sep 24, 2010 at 3:41 AM, Henning Blohm <henning.blohm@zfabrik.de>
>> wrote:
>> > Short update on the issue:
>> >
>> > I tried to find a way to separate class path configurations by modifying
>> > the scripts in HADOOP_HOME/bin, but found that TaskRunner actually copies
>> > the class path setting from the parent process when starting a local task,
>> > so I do not see a way of having less on a job's classpath without
>> > modifying Hadoop.
>> >
>> > As that will present a real issue when running our jobs on Hadoop, I would
>> > like to propose changing TaskRunner so that it sets a class path
>> > specifically for M/R tasks. That class path could be defined in the scripts
>> > (as for the other processes) using a particular environment variable (e.g.
>> > HADOOP_JOB_CLASSPATH). It could default to the current VM's class path,
>> > preserving today's behavior.
>> >
>> > Is it ok to enter this as an issue?
>> >
>> > Thanks,
>> >   Henning
>> >
>> >
>> > Am Freitag, den 17.09.2010, 16:01 +0000 schrieb Allen Wittenauer:
>> >
>> > On Sep 17, 2010, at 4:56 AM, Henning Blohm wrote:
>> >
>> >> When running map reduce tasks in Hadoop I run into classpath issues.
>> >> Contrary to previous posts, my problem is not that I am missing classes
>> >> on the Task's class path (we have a perfect solution for that) but rather
>> >> that I find too many (e.g. ECJ classes or jetty).
>> >
>> > The fact that you mention:
>> >
>> >> The libs in HADOOP_HOME/lib seem to contain everything needed to run
>> >> anything in Hadoop which is, I assume, much more than is needed to run
>> >> a map
>> >> reduce task.
>> >
>> > hints that your perfect solution is to throw all your custom stuff in
>> > lib.
>> > If so, that's a huge mistake.  Use distributed cache instead.
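
One way to ship job-specific JARs through the distributed cache rather than
HADOOP_HOME/lib is sketched below (illustrative only; the HDFS path is made up):

-----
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;

public class ShipJobJar {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Distribute a job-specific JAR and have it added to the task classpath,
    // instead of dropping it into HADOOP_HOME/lib on every node.
    DistributedCache.addFileToClassPath(
        new Path("/user/henning/lib/my-udf.jar"), conf);
    // ... then submit the job with this Configuration as usual.
  }
}
-----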
>> >
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>
