Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <19186719.1193512971579.JavaMail.jira@brutus>
Date: Sat, 27 Oct 2007 12:22:51 -0700 (PDT)
From: "Dennis Kubes (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-1622) Hadoop should provide a way to
 allow the user to specify jar file(s) the user job depends on
In-Reply-To: <17905327.1184698384749.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538245 ] 

Dennis Kubes commented on HADOOP-1622:
--------------------------------------

1. Could you please remove the mention of 'final' and 'default' config resources from the javadoc for JobConf.{get|set}JobResources? They are no longer relevant vis-a-vis hadoop Configuration.

I have removed the mention of final and default resources.

2. Should we also have a JobConf.setJobResource along with JobConf.addJobResource, a la the {{DistributedCache}} APIs?

I debated set vs. add for resources.  Currently, adding a resource appends it to a list of resources, whereas setting a resource would clear anything previously added and leave only that resource.  Since jar resources are often added by including the jar file that contains a given class, I thought it better NOT to allow clearing and resetting of job resources.
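
To illustrate the append-only behavior, here is a minimal sketch.  JobResourceList and its methods are hypothetical stand-ins for illustration only, not the actual JobConf code in the patch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the job-resource part of JobConf (an assumption,
// not the real Hadoop class).
class JobResourceList {
    private final List<String> resources = new ArrayList<>();

    // Appends a resource; entries added earlier are never cleared.
    void addJobResource(String resource) {
        resources.add(resource);
    }

    // Read-only view, so callers cannot reset the list behind our back.
    List<String> getJobResources() {
        return Collections.unmodifiableList(resources);
    }
}
```

With only an add method, a jar added by one piece of code cannot be silently dropped when another piece of code registers its own jar later.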

3. Should we move the private JobClient.createJobJar method to JarUtils to make it available as a useful utility?

I debated about this too.  JarUtils was meant to hold generic jarring and unjarring utilities, but I don't see any harm in putting createJobJar there, and I think you are right that we may need it somewhere else in the future.  I have removed it from JobClient and added it to JarUtils.

Unrelated: Does it make sense to rename Configuration.addResource to Configuration.addConfigResource? I wonder how confusing these unrelated API names are, given that JobConf is a Configuration too.

Yeah, I debated about this one too.  In the end we weren't just adding jars but multiple things such as classes, executables, and files.  I couldn't find a better name for that than resource, so I called it jobResource to be a little less confusing.  Changing Configuration over to configResource would be good, I think, although we should probably deprecate the old method rather than remove it, because a lot of things rely on it.
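
A rough sketch of what such a deprecation could look like follows.  ConfigSketch is a hypothetical stand-in for Configuration, and addConfigResource is only the name proposed above, not an existing method:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Configuration; only the rename pattern is shown.
class ConfigSketch {
    private final List<String> configResources = new ArrayList<>();

    // Proposed new name (an assumption until it is actually adopted).
    void addConfigResource(String name) {
        configResources.add(name);
    }

    /** @deprecated use {@link #addConfigResource(String)} instead. */
    @Deprecated
    void addResource(String name) {
        // The old name keeps working by delegating to the new one.
        addConfigResource(name);
    }

    // Returns a copy so callers cannot modify the internal list.
    List<String> getConfigResources() {
        return new ArrayList<>(configResources);
    }
}
```

Existing callers of the old name would keep compiling, with a deprecation warning, while new code moves to the less ambiguous name.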

I am currently testing patch 9 and will post it shortly.

> Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1622
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1622
>             Project: Hadoop
>          Issue Type: Improvement
>            Reporter: Runping Qi
>            Assignee: Dennis Kubes
>             Fix For: 0.16.0
>
>         Attachments: hadoop-1622-4-20071008.patch, HADOOP-1622-5.patch, HADOOP-1622-6.patch, HADOOP-1622-7.patch, HADOOP-1622-8.patch, multipleJobJars.patch, multipleJobResources.patch, multipleJobResources2.patch
>
>
> More likely than not, a user's job may depend on multiple jars.
> Right now, when submitting a job through bin/hadoop, there is no way for the user to specify that. 
> A workaround for that is to re-package all the dependent jars into a new jar or put the dependent jar files in the lib dir of the new jar.
> This workaround causes unnecessary inconvenience to the user. Furthermore, if the user does not own the main function 
> (as is the case when the user uses Aggregate, datajoin, or streaming), the user has to re-package those system jar files too.
> It is much desired that hadoop provide a clean and simple way for the user to specify a list of dependent jar files at the time 
> of job submission. Something like:
> bin/hadoop .... --depending_jars j1.jar:j2.jar 
