hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1622) Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on
Date Sat, 27 Oct 2007 12:23:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538191 ]

Arun C Murthy commented on HADOOP-1622:
---------------------------------------

Dennis, I'm sorry to come in late on this... a couple of comments:

1. Could you please remove the mention of 'final' and 'default' config resources from the
javadoc for {{JobConf.{get|set}JobResources}}? They are no longer relevant vis-a-vis hadoop
Configuration.
2. Should we also have a {{JobConf.setJobResource}} along with {{JobConf.addJobResource}},
a la the {{DistributedCache}} APIs?
3. Should we move the private {{JobClient.createJobJar}} method to JarUtils to make it available
as a useful utility?
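
To make the set/add distinction in point 2 concrete, here is a minimal sketch. It assumes the resource list is kept as a comma-separated value under a single property. The class {{JobResourcesSketch}}, the property name, and the method bodies are hypothetical stand-ins for illustration; they are not the code in the attached patches or the real {{JobConf}}. The pairing mirrors {{DistributedCache.setCacheFiles}} and {{DistributedCache.addCacheFile}}, where set replaces the list and add appends to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for JobConf; not the real Hadoop class.
class JobResourcesSketch {
    // Assumed property name, for illustration only.
    static final String KEY = "mapred.job.resources";

    private final Map<String, String> props = new HashMap<>();

    // set: discards any previously configured resources.
    void setJobResource(String resource) {
        props.put(KEY, resource);
    }

    // add: appends to the existing list, keeping earlier entries.
    void addJobResource(String resource) {
        String current = props.get(KEY);
        boolean empty = (current == null || current.isEmpty());
        props.put(KEY, empty ? resource : current + "," + resource);
    }

    // Returns the configured resources in insertion order.
    List<String> getJobResources() {
        String current = props.get(KEY);
        if (current == null || current.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(current.split(",")));
    }
}
```

With only an add method, a caller who wants to start from a clean list has no direct way to do so; a set method covers that case.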

Unrelated: Does it make sense to rename {{Configuration.addResource}} to {{Configuration.addConfigResource}}?
I wonder how confusing these unrelated api names are, given {{JobConf}} _is a_ {{Configuration}}
too ...
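
A small sketch of the naming concern, using hypothetical stand-in classes rather than the real Hadoop ones: because the job configuration class extends the general configuration class, both methods are callable on the same object even though one registers configuration files and the other registers job dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Configuration; not the real Hadoop class.
class ConfigurationSketch {
    final List<String> configResources = new ArrayList<>();

    // Registers a file of key/value settings to load.
    void addResource(String xmlName) {
        configResources.add(xmlName);
    }
}

// Hypothetical stand-in for JobConf, which is a Configuration.
class JobConfSketch extends ConfigurationSketch {
    final List<String> jobResources = new ArrayList<>();

    // Registers a jar or file that the job's tasks depend on.
    void addJobResource(String path) {
        jobResources.add(path);
    }
}
```

On a {{JobConfSketch}} instance, {{addResource("dep.jar")}} compiles and runs but silently lands in the wrong list, which is the kind of mistake a more explicit name such as {{addConfigResource}} would make less likely.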


> Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1622
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1622
>             Project: Hadoop
>          Issue Type: Improvement
>            Reporter: Runping Qi
>            Assignee: Dennis Kubes
>             Fix For: 0.16.0
>
>         Attachments: hadoop-1622-4-20071008.patch, HADOOP-1622-5.patch, HADOOP-1622-6.patch,
>                      HADOOP-1622-7.patch, HADOOP-1622-8.patch, multipleJobJars.patch,
>                      multipleJobResources.patch, multipleJobResources2.patch
>
>
> More likely than not, a user's job may depend on multiple jars.
> Right now, when submitting a job through bin/hadoop, there is no way for the user to specify that.
> A workaround is to re-package all the dependent jars into a new jar, or to put the dependent jar files in the lib dir of the new jar.
> This workaround causes unnecessary inconvenience to the user. Furthermore, if the user does not own the main function
> (as is the case when the user uses Aggregate, datajoin, or streaming), the user has to re-package those system jar files too.
> It is much desired that hadoop provides a clean and simple way for the user to specify a list of dependent jar files at the time
> of job submission. Something like:
> bin/hadoop .... --depending_jars j1.jar:j2.jar

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

