From: "Dennis Kubes (JIRA)"
To: hadoop-dev@lucene.apache.org
Reply-To: hadoop-dev@lucene.apache.org
Date: Sat, 27 Oct 2007 12:22:51 -0700 (PDT)
Subject: [jira] Commented: (HADOOP-1622) Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on
Message-ID: <19186719.1193512971579.JavaMail.jira@brutus>
In-Reply-To: <17905327.1184698384749.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

[ https://issues.apache.org/jira/browse/HADOOP-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538245 ]

Dennis Kubes commented on HADOOP-1622:
--------------------------------------

1. Could you please remove the mention of 'final' and 'default' config resources from the javadoc for JobConf.{get|set}JobResources? They are no longer relevant vis-a-vis hadoop Configuration.

I have removed the mention of final and default resources.

2. Should we also have a JobConf.setJobResource along with JobConf.addJobResource, a la {{DistributedCache}} apis?

I debated set vs. add for resources. With the current behavior, adding a resource appends it to a list of resources, whereas setting a resource would clear anything previously added and leave only that resource. Since jar resources are often added by including the jar file that contains a given class, I thought it better NOT to allow clearing and resetting of job resources.

3. Should we move the private JobClient.createJobJar method to JarUtils to make it available as a useful utility?

I debated this too. JarUtils held generic jarring and unjarring utilities. But I don't see any harm in putting createJobJar there, and I think you are right that we may need it somewhere else in the future. I have removed it from JobClient and added it to JarUtils.

Unrelated: Does it make sense to rename Configuration.addResource to Configuration.addConfigResource? I wonder how confusing these unrelated api names are, given JobConf is a Configuration too.

Yeah, I debated this one too. In the end we weren't just adding jars but multiple things such as classes, executables, and files. I couldn't find a better name for that than resource, so I used jobResource to be a little less confusing. Changing Configuration over to configResource would be good, I think, although we should probably deprecate the existing method rather than remove it, because a lot of things rely on it.

I am currently testing patch 9 and will post it shortly.
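To make the add-versus-set distinction in point 2 concrete, here is a minimal sketch in Java. It is not code from any of the attached patches: the class name JobResourceList is hypothetical, and only the method names addJobResource and getJobResources are borrowed from the discussion. It assumes resources are tracked as plain path strings and shows an append-only list with no way to clear or reset it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical illustration of append-only job resources.
 * This is NOT the JobConf API from the HADOOP-1622 patches.
 */
final class JobResourceList {
    // Resources are kept in insertion order (assumption: plain path strings).
    private final List<String> resources = new ArrayList<>();

    /** Appends a resource; nothing previously added is ever cleared. */
    void addJobResource(String resource) {
        if (resource == null || resource.isEmpty()) {
            throw new IllegalArgumentException("resource must be non-empty");
        }
        resources.add(resource);
    }

    /** Returns a read-only view, so callers cannot reset the list either. */
    List<String> getJobResources() {
        return Collections.unmodifiableList(resources);
    }
}
```

With this shape, a library that registers the jar containing one of its own classes and a user who later registers a second jar both end up in the list. A set-style method would let the second caller silently discard the first caller's jar, which is the situation the comment wants to avoid.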
> Hadoop should provide a way to allow the user to specify jar file(s) the user job depends on
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1622
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1622
>             Project: Hadoop
>          Issue Type: Improvement
>            Reporter: Runping Qi
>            Assignee: Dennis Kubes
>             Fix For: 0.16.0
>
>         Attachments: hadoop-1622-4-20071008.patch, HADOOP-1622-5.patch, HADOOP-1622-6.patch, HADOOP-1622-7.patch, HADOOP-1622-8.patch, multipleJobJars.patch, multipleJobResources.patch, multipleJobResources2.patch
>
>
> More likely than not, a user's job may depend on multiple jars.
> Right now, when submitting a job through bin/hadoop, there is no way for the user to specify that.
> A workaround for that is to re-package all the dependent jars into a new jar or put the dependent jar files in the lib dir of the new jar.
> This workaround causes unnecessary inconvenience to the user. Furthermore, if the user does not own the main function
> (like the case when the user uses Aggregate, or datajoin, streaming), the user has to re-package those system jar files too.
> It is much desired that hadoop provides a clean and simple way for the user to specify a list of dependent jar files at the time
> of job submission. Something like:
> bin/hadoop .... --depending_jars j1.jar:j2.jar

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.