Date: Thu, 29 Dec 2011 04:22:30 +0000 (UTC)
From: "Roman Shaposhnik (Commented) (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-7939) Improve Hadoop subcomponent integration in Hadoop 0.23

    [ https://issues.apache.org/jira/browse/HADOOP-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176982#comment-13176982 ]

Roman Shaposhnik commented on HADOOP-7939:
------------------------------------------

@Allen,

What you're saying makes sense. Two questions though:
# Are you saying the proposed solution is worse than the current mess of naming we've got in our scripts? (Once again, I fully agree with your general point. I'm just curious whether it extends to the present situation as well.)
# Given that uniform naming would allow us to handle all such variable evaluations and manipulations in a single place (basically one script that can act as yarn, hdfs, mapred, etc.), wouldn't you agree that de-duplication would become easier than doing it in half a dozen different places?

> Improve Hadoop subcomponent integration in Hadoop 0.23
> ------------------------------------------------------
>
>                 Key: HADOOP-7939
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7939
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: build, conf, documentation, scripts
>    Affects Versions: 0.23.0
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>             Fix For: 0.23.1
>
>
> h1. Introduction
> For the rest of this proposal it is assumed that the current set
> of Hadoop subcomponents is:
> * hadoop-common
> * hadoop-hdfs
> * hadoop-yarn
> * hadoop-mapreduce
> It must be noted that this is an open-ended list, though. For example,
> implementations of additional frameworks on top of yarn (e.g. MPI) would
> also be considered subcomponents.
> h1. Problem statement
> Currently there's an unfortunate coupling and hard-coding present at the
> level of launcher scripts, configuration scripts and Java implementation
> code that prevents us from treating all subcomponents of Hadoop independently
> of each other.
> In a lot of places it is assumed that bits and pieces
> from individual subcomponents *must* be located at predefined places
> and that they cannot be dynamically registered/discovered at runtime.
> This prevents a truly flexible deployment of Hadoop 0.23.
> h1. Proposal
> NOTE: this is NOT a proposal for redefining the layout from HADOOP-6255.
> The goal here is to keep as much of that layout in place as possible,
> while permitting different deployment layouts.
> The aim of this proposal is to introduce the needed level of indirection and
> flexibility in order to accommodate the currently assumed layout of Hadoop tarball
> deployments and all the other styles of deployments as well. To this end the
> following set of environment variables needs to be used uniformly in all of
> the subcomponents' launcher scripts, configuration scripts and Java code
> (<COMPONENT> stands for the literal name of a subcomponent). These variables are
> expected to be defined by <COMPONENT>-env.sh scripts, and sourcing those files is
> expected to have the desired effect of setting the environment up correctly.
> # HADOOP_<COMPONENT>_HOME
> ## root of the subtree in a filesystem where a subcomponent is expected to be installed
> ## default value: $0/..
> # HADOOP_<COMPONENT>_JARS
> ## a subdirectory with all of the jar files comprising the subcomponent's implementation
> ## default value: $(HADOOP_<COMPONENT>_HOME)/share/hadoop/$(<COMPONENT>)
> # HADOOP_<COMPONENT>_EXT_JARS
> ## a subdirectory with all of the jar files needed for extended functionality of the subcomponent (not essential for the basic functionality to work correctly)
> ## default value: $(HADOOP_<COMPONENT>_HOME)/share/hadoop/$(<COMPONENT>)/ext
> # HADOOP_<COMPONENT>_NATIVE_LIBS
> ## a subdirectory with all the native libraries that the component requires
> ## default value: $(HADOOP_<COMPONENT>_HOME)/share/hadoop/$(<COMPONENT>)/native
> # HADOOP_<COMPONENT>_BIN
> ## a subdirectory with all of the launcher scripts specific to the client side of the component
> ## default value: $(HADOOP_<COMPONENT>_HOME)/bin
> # HADOOP_<COMPONENT>_SBIN
> ## a subdirectory with all of the launcher scripts specific to the server/system side of the component
> ## default value: $(HADOOP_<COMPONENT>_HOME)/sbin
> # HADOOP_<COMPONENT>_LIBEXEC
> ## a subdirectory with all of the launcher scripts that are internal to the implementation and should *not* be invoked directly
> ## default value: $(HADOOP_<COMPONENT>_HOME)/libexec
> # HADOOP_<COMPONENT>_CONF
> ## a subdirectory containing configuration files for a subcomponent
> ## default value: $(HADOOP_<COMPONENT>_HOME)/conf
> # HADOOP_<COMPONENT>_DATA
> ## a subtree in the local filesystem for storing the component's persistent state
> ## default value: $(HADOOP_<COMPONENT>_HOME)/data
> # HADOOP_<COMPONENT>_LOG
> ## a subdirectory where the subcomponent's log files are stored
> ## default value: $(HADOOP_<COMPONENT>_HOME)/log
> # HADOOP_<COMPONENT>_RUN
> ## a subdirectory with runtime system specific information
> ## default value: $(HADOOP_<COMPONENT>_HOME)/run
> # HADOOP_<COMPONENT>_TMP
> ## a subdirectory with temporary files
> ## default value: $(HADOOP_<COMPONENT>_HOME)/tmp
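
To make the defaults listed in the proposal concrete, here is a minimal sketch of how a single shared script might fill them in for any subcomponent. It is an illustration only, not code from the Hadoop tree: the function name hadoop_component_defaults is hypothetical, the install root /opt/hadoop is a made-up example, and it assumes that the component name appears upper-cased in variable names (HADOOP_HDFS_JARS) and lower-cased in paths (share/hadoop/hdfs). Values already set by a component's env script are left untouched.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): fill in the per-component
# variables from the proposal, keeping any value that is already set.
# Variables inside the function are global because POSIX sh has no "local".
hadoop_component_defaults() {
  comp_lower=$1      # component name, e.g. "hdfs" (assumed lower-case)
  default_home=$2    # fallback root, e.g. "$(dirname "$0")/.."

  # Refuse names that are unsafe to splice into eval.
  case "$comp_lower" in
    ''|*[!a-z0-9_]*) echo "bad component name: $comp_lower" >&2; return 1 ;;
  esac

  comp_upper=$(printf '%s' "$comp_lower" | tr '[:lower:]' '[:upper:]')
  prefix="HADOOP_${comp_upper}"

  # HOME: keep an existing value, otherwise use the caller's fallback.
  eval "home=\${${prefix}_HOME:-\$default_home}"
  eval "${prefix}_HOME=\$home"
  export "${prefix}_HOME"

  # SUFFIX:relative-path pairs, taken from the list in the proposal.
  for spec in \
    "JARS:share/hadoop/${comp_lower}" \
    "EXT_JARS:share/hadoop/${comp_lower}/ext" \
    "NATIVE_LIBS:share/hadoop/${comp_lower}/native" \
    "BIN:bin" "SBIN:sbin" "LIBEXEC:libexec" "CONF:conf" \
    "DATA:data" "LOG:log" "RUN:run" "TMP:tmp"
  do
    name="${prefix}_${spec%%:*}"
    eval "current=\${${name}:-}"
    if [ -z "$current" ]; then
      # Not set yet: derive the default from the component's HOME.
      eval "${name}=\$home/\${spec#*:}"
      export "$name"
    fi
  done
}

# Example: one override set beforehand, everything else defaulted.
HADOOP_HDFS_LOG=/var/log/hadoop-hdfs
hadoop_component_defaults hdfs /opt/hadoop
echo "$HADOOP_HDFS_JARS"   # /opt/hadoop/share/hadoop/hdfs
echo "$HADOOP_HDFS_LOG"    # /var/log/hadoop-hdfs (override kept)
```

Because the same function serves hdfs, yarn, mapred and any later subcomponent, this is the kind of single place for variable handling that the comment above argues for. A real launcher would presumably source the component's env script first and then call something like this with the parent of its own directory as the fallback root.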