Message-ID: <14360458.1187201251720.JavaMail.jira@brutus>
Date: Wed, 15 Aug 2007 11:07:31 -0700 (PDT)
From: "Sameer Paranjpye (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-785) Divide the server and client configurations
In-Reply-To: <4048066.1165393281134.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/HADOOP-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520051 ]

Sameer Paranjpye commented on HADOOP-785:
-----------------------------------------

{quote}
Essentially we need 3 config files:
a) Read-only defaults (existing hadoop-defaults.xml).
b) A file where the admin specifies config values which can be overridden (existing mapred-defaults.xml).
c) A file where the admin specifies a set of hard, sane limits for some config values which cannot be overridden (existing hadoop-site.xml).
{quote}

I don't think we need 3 config files or a hierarchy of configs. The above 3 categories of configuration need to exist, but they can be expressed in many different ways. What if we had the following files:

- _hadoop-defaults.xml_, the read-only default config file
- _hadoop-client.xml_, which specifies client behavior, resides on a client machine, and is processed by clients
- _hadoop-server.xml_, which specifies server behavior and is processed by servers

The one place where the client and server configs interact is when tasks are localized and clients are running in a server-controlled context. Here some of the client's configuration can be overridden by values in the server's config. The variables to be overridden can be hard coded. If this means we're overprotecting users, then the list of variables to override can itself be placed in the server config, say in the hadoop.client.overrides config variable.

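To make the override step concrete, here is a minimal Java sketch of what localizing a task's configuration might look like. It is not Hadoop's Configuration API: the class LocalizedConfSketch, both method names, and the comma-separated format of the override list are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Hadoop code: plain maps stand in for parsed config files.
final class LocalizedConfSketch {
    private LocalizedConfSketch() {}

    // Parses the override list. A comma-separated value is an assumption;
    // the proposal does not specify a format for hadoop.client.overrides.
    static Set<String> parseOverrideList(String value) {
        Set<String> keys = new LinkedHashSet<>();
        if (value == null) {
            return keys;
        }
        for (String part : value.split(",")) {
            String key = part.trim();
            if (!key.isEmpty()) {
                keys.add(key);
            }
        }
        return keys;
    }

    // Builds the config a localized task would see: defaults first, then the
    // client's values, then the server's values for the listed keys only.
    static Map<String, String> localize(Map<String, String> defaults,
                                        Map<String, String> client,
                                        Map<String, String> server,
                                        Set<String> overrideKeys) {
        Map<String, String> effective = new HashMap<>(defaults);
        effective.putAll(client);
        for (String key : overrideKeys) {
            String forced = server.get(key);
            if (forced != null) {
                effective.put(key, forced);
            }
        }
        return effective;
    }
}
```

With this shape, a client setting such as mapred.reduce.tasks passes through untouched unless the server names it in the override list, so the server cannot silently change values it has not declared.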
The treatment of the 3 categories of config values would be as follows:

- Read-only defaults
  - _hadoop-defaults.xml_
- Admin specified config values which can be overridden
  - This set of values no longer exists. Everything can be overridden by clients, with a few exceptions, and all client configuration appears in _hadoop-client.xml_
- Admin specified set of hard, sane limits for some config values which cannot be overridden
  - This is a set of exceptions listed in _hadoop-server.xml_, represented by the config value _hadoop.client.overrides_

> Divide the server and client configurations
> -------------------------------------------
>
> Key: HADOOP-785
> URL: https://issues.apache.org/jira/browse/HADOOP-785
> Project: Hadoop
> Issue Type: Improvement
> Components: conf
> Affects Versions: 0.9.0
> Reporter: Owen O'Malley
> Assignee: Arun C Murthy
> Fix For: 0.15.0
>
>
> The configuration system is easy to misconfigure and I think we need to strongly divide the server from client configs.
> An example of the problem was a configuration where the task tracker has a hadoop-site.xml that set mapred.reduce.tasks to 1. Therefore, the job tracker had the right number of reduces, but the map task thought there was a single reduce. This led to a hard-to-diagnose failure.
> Therefore, I propose separating out the configuration types as:
> class Configuration;
> // reads site-default.xml, hadoop-default.xml
> class ServerConf extends Configuration;
> // reads hadoop-server.xml, $super
> class DfsServerConf extends ServerConf;
> // reads dfs-server.xml, $super
> class MapRedServerConf extends ServerConf;
> // reads mapred-server.xml, $super
> class ClientConf extends Configuration;
> // reads hadoop-client.xml, $super
> class JobConf extends ClientConf;
> // reads job.xml, $super
> Note in particular, that nothing corresponds to hadoop-site.xml, which overrides both client and server configs.
> Furthermore, the properties from the *-default.xml files should never be saved into the job.xml.
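
The class hierarchy proposed in the issue description can be sketched in compilable Java roughly as follows. This is an illustration only, not Hadoop's implementation: the wrapper class ConfHierarchySketch and the resources() and prepend() methods are hypothetical, and the sketch only lists the file names each class would read, most specific first. Whether earlier entries take precedence over later ones is an assumption, since the proposal does not spell out the order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the proposed hierarchy; no files are actually read.
final class ConfHierarchySketch {
    private ConfHierarchySketch() {}

    // Returns a new list with this class's own file ahead of the inherited ones ("$super").
    static List<String> prepend(String own, List<String> inherited) {
        List<String> all = new ArrayList<>();
        all.add(own);
        all.addAll(inherited);
        return all;
    }

    static class Configuration {
        List<String> resources() {
            return new ArrayList<>(Arrays.asList("site-default.xml", "hadoop-default.xml"));
        }
    }

    static class ServerConf extends Configuration {
        @Override
        List<String> resources() {
            return prepend("hadoop-server.xml", super.resources());
        }
    }

    static class DfsServerConf extends ServerConf {
        @Override
        List<String> resources() {
            return prepend("dfs-server.xml", super.resources());
        }
    }

    static class MapRedServerConf extends ServerConf {
        @Override
        List<String> resources() {
            return prepend("mapred-server.xml", super.resources());
        }
    }

    static class ClientConf extends Configuration {
        @Override
        List<String> resources() {
            return prepend("hadoop-client.xml", super.resources());
        }
    }

    static class JobConf extends ClientConf {
        @Override
        List<String> resources() {
            return prepend("job.xml", super.resources());
        }
    }
}
```

In this arrangement no class ever lists hadoop-site.xml, and the client and server branches share only the default files, which is the separation the issue asks for.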