Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <14360458.1187201251720.JavaMail.jira@brutus>
Date: Wed, 15 Aug 2007 11:07:31 -0700 (PDT)
From: "Sameer Paranjpye (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-785) Divide the server and client
 configurations
In-Reply-To: <4048066.1165393281134.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HADOOP-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520051 ] 

Sameer Paranjpye commented on HADOOP-785:
-----------------------------------------

{quote}
Essentially we need 3 config files:
a) Read-only defaults (existing hadoop-defaults.xml).
b) A file where the admin specifies config values which can be overridden (existing mapred-defaults.xml).
c) A file where the admin specifies a set of hard, sane limits for some config values which cannot be overridden (existing hadoop-site.xml).
{quote}

I don't think we need 3 config files or a hierarchy of configs. The above 3 categories of configuration need to exist, but can be expressed in many different ways. What if we had the following files:

- _hadoop-defaults.xml_, the read-only default config file
- _hadoop-client.xml_, specifies client behavior, resides on a client machine, and is processed by clients
- _hadoop-server.xml_, specifies server behavior and is processed by servers
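
As a rough illustration of which files each kind of process would read under this split, here is a minimal Java sketch. The names ConfigRole, ConfigResources and resourcesFor are hypothetical, made up for this example, and are not existing Hadoop classes. Only the three file names come from the list above. That later files take precedence over earlier ones is also an assumption.

```java
import java.util.List;

// Hypothetical sketch: which config files each kind of process would load.
// ConfigRole and ConfigResources are illustrative names, not Hadoop classes.
enum ConfigRole { CLIENT, SERVER }

final class ConfigResources {
    private ConfigResources() {}

    // Files are returned in load order. Assumption: a later file
    // takes precedence over an earlier one.
    static List<String> resourcesFor(ConfigRole role) {
        switch (role) {
            case CLIENT:
                // Clients never read hadoop-server.xml.
                return List.of("hadoop-defaults.xml", "hadoop-client.xml");
            case SERVER:
                // Servers never read hadoop-client.xml.
                return List.of("hadoop-defaults.xml", "hadoop-server.xml");
            default:
                throw new IllegalArgumentException("unknown role: " + role);
        }
    }
}
```

The point of the sketch is that neither role ever reads the other's file, so a stray server-side setting cannot leak into a client's view of its own config.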

The one place where the client and server configs interact is when tasks are localized and clients are running in a server-controlled context. Here, some of the client's configuration can be overridden by values in the server's config. The variables to be overridden can be hard-coded. If this means we're overprotecting users, then the list of variables to override can itself be placed in the server config, say in the hadoop.client.overrides config variable.
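
To make the localization step concrete, here is a minimal Java sketch of how a server might apply its overrides to a client's config. It is only a sketch under stated assumptions, not a proposed implementation. TaskConfLocalizer, overrideKeys and localize are hypothetical names, plain maps stand in for the real configuration objects, and the comma-separated format of the override list is my assumption. Only the variable name hadoop.client.overrides comes from the paragraph above.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of applying server-side overrides when a task is localized.
// TaskConfLocalizer is an illustrative name, not a Hadoop class.
final class TaskConfLocalizer {
    // Server-side variable that lists the keys the server may override.
    static final String OVERRIDES_KEY = "hadoop.client.overrides";

    private TaskConfLocalizer() {}

    // Parses the override list. Assumption: a comma-separated list of keys.
    static Set<String> overrideKeys(Map<String, String> serverConf) {
        Set<String> keys = new LinkedHashSet<>();
        String raw = serverConf.get(OVERRIDES_KEY);
        if (raw == null) {
            return keys; // no list configured, so nothing is overridden
        }
        for (String part : raw.split(",")) {
            String key = part.trim();
            if (!key.isEmpty()) {
                keys.add(key);
            }
        }
        return keys;
    }

    // Returns the config a localized task would see: the client's values,
    // except for listed keys, where the server's value wins if it has one.
    static Map<String, String> localize(Map<String, String> clientConf,
                                        Map<String, String> serverConf) {
        Map<String, String> result = new HashMap<>(clientConf);
        for (String key : overrideKeys(serverConf)) {
            String serverValue = serverConf.get(key);
            if (serverValue != null) {
                result.put(key, serverValue);
            }
        }
        return result;
    }
}
```

In this sketch, a server value for a key that is not in the list (for example mapred.reduce.tasks) is ignored, so the client's value survives localization. Only the listed keys are forced to the server's values.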

The treatment of the 3 categories of config values would be as follows:

- Read-only defaults - _hadoop-defaults.xml_
- Admin-specified config values which can be overridden - This set of values no longer exists. Clients can override everything, with a few exceptions, and all client configuration appears in _hadoop-client.xml_
- Admin-specified set of hard, sane limits for some config values which cannot be overridden - This is a set of exceptions listed in _hadoop-server.xml_, represented by the config value _hadoop.client.overrides_


> Divide the server and client configurations
> -------------------------------------------
>
>                 Key: HADOOP-785
>                 URL: https://issues.apache.org/jira/browse/HADOOP-785
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.9.0
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>             Fix For: 0.15.0
>
>
> The configuration system is easy to misconfigure and I think we need to strongly divide the server from client configs. 
> An example of the problem was a configuration where the task tracker had a hadoop-site.xml that set mapred.reduce.tasks to 1. Therefore, the job tracker had the right number of reduces, but the map task thought there was a single reduce. This led to a hard-to-diagnose failure.
> Therefore, I propose separating out the configuration types as:
> class Configuration;
> // reads site-default.xml, hadoop-default.xml
> class ServerConf extends Configuration;
> // reads hadoop-server.xml, $super
> class DfsServerConf extends ServerConf;
> // reads dfs-server.xml, $super
> class MapRedServerConf extends ServerConf;
> // reads mapred-server.xml, $super
> class ClientConf extends Configuration;
> // reads hadoop-client.xml, $super
> class JobConf extends ClientConf;
> // reads job.xml, $super
> Note in particular, that nothing corresponds to hadoop-site.xml, which overrides both client and server configs. Furthermore, the properties from the *-default.xml files should never be saved into the job.xml.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.