hadoop-common-dev mailing list archives

From "Sameer Paranjpye (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-785) Divide the server and client configurations
Date Wed, 15 Aug 2007 18:07:31 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520051 ]

Sameer Paranjpye commented on HADOOP-785:

Essentially we need 3 config files:
a) Read-only defaults (existing hadoop-defaults.xml).
b) A file where the admin specifies config values which can be overridden (existing mapred-defaults.xml).
c) A file where the admin specifies a set of hard, sane limits for some config values which
cannot be overridden (existing hadoop-site.xml).

I don't think we need 3 config files or a hierarchy of configs. The above 3 categories of
configuration need to exist, but can be expressed in many different ways. What if we had the
following files:

- _hadoop-defaults.xml_, the read-only default config file
- _hadoop-client.xml_, which specifies client behavior; it resides on a client machine and is processed
by clients
- _hadoop-server.xml_, which specifies server behavior and is processed by servers

The one place where the client and server configs interact is when tasks are localized and
clients are running in a server-controlled context. Here some of the client's configuration
can be overridden by values in the server's config. The variables to be overridden can be hard-coded.
If this means we're overprotecting users, then the list of variables to override can
itself be placed in the server config, say in the hadoop.client.overrides config variable.
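For illustration, the override step at task localization could look something like the sketch below. This is not the actual Hadoop API; configs are modeled as plain maps, and the only name taken from the comment above is hadoop.client.overrides:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: at task localization, the server copies the variables
// named in hadoop.client.overrides from its own config into the client's
// config, clobbering any client-supplied values for those keys.
public class ClientOverrides {
    public static Map<String, String> applyOverrides(
            Map<String, String> clientConf, Map<String, String> serverConf) {
        Map<String, String> merged = new HashMap<>(clientConf);
        // Comma-separated list of variables the server is allowed to force.
        String list = serverConf.getOrDefault("hadoop.client.overrides", "");
        for (String key : list.split(",")) {
            key = key.trim();
            if (!key.isEmpty() && serverConf.containsKey(key)) {
                merged.put(key, serverConf.get(key));
            }
        }
        return merged;
    }
}
```

Everything not named in the override list passes through from the client untouched, which matches the intent that clients can override everything "with a few exceptions".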

The treatment of the 3 categories of config values would be as follows:

- Read-only defaults - _hadoop-defaults.xml_
- Admin-specified config values which can be overridden - this set of values no longer exists;
everything can be overridden by clients, with a few exceptions. All client configuration appears
in _hadoop-client.xml_
- Admin-specified set of hard, sane limits for some config values which cannot be overridden
- this is the set of exceptions listed in _hadoop-server.xml_, named by the hadoop.client.overrides
config variable

> Divide the server and client configurations
> -------------------------------------------
>                 Key: HADOOP-785
>                 URL: https://issues.apache.org/jira/browse/HADOOP-785
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.9.0
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>             Fix For: 0.15.0
> The configuration system is easy to misconfigure and I think we need to strongly divide
the server configs from the client configs. 
> An example of the problem was a configuration where the task tracker had a hadoop-site.xml
that set mapred.reduce.tasks to 1. The job tracker therefore had the right number of reduces,
but each map task thought there was a single reduce. This led to a hard-to-diagnose failure.
> Therefore, I propose separating out the configuration types as:
> class Configuration;
> // reads site-default.xml, hadoop-default.xml
> class ServerConf extends Configuration;
> // reads hadoop-server.xml, $super
> class DfsServerConf extends ServerConf;
> // reads dfs-server.xml, $super
> class MapRedServerConf extends ServerConf;
> // reads mapred-server.xml, $super
> class ClientConf extends Configuration;
> // reads hadoop-client.xml, $super
> class JobConf extends ClientConf;
> // reads job.xml, $super
> Note in particular, that nothing corresponds to hadoop-site.xml, which overrides both
client and server configs. Furthermore, the properties from the *-default.xml files should
never be saved into the job.xml.
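The class hierarchy quoted above can be sketched as follows. This is a simulation, not the real Hadoop code: resource loading is modeled as an ordered list of file names rather than actual XML parsing, with more general resources added first so that more specific ones can override them:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposed config hierarchy; class names and resource file
// names come from the issue description, the loading mechanics are assumed.
class Configuration {
    protected final List<String> resources = new ArrayList<>();
    Configuration() {
        // reads site-default.xml, hadoop-default.xml
        resources.add("site-default.xml");
        resources.add("hadoop-default.xml");
    }
    List<String> getResources() { return resources; }
}

class ServerConf extends Configuration {
    ServerConf() { resources.add("hadoop-server.xml"); }   // + $super
}

class DfsServerConf extends ServerConf {
    DfsServerConf() { resources.add("dfs-server.xml"); }   // + $super
}

class MapRedServerConf extends ServerConf {
    MapRedServerConf() { resources.add("mapred-server.xml"); } // + $super
}

class ClientConf extends Configuration {
    ClientConf() { resources.add("hadoop-client.xml"); }   // + $super
}

class JobConf extends ClientConf {
    JobConf() { resources.add("job.xml"); }                // + $super
}
```

Because each constructor runs after its superclass constructor, a JobConf ends up with the defaults first, then hadoop-client.xml, then job.xml; note that no class in the chain ever reads hadoop-site.xml.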

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
