hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-785) Divide the server and client configurations
Date Mon, 30 Apr 2007 20:52:15 GMT

https://issues.apache.org/jira/browse/HADOOP-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492789

Doug Cutting commented on HADOOP-785:

Some comments on Milind's proposal:

I'm unclear on the difference between a ClientConfiguration and an AppConfiguration.  I'm
also not certain that configurations w/o setters will be practical.  MapReduce's daemons do
need to distinguish between the server's own configuration and the JobConf, but those should
already be completely distinct, no?

I prefer not using -final in the config file names.  I'd vote for using -default for defaults
and leaving overrides unmarked (server.xml, client.xml).  Either that or we should use more
clearly opposite terms, like initial/final, default/override, etc, but unmarked would be my
first choice.

I don't see the need for both client-default and common-default, nor both client-final and
common-final.  Can you give examples of lists of things that would go in each file?  A major
goal of this redesign is that it should always be very clear which file one should specify
a parameter in.  Many of our servers are also clients (e.g., JobTracker uses FileSystem) but
pure client code is never a Hadoop daemon.  So it's possible to determine which parameters
should only be read by daemon code, but it's harder to determine parameters which should never
be read by daemon code.  Hence it's possible to have server-only configurations, but I'm not
sure it makes sense to have client-only configurations.

> Divide the server and client configurations
> -------------------------------------------
>                 Key: HADOOP-785
>                 URL: https://issues.apache.org/jira/browse/HADOOP-785
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.9.0
>            Reporter: Owen O'Malley
>         Assigned To: Milind Bhandarkar
> The configuration system is easy to misconfigure, and I think we need to strongly divide
> the server configs from the client configs.
> An example of the problem was a configuration where the task tracker had a hadoop-site.xml
> that set mapred.reduce.tasks to 1. The job tracker therefore had the right number of reduces,
> but the map task thought there was a single reduce. This led to a hard-to-diagnose failure.
> Therefore, I propose separating out the configuration types as:
> class Configuration;
> // reads site-default.xml, hadoop-default.xml
> class ServerConf extends Configuration;
> // reads hadoop-server.xml, $super
> class DfsServerConf extends ServerConf;
> // reads dfs-server.xml, $super
> class MapRedServerConf extends ServerConf;
> // reads mapred-server.xml, $super
> class ClientConf extends Configuration;
> // reads hadoop-client.xml, $super
> class JobConf extends ClientConf;
> // reads job.xml, $super
> Note in particular, that nothing corresponds to hadoop-site.xml, which overrides both
client and server configs. Furthermore, the properties from the *-default.xml files should
never be saved into the job.xml.
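
The layered loading described in the proposal above can be sketched as follows. This is a hypothetical illustration of the resource-loading order only, not the actual Hadoop implementation; the addResource and resources helpers are invented for the sketch, while the class and file names come from the proposal. The point it demonstrates is that each subclass appends its own resource after its superclass's resources, so later (more specific) files override earlier defaults, and no class ever reads hadoop-site.xml.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed configuration hierarchy.
// Class names and file names follow the proposal; the helper
// methods are invented for illustration.
class Configuration {
    private final List<String> resources = new ArrayList<>();

    Configuration() {
        // Defaults are loaded first so that later resources override them.
        addResource("site-default.xml");
        addResource("hadoop-default.xml");
    }

    void addResource(String name) { resources.add(name); }

    // Resources in load order: earlier entries are overridden by later ones.
    List<String> resources() { return resources; }
}

class ServerConf extends Configuration {
    ServerConf() { addResource("hadoop-server.xml"); }
}

class DfsServerConf extends ServerConf {
    DfsServerConf() { addResource("dfs-server.xml"); }
}

class MapRedServerConf extends ServerConf {
    MapRedServerConf() { addResource("mapred-server.xml"); }
}

class ClientConf extends Configuration {
    ClientConf() { addResource("hadoop-client.xml"); }
}

class JobConf extends ClientConf {
    JobConf() { addResource("job.xml"); }
}
```

Note that, as the proposal requires, no branch of the hierarchy loads hadoop-site.xml, and the *-default.xml resources live only in the base class, so they would not be written back out when a JobConf is serialized into job.xml.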

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
