hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-785) Divide the server and client configurations
Date Thu, 16 Aug 2007 03:49:30 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520139 ]

Arun C Murthy commented on HADOOP-785:
--------------------------------------

bq. It sounds like we're mostly in agreement.
Phew!  *smile*


With the only bone of contention being:
bq. Where we differ is what the files should be named and how the non-overrideable parameters
should be named [...]

and given:
bq. The override mechanism is not specific to mapreduce, since other daemons may wish to use
it in the future. We should also avoid the terms 'client' and 'server', since these are relative,
not universal. [...]

I'd still vote to go the hadoop-initial/hadoop-final way:
a) Introducing a new {{hadoop.client.overrides}} mechanism is a pretty big deal, given it isn't
around today. It covers only a few parameters today, but imagine if the list keeps growing...
b) Doing overrides through code seems like a brittle solution, imho.
c) Most importantly: imagine trying to explain {{hadoop.client.overrides}} to people (users/admins)
new to hadoop... it just seems like we would put a lot more onus on them to understand internals.

while:

d) the hadoop-initial/hadoop-final way seems, to me, a lot more generic, simpler, future-proof,
conceptually backward compatible, and easier on admins.
 
Having said that, I promise this is my last comment on this debate. *smile*

Clearly I'd love to hear from other users about what they think is easier on them...
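For what it's worth, the initial/final semantics I have in mind could be sketched roughly like this (hypothetical code, not the actual {{Configuration}} class; the class and method names here are made up): any key set by a "final" resource is locked against everything loaded later.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of initial/final resource semantics: a key set by a
// "final" resource (e.g. hadoop-final.xml) is locked, so later resources
// (e.g. a job.xml shipped by a client) can no longer override it.
class LayeredConf {
    private final Map<String, String> props = new HashMap<>();
    private final Set<String> locked = new HashSet<>();

    // Simulates loading one resource file; isFinal locks every key it sets.
    void addResource(Map<String, String> resource, boolean isFinal) {
        for (Map.Entry<String, String> e : resource.entrySet()) {
            if (!locked.contains(e.getKey())) {
                props.put(e.getKey(), e.getValue());
            }
            if (isFinal) {
                locked.add(e.getKey());
            }
        }
    }

    String get(String key) {
        return props.get(key);
    }

    public static void main(String[] args) {
        LayeredConf conf = new LayeredConf();
        conf.addResource(Map.of("mapred.reduce.tasks", "10"), false); // hadoop-initial.xml
        conf.addResource(Map.of("mapred.reduce.tasks", "27"), true);  // hadoop-final.xml
        conf.addResource(Map.of("mapred.reduce.tasks", "1"), false);  // job.xml: ignored
        System.out.println(conf.get("mapred.reduce.tasks")); // prints 27
    }
}
```

The point being: an admin only needs to know that hadoop-final.xml always wins, rather than needing to know which particular parameters appear in some {{hadoop.client.overrides}} list.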

> Divide the server and client configurations
> -------------------------------------------
>
>                 Key: HADOOP-785
>                 URL: https://issues.apache.org/jira/browse/HADOOP-785
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.9.0
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>             Fix For: 0.15.0
>
>
> The configuration system is easy to misconfigure and I think we need to strongly divide
> the server from the client configs.
> An example of the problem was a configuration where the task tracker has a hadoop-site.xml
> that set mapred.reduce.tasks to 1. Therefore, the job tracker had the right number of reduces,
> but the map task thought there was a single reduce. This led to a hard-to-diagnose failure.
> Therefore, I propose separating out the configuration types as:
> class Configuration;
> // reads site-default.xml, hadoop-default.xml
> class ServerConf extends Configuration;
> // reads hadoop-server.xml, $super
> class DfsServerConf extends ServerConf;
> // reads dfs-server.xml, $super
> class MapRedServerConf extends ServerConf;
> // reads mapred-server.xml, $super
> class ClientConf extends Configuration;
> // reads hadoop-client.xml, $super
> class JobConf extends ClientConf;
> // reads job.xml, $super
> Note in particular, that nothing corresponds to hadoop-site.xml, which overrides both
> client and server configs. Furthermore, the properties from the *-default.xml files should
> never be saved into the job.xml.
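Read literally, the hierarchy proposed above could be sketched as follows (hypothetical code: the constructor chaining is my reading of the "$super" comments, with each subclass's own resource loaded after, and therefore overriding, its superclass's resources):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed hierarchy. Superclass constructors run
// first, so "$super" resources load earlier and each subclass's own resource
// loads last, overriding them. Note nothing reads hadoop-site.xml.
class Configuration {
    final List<String> resources = new ArrayList<>(
            List.of("site-default.xml", "hadoop-default.xml"));
}

class ServerConf extends Configuration {
    ServerConf() { resources.add("hadoop-server.xml"); }
}

class DfsServerConf extends ServerConf {
    DfsServerConf() { resources.add("dfs-server.xml"); }
}

class MapRedServerConf extends ServerConf {
    MapRedServerConf() { resources.add("mapred-server.xml"); }
}

class ClientConf extends Configuration {
    ClientConf() { resources.add("hadoop-client.xml"); }
}

class JobConf extends ClientConf {
    JobConf() { resources.add("job.xml"); }
}
```

So a JobConf would load, in order: site-default.xml, hadoop-default.xml, hadoop-client.xml, job.xml.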

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

