hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-88) Configuration: separate client config from server config (and from other-server config)
Date Wed, 28 Jun 2006 21:11:53 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-88?page=all ]

Doug Cutting updated HADOOP-88:

    Fix Version: 0.5.0
                     (was: 0.4.0)

> Configuration: separate client config from server config (and from other-server config)
> ---------------------------------------------------------------------------------------
>          Key: HADOOP-88
>          URL: http://issues.apache.org/jira/browse/HADOOP-88
>      Project: Hadoop
>         Type: Wish

>   Components: conf
>     Reporter: Michel Tourn
>     Priority: Minor
>      Fix For: 0.5.0

> servers = JobTracker, NameNode, TaskTracker, DataNode
> clients = machines that run JobClient (to submit MapReduce jobs) or DFSShell (to browse the DFS)
> Server machines are administered together.
> So it is OK to keep all server config together (especially file paths and network ports).
> This is stored in hadoop-default.xml or hadoop-mycluster.xml.
> Client machines:
> there may be as many client machines as there are MapRed developers.
> the temp space for DFS needs to be writable by the active user.
> So it should be possible to select the client temp space directory for the machine and for the user.
> (The global /tmp is not an option as discussed elsewhere: partition may be full)
> Current situation: 
> Both the server and the clients have a copy of the server config: hadoop-default.xml
> But the XML property  "dfs.data.dir" is being used as a LOCAL directory path 
> on both the server machines (Data nodes) and the client machines.
> Effect:
> Exception in thread "main" java.io.IOException: No valid local directories in property:
>  at org.apache.hadoop.conf.Configuration.getFile(Configuration.java:286)
>  at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.newBackupFile(DFSClient.java:560)
>  ...
>  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:267)
> Current Workaround:
> On the client use hadoop-site.xml to override dfs.data.dir
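The workaround above can be sketched as a client-side hadoop-site.xml fragment (the directory path is illustrative; substitute any location the active user can write to):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Client-side override: point dfs.data.dir at a per-user writable
       directory, shadowing the shared value in hadoop-default.xml. -->
  <property>
    <name>dfs.data.dir</name>
    <value>/home/mruser/hadoop-tmp/dfs/data</value>
  </property>
</configuration>
```

Because hadoop-site.xml is loaded after hadoop-default.xml, this value takes precedence on the client without touching the server-side config.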
> One proposed solution:
> For the purpose of JobClient operations, use a different property in place of dfs.data.dir.
> (Ex: dfs.client.data.dir) 
> On the client, set this property in hadoop-site.xml so that it will override hadoop-default.xml
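Under this proposal (note: dfs.client.data.dir is the property name suggested here, not one that exists yet), the client's hadoop-site.xml might look like:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Proposed: a client-only property, so the server-side dfs.data.dir
       is never consulted for JobClient operations. Path is illustrative. -->
  <property>
    <name>dfs.client.data.dir</name>
    <value>/home/mruser/hadoop-tmp/dfs/client</value>
  </property>
</configuration>
```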

> Another proposed solution:
> Handle the fact that the world is made of a federation of independent Hadoop systems.
> They can talk to each other (as peers) but they are administered separately.
> Each Hadoop system should have its own separate XML config file.
> Clients should be able to specify the Hadoop system they want to talk to.
> An advantage is that clients can then easily sync their local copy of a given Hadoop system config: just pull its config file.
> In this view of the world, a Job client is also a kind of independent (serverless) Hadoop system.
> In this case the client config file may have its own dfs.data.dir, which is 
> separate from the dfs.data.dir in the server config file.
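A sketch of what such a per-system client config could contain, assuming the federation view (fs.default.name was the property naming the DFS to talk to in this era; host, port, and paths are hypothetical):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Which Hadoop system this client talks to. -->
  <property>
    <name>fs.default.name</name>
    <value>mycluster-namenode.example.com:9000</value>
  </property>
  <!-- The client's own local dfs.data.dir, independent of the value
       in the server system's config file. -->
  <property>
    <name>dfs.data.dir</name>
    <value>/home/mruser/hadoop-tmp/dfs/data</value>
  </property>
</configuration>
```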

This message is automatically generated by JIRA.
If you think it was sent incorrectly, contact one of the administrators:
For more information on JIRA, see:
