Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <19607030.1151529113364.JavaMail.jira@brutus>
Date: Wed, 28 Jun 2006 21:11:53 +0000 (GMT+00:00)
From: "Doug Cutting (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Updated: (HADOOP-88) Configuration: separate client config
 from server config (and from other-server config)
In-Reply-To: <1512399529.1142558161467.JavaMail.jira@ajax>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

     [ http://issues.apache.org/jira/browse/HADOOP-88?page=all ]

Doug Cutting updated HADOOP-88:
-------------------------------

    Fix Version: 0.5.0
                     (was: 0.4.0)

> Configuration: separate client config from server config (and from other-server config)
> ---------------------------------------------------------------------------------------
>
>          Key: HADOOP-88
>          URL: http://issues.apache.org/jira/browse/HADOOP-88
>      Project: Hadoop
>         Type: Wish
>   Components: conf
>     Reporter: Michel Tourn
>     Priority: Minor
>      Fix For: 0.5.0
>
> servers = JobTracker, NameNode, TaskTracker, DataNode
> clients = any machine that runs JobClient (to submit MapReduce jobs) or DFSShell (to browse DFS)
> Server machines are administered together.
> So it is OK to keep all server config together (especially file paths and network ports).
> This is stored in hadoop-default.xml or hadoop-mycluster.xml
> Client machines:
> There may be as many client machines as there are MapRed developers.
> The temp space for DFS needs to be writable by the active user,
> so it should be possible to select the client temp space directory per machine and per user.
> (The global /tmp is not an option, as discussed elsewhere: the partition may be full.)
> Current situation:
> Both the servers and the clients have a copy of the server config, hadoop-default.xml,
> but the XML property "dfs.data.dir" is used as a LOCAL directory path
> on both the server machines (DataNodes) and the client machines.
> Effect:
> Exception in thread "main" java.io.IOException: No valid local directories in property: dfs.data.dir
>  at org.apache.hadoop.conf.Configuration.getFile(Configuration.java:286)
>  at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.newBackupFile(DFSClient.java:560)
>  ...
>  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:267)
> Current Workaround:
> On the client, use hadoop-site.xml to override dfs.data.dir.
> One proposed solution:
> For the purpose of JobClient operations, use a different property in place of dfs.data.dir.
> (Ex: dfs.client.data.dir) 
> On the client, set this property in hadoop-site.xml so that it will override hadoop-default.xml 
> Another proposed solution:
> Handle the fact that the world is made of a federation of independent Hadoop systems.
> They can talk to each other (as peers), but they are administered separately.
> Each Hadoop system should have its own separate XML config file.
> Clients should be able to specify the Hadoop system they want to talk to.
> An advantage is that clients can then easily sync their local copy of a given Hadoop system's config:
>  just pull its config file.
> In this view of the world, a job client is also a kind of independent (serverless) Hadoop system.
> In this case the client config file may have its own dfs.data.dir, which is
> separate from the dfs.data.dir in the server config file.
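
The first proposed solution in the description could be sketched roughly as
follows. This is only an illustration built on plain java.util.Properties, not
on Hadoop's Configuration class. The class ClientConfigSketch, its two helper
methods, and the example paths are hypothetical, and dfs.client.data.dir is
just the name suggested in the description, not an existing property.

```java
import java.util.Properties;

// Hypothetical sketch: layered config lookup with a client-only property.
class ClientConfigSketch {

    // Layer site-specific overrides (the role of hadoop-site.xml) on top of
    // cluster-wide defaults (the role of hadoop-default.xml).
    static Properties layered(Properties defaults, Properties site) {
        // Keys missing from 'merged' are looked up in 'defaults'.
        Properties merged = new Properties(defaults);
        for (String name : site.stringPropertyNames()) {
            merged.setProperty(name, site.getProperty(name));
        }
        return merged;
    }

    // Resolve the local directory a client should use: prefer the proposed
    // dfs.client.data.dir and fall back to dfs.data.dir if it is not set.
    static String clientDataDir(Properties conf) {
        return conf.getProperty("dfs.client.data.dir",
                                conf.getProperty("dfs.data.dir"));
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        // Illustrative server-side path, as shipped with the cluster config.
        defaults.setProperty("dfs.data.dir", "/srv/hadoop/dfs/data");

        Properties site = new Properties();
        // Illustrative per-user client path chosen on the client machine.
        site.setProperty("dfs.client.data.dir",
                System.getProperty("java.io.tmpdir") + "/hadoop-"
                        + System.getProperty("user.name"));

        Properties conf = layered(defaults, site);
        System.out.println("server data dir: " + conf.getProperty("dfs.data.dir"));
        System.out.println("client data dir: " + clientDataDir(conf));
    }
}
```

With this layering the server value of dfs.data.dir stays untouched on the
client, and a client that sets nothing still falls back to dfs.data.dir, so the
current workaround keeps working.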

-- 
This message is automatically generated by JIRA.