Message-ID: <27521000.1154988796532.JavaMail.jira@brutus>
Date: Mon, 7 Aug 2006 15:13:16 -0700 (PDT)
From: "Bryan Pendleton (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-64) DataNode should be capable of managing multiple volumes
In-Reply-To: <96561968.1141686362086.JavaMail.jira@ajax>

    [ http://issues.apache.org/jira/browse/HADOOP-64?page=comments#action_12426343 ]

Bryan Pendleton commented on HADOOP-64:
---------------------------------------

Why do datanodes need to
checkpoint? What's the value of persisting the block mapping versus re-enumerating the blocks at startup time? The namenode doesn't persist which nodes have which blocks, so why should a storage node keep track any more rigorously within its own state? I'd argue that all of that complexity is needless - the cost of maintaining a consistent state is far too high for little benefit.

Please make it very easy to change the block-allocation code. The default behavior of the current code has been causing trouble on my very heterogeneous cluster for a very long time - uniform distribution only makes sense if the same amount of space is available on each drive. In all other cases, it leads straight to unnecessary failures.

I'm not sure about the "blocks considered lost on read-only volumes" bit, but if that implies that the blocks become unavailable, then I think the approach is too heavy-handed. Those blocks might be the only copies, and ignoring them means that the cluster might not be able to find a live copy of a block anywhere else. Please clarify what a "lost" block is.

> DataNode should be capable of managing multiple volumes
> -------------------------------------------------------
>
>                 Key: HADOOP-64
>                 URL: http://issues.apache.org/jira/browse/HADOOP-64
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.2.0
>            Reporter: Sameer Paranjpye
>         Assigned To: Milind Bhandarkar
>            Priority: Minor
>             Fix For: 0.6.0
>
>
> The dfs Datanode can only store data on a single filesystem volume. When a node runs its disks JBOD this means running a Datanode per disk on the machine. While the scheme works reasonably well on small clusters, on larger installations (several hundred nodes) it implies a very large number of Datanodes with associated management overhead in the Namenode.
> The Datanode should be enhanced to be able to handle multiple volumes on a single machine.

--
This message is automatically generated by JIRA.
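
To make the block-allocation point in the comment concrete, here is a minimal sketch of one alternative to uniform distribution: choosing a volume with probability proportional to its free space, and skipping volumes that cannot hold the block. The class name VolumeChooser, the method chooseVolume, and its parameters are hypothetical and exist only for illustration; this is not the Hadoop DataNode code and not what the issue proposes.

```java
import java.util.Random;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class VolumeChooser {

    /**
     * Picks a volume index with probability proportional to its free space.
     * Volumes with less free space than blockSize are never chosen.
     *
     * @param freeBytes free space per volume, indexed by volume number
     * @param blockSize size in bytes of the block to place
     * @param rng       source of randomness
     * @return index of the chosen volume, or -1 if no volume can hold the block
     */
    static int chooseVolume(long[] freeBytes, long blockSize, Random rng) {
        // Total free space across the volumes that can still hold the block.
        long total = 0;
        for (long free : freeBytes) {
            if (free >= blockSize) {
                total += free;
            }
        }
        if (total == 0) {
            return -1;
        }

        // Pick a point in [0, total) and walk the eligible volumes until
        // the running offset falls inside one of them.
        long target = (long) (rng.nextDouble() * total);
        for (int i = 0; i < freeBytes.length; i++) {
            if (freeBytes[i] < blockSize) {
                continue;
            }
            if (target < freeBytes[i]) {
                return i;
            }
            target -= freeBytes[i];
        }

        // Rounding fallback: return the last eligible volume.
        for (int i = freeBytes.length - 1; i >= 0; i--) {
            if (freeBytes[i] >= blockSize) {
                return i;
            }
        }
        return -1;
    }
}
```

With free space of 100, 0 and 300 bytes and a 10-byte block, the second volume is never selected and the third is selected about three times as often as the first. A uniform policy would instead keep sending a third of the writes to the full volume, which is the kind of failure the comment describes on heterogeneous drives.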