Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <1434048.1150307790573.JavaMail.jira@brutus>
Date: Wed, 14 Jun 2006 17:56:30 +0000 (GMT+00:00)
From: "Konstantin Shvachko (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-297) When selecting node to put new block
 on, give priority to those with more free space/less blocks
In-Reply-To: <16202595.1150215869849.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ http://issues.apache.org/jira/browse/HADOOP-297?page=comments#action_12416234 ] 

Konstantin Shvachko commented on HADOOP-297:
--------------------------------------------

Implementing a good weight function is a non-trivial problem.
Still, it would be very good to implement a framework that supports
a prioritized queue of nodes, with a simple weight function (= remaining disk space) for a start.
The function could be fine-tuned later on.
The rebalancing/migration thread might be a separate task.
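
To make the suggestion concrete, here is a minimal sketch of such a framework: a priority queue of nodes ordered by a pluggable weight function, starting with remaining disk space. The names NodeQueueSketch, Node, newQueue and pickTarget are hypothetical and not taken from the Hadoop code base or from the attached patch.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.ToDoubleFunction;

public class NodeQueueSketch {

    /** Hypothetical stand-in for a datanode descriptor; not Hadoop's actual class. */
    public static final class Node {
        public final String name;
        public final long remainingBytes;
        public final int blockCount;

        public Node(String name, long remainingBytes, int blockCount) {
            this.name = name;
            this.remainingBytes = remainingBytes;
            this.blockCount = blockCount;
        }
    }

    /** Builds a queue that polls the node with the largest weight first. */
    public static PriorityQueue<Node> newQueue(ToDoubleFunction<Node> weight) {
        return new PriorityQueue<>(Comparator.<Node>comparingDouble(weight).reversed());
    }

    /** Returns the name of the highest-weighted node, or null if there are no nodes. */
    public static String pickTarget(List<Node> nodes, ToDoubleFunction<Node> weight) {
        PriorityQueue<Node> queue = newQueue(weight);
        queue.addAll(nodes);
        Node best = queue.poll();
        return best == null ? null : best.name;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new Node("small-1", 60L << 30, 900),
                new Node("big-1", 500L << 30, 1200));
        // Starter weight function: remaining disk space only.
        System.out.println(pickTarget(nodes, n -> n.remainingBytes));
    }
}
```

Because the weight function is passed in as a parameter, it could later be swapped for something finer (for example, one that also considers blockCount) without touching the queue logic.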

> When selecting node to put new block on, give priority to those with more free space/less blocks
> ------------------------------------------------------------------------------------------------
>
>          Key: HADOOP-297
>          URL: http://issues.apache.org/jira/browse/HADOOP-297
>      Project: Hadoop
>         Type: Improvement
>
>   Components: dfs
>     Versions: 0.3.2
>     Reporter: Johan Oskarson
>     Priority: Minor
>  Attachments: priorityshuffle_v1.patch
>
> As mentioned in previous bug report:
> We're running a smallish cluster with very different machines, some with only 60 GB hard drives.
> This creates a problem when inserting files into the dfs: these machines run out of space quickly while others have plenty of free space.
> So instead of just shuffling the nodes, I've created a quick patch that first sorts the target nodes by (freespace / blocks).
> It then randomizes the position of the first third of the nodes (so we don't put all the blocks in the file on the same machine).
> I'll let you guys figure out how to improve this.
> /Johan
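
The approach described in the quoted report can be sketched roughly as follows. This is one reading of the description, not the contents of priorityshuffle_v1.patch: the names PriorityShuffleSketch, Node, score and order are hypothetical, the sort is assumed to be descending by (freespace / blocks), "randomizes the position of the first third" is assumed to mean shuffling within that first third, and a node with zero blocks is treated as having one block to avoid dividing by zero.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

public class PriorityShuffleSketch {

    /** Hypothetical stand-in for a datanode descriptor; not Hadoop's actual class. */
    public static final class Node {
        public final String name;
        public final long freeSpace;
        public final int blocks;

        public Node(String name, long freeSpace, int blocks) {
            this.name = name;
            this.freeSpace = freeSpace;
            this.blocks = blocks;
        }
    }

    /** freespace / blocks; an empty node counts as one block (assumption). */
    public static double score(Node n) {
        return (double) n.freeSpace / Math.max(1, n.blocks);
    }

    /** Returns a copy sorted by descending score, with its first third shuffled. */
    public static List<Node> order(List<Node> nodes, Random rnd) {
        List<Node> sorted = new ArrayList<>(nodes);
        sorted.sort(Comparator.<Node>comparingDouble(PriorityShuffleSketch::score).reversed());
        int third = sorted.size() / 3;
        // subList is a view, so shuffling it reorders the head of 'sorted' in place.
        Collections.shuffle(sorted.subList(0, third), rnd);
        return sorted;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("a", 600, 1));
        nodes.add(new Node("b", 500, 1));
        nodes.add(new Node("c", 60, 1));
        for (Node n : order(nodes, new Random())) {
            System.out.println(n.name);
        }
    }
}
```

With this ordering, the emptiest nodes stay near the front of the list, while the shuffle keeps consecutive blocks of one file from always landing on the single best-scoring machine.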

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira