hadoop-common-user mailing list archives

From "Jeff Eastman" <jeast...@collab.net>
Subject RE: DFS Block Allocation
Date Thu, 20 Dec 2007 22:35:09 GMT
Ted,

I'm still learning, obviously. I was not aware one could upload from any
machine other than the master (which did seem overly restrictive), and
uploading from one outside the cloud would be even better. Can you give
me a pointer on how to accomplish this? Is there a relevant FAQ or
document I have missed?

My experience with balancing is similar to yours; the upload is uniform,
independent of disk size or availability. I will try rebalancing.

Thanks,
Jeff

-----Original Message-----
From: Ted Dunning [mailto:tdunning@veoh.com] 
Sent: Thursday, December 20, 2007 12:02 PM
To: hadoop-user@lucene.apache.org
Subject: Re: DFS Block Allocation


Yes.

I try to always upload data from a machine that is not part of the
cluster for exactly that reason.

I still find that I need to rebalance due to a strange problem in
placement. My datanodes have HDFS disks that differ in size by 10x, and I
suspect that the upload is picking datanodes uniformly rather than
according to available space.
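
[Illustration] The difference between the two selection policies suspected
here can be sketched as follows. This is a minimal, self-contained sketch
and not HDFS's actual placement code: the class name PlacementSketch, both
method names, and the freeBytes array are hypothetical and exist only for
this example.

```java
import java.util.Random;

public class PlacementSketch {
    // Uniform policy: every node is equally likely, regardless of free space.
    static int pickUniform(long[] freeBytes, Random rng) {
        return rng.nextInt(freeBytes.length);
    }

    // Weighted policy: a node is chosen with probability proportional to
    // its free space, so a node with no free space is never chosen.
    static int pickWeighted(long[] freeBytes, Random rng) {
        long total = 0;
        for (long f : freeBytes) {
            total += f;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("no free space on any node");
        }
        // Draw a point in [0, total) and walk the nodes until it is covered.
        long target = (long) (rng.nextDouble() * total);
        for (int i = 0; i < freeBytes.length; i++) {
            target -= freeBytes[i];
            if (target < 0) {
                return i;
            }
        }
        // Guard against floating-point rounding at the upper edge.
        return freeBytes.length - 1;
    }
}
```

With disks that differ in size by 10x, the uniform policy fills the small
disks roughly ten times faster in relative terms, which would match the
imbalance described above.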

Oddly enough, my rebalancing code works well.  All it does is iterate
through all files of interest, increasing the replication count for 30
seconds and then decreasing it again (obviously this has to be threaded
to handle more than 2 files per minute).  The replication code seems to
select a home for new blocks more correctly than the original placement.
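
[Illustration] The replication "bounce" described above might look roughly
like the sketch below. It is not the code referred to in this message. The
class ReplicationBounce, the ReplicationSetter interface, and all parameter
names are hypothetical; against a real cluster the interface could be
backed by Hadoop's FileSystem.setReplication(Path, short), which is left
out here so the sketch runs without Hadoop on the classpath.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ReplicationBounce {
    /** Hypothetical hook standing in for the file system's replication call. */
    public interface ReplicationSetter {
        void setReplication(String path, short replication) throws Exception;
    }

    /**
     * For each path: raise the replication count, hold it for holdMillis,
     * then restore the normal count. Paths are processed on a thread pool
     * so the hold time does not serialize the whole run.
     */
    public static void bounce(List<String> paths, short normal, short raised,
                              long holdMillis, int threads, ReplicationSetter fs)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String path : paths) {
            pool.execute(() -> {
                try {
                    fs.setReplication(path, raised);
                    try {
                        // Give the cluster time to create the extra replicas.
                        Thread.sleep(holdMillis);
                    } finally {
                        // Restore the normal count even if the hold is interrupted.
                        fs.setReplication(path, normal);
                    }
                } catch (Exception e) {
                    System.err.println("skipping " + path + ": " + e);
                }
            });
        }
        pool.shutdown();
        // Block until every submitted path has been processed.
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
    }
}
```

A 30-second hold as described would correspond to holdMillis = 30000. With
a single thread that allows only two files per minute, so the thread count
sets the overall throughput.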


On 12/20/07 10:16 AM, "Jeff Eastman" <jeastman@collab.net> wrote:

> Noting your use of the word "attempts", can I conclude that at some point it
> might be impossible to upload blocks from a local file to the DFS on the same
> node and at that point the blocks would all be loaded elsewhere?

