hadoop-mapreduce-user mailing list archives

From Azuryy Yu <azury...@gmail.com>
Subject Re: Question on DFS Balancing
Date Wed, 05 Mar 2014 07:58:58 GMT
Hi,
Applying the patch from 2.x to 0.20.x will probably break something,
although that depends on the details.

AFAIK, the Balancer had a major refactor in HDFSv2, so you had better write
the fix yourself, based on HDFS-1804.
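
To make the idea concrete, here is a minimal sketch of choosing a disk by free space instead of strictly taking turns. It is only an illustration, not the HDFS-1804 patch and not Hadoop code: the `Volume` class and both function names are hypothetical, and it assumes the datanode otherwise places blocks round-robin across its disks.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical stand-in for one datanode disk."""
    path: str
    available_bytes: int


def choose_round_robin(volumes, block_size, start):
    """Take disks in turn, skipping any that cannot hold the block.

    Returns the chosen volume and the index to start from next time.
    """
    n = len(volumes)
    for offset in range(n):
        idx = (start + offset) % n
        if volumes[idx].available_bytes >= block_size:
            return volumes[idx], (idx + 1) % n
    raise IOError("no volume has room for the block")


def choose_most_available(volumes, block_size):
    """Pick the disk with the most free space that can hold the block."""
    candidates = [v for v in volumes if v.available_bytes >= block_size]
    if not candidates:
        raise IOError("no volume has room for the block")
    return max(candidates, key=lambda v: v.available_bytes)


if __name__ == "__main__":
    # Illustrative sizes only: a nearly full first disk and an emptier second one.
    disks = [Volume("/data1", 10), Volume("/data2", 1000)]
    print(choose_most_available(disks, 5).path)
```

With round-robin the nearly full disk keeps receiving every other block until it can no longer fit one, whereas the free-space choice steers new blocks to the emptier disk. Neither approach moves blocks that are already written.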



On Wed, Mar 5, 2014 at 3:47 PM, divye sheth <divs.sheth@gmail.com> wrote:

> Thanks Harsh. The jira is fixed in version 2.1.0, whereas I am using Hadoop
> 0.20.2 (we are in the process of upgrading). Is there a short-term
> workaround to balance the disk utilization? If the patch in the jira is
> applied to the version that I am using, will it break anything?
>
> Thanks
> Divye Sheth
>
>
> On Wed, Mar 5, 2014 at 11:28 AM, Harsh J <harsh@cloudera.com> wrote:
>
>> You're probably looking for
>> https://issues.apache.org/jira/browse/HDFS-1804
>>
>> On Tue, Mar 4, 2014 at 5:54 AM, divye sheth <divs.sheth@gmail.com> wrote:
>> > Hi,
>> >
>> > I am new to the mailing list.
>> >
>> > I am using Hadoop 0.20.2 with an append r1056497 version. The question I
>> > have is related to balancing. I have a 5 datanode cluster and each node has
>> > 2 disks attached to it. The second disk was added when the first disk was
>> > reaching its capacity.
>> >
>> > Now the scenario that I am facing is, when the new disk was added hadoop
>> > automatically moved over some data to the new disk. But over time I
>> > notice that data is no longer being written to the second disk. I have also
>> > faced an issue on the datanode where the first disk had 100% utilization.
>> >
>> > How can I overcome such a scenario? Is it not hadoop's job to balance the
>> > disk utilization between multiple disks on a single datanode?
>> >
>> > Thanks
>> > Divye Sheth
>>
>>
>>
>> --
>> Harsh J
>>
>
>
