hadoop-hdfs-user mailing list archives

From Azuryy Yu <azury...@gmail.com>
Subject Re: Question on DFS Balancing
Date Wed, 05 Mar 2014 10:09:43 GMT
It doesn't need any downtime. It works just like the Balancer, but this tool
moves blocks peer to peer: you specify the source node and the destination
node, then start it.


On Wed, Mar 5, 2014 at 5:12 PM, divye sheth <divs.sheth@gmail.com> wrote:

> Does this require any downtime? I guess it should. Are there any other
> precautions that I should take?
> Thanks Azuryy.
>
>
> On Wed, Mar 5, 2014 at 2:19 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
>
>> You can write a simple tool to move blocks peer to peer. I had such a tool
>> before, but I cannot find it now.
>>
>> Background: our cluster was not balanced and the Balancer was very slow, so I
>> wrote this tool to move blocks from one node to another.
>>
>>
>> On Wed, Mar 5, 2014 at 4:06 PM, divye sheth <divs.sheth@gmail.com> wrote:
>>
>>> I won't be in a position to fix that based on HDFS-1804, as we are
>>> upgrading to CDH4 in the coming month. I just wanted a short-term solution. I
>>> have read somewhere that manually moving the blocks would help. Could
>>> someone guide me to the exact steps or precautions I should take while
>>> doing this? Data loss is a NO NO for me.
>>>
>>> Thanks
>>> Divye Sheth
>>>
>>>
>>> On Wed, Mar 5, 2014 at 1:28 PM, Azuryy Yu <azuryyyu@gmail.com> wrote:
>>>
>>>> Hi,
>>>> That will probably break something if you apply the patch from 2.x to
>>>> 0.20.x, but it depends.
>>>>
>>>> AFAIK, the Balancer had a major refactor in HDFSv2, so you'd better fix it
>>>> yourself based on HDFS-1804.
>>>>
>>>>
>>>>
>>>> On Wed, Mar 5, 2014 at 3:47 PM, divye sheth <divs.sheth@gmail.com> wrote:
>>>>
>>>>> Thanks Harsh. The JIRA is fixed in version 2.1.0, whereas I am using
>>>>> Hadoop 0.20.2 (we are in the process of upgrading). Is there a short-term
>>>>> workaround to balance the disk utilization? Will the patch in the JIRA
>>>>> break anything if applied to the version that I am using?
>>>>>
>>>>> Thanks
>>>>> Divye Sheth
>>>>>
>>>>>
>>>>> On Wed, Mar 5, 2014 at 11:28 AM, Harsh J <harsh@cloudera.com> wrote:
>>>>>
>>>>>> You're probably looking for
>>>>>> https://issues.apache.org/jira/browse/HDFS-1804
>>>>>>
>>>>>> On Tue, Mar 4, 2014 at 5:54 AM, divye sheth <divs.sheth@gmail.com> wrote:
>>>>>> > Hi,
>>>>>> >
>>>>>> > I am new to the mailing list.
>>>>>> >
>>>>>> > I am using Hadoop 0.20.2 with an append r1056497 version. The question I
>>>>>> > have is related to balancing. I have a 5-datanode cluster, and each node
>>>>>> > has 2 disks attached to it. The second disk was added when the first disk
>>>>>> > was reaching its capacity.
>>>>>> >
>>>>>> > The scenario I am facing is this: when the new disk was added, Hadoop
>>>>>> > automatically moved some data over to the new disk. But over time I
>>>>>> > noticed that data is no longer being written to the second disk. I have
>>>>>> > also faced an issue on the datanode where the first disk had 100%
>>>>>> > utilization.
>>>>>> >
>>>>>> > How can I overcome such a scenario? Is it not Hadoop's job to balance the
>>>>>> > disk utilization between multiple disks on a single datanode?
>>>>>> >
>>>>>> > Thanks
>>>>>> > Divye Sheth
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Harsh J
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
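
The thread asks for the exact steps of a manual block move between two disks
on one datanode but never spells them out. The sketch below is one possible
illustration of that idea, not the peer-to-peer tool mentioned above and not
an official Hadoop utility. It rests on several assumptions that should be
verified before use: the datanode process on that host is stopped while files
are moved; block files are named blk_<id> with a companion
blk_<id>_<genstamp>.meta file in the same directory; and both directories
belong to the same datanode. The function names and the example paths are
hypothetical. It defaults to a dry run, moves each block together with its
.meta file, and skips anything that looks incomplete.

```python
import os
import re
import shutil

# Assumed naming: "blk_<id>" for data, "blk_<id>_<genstamp>.meta" for checksums.
BLOCK_RE = re.compile(r"^blk_-?\d+$")


def find_block_pairs(src_dir):
    """Return (block_path, meta_path, size_in_bytes) for each complete pair."""
    names = set(os.listdir(src_dir))
    pairs = []
    for name in sorted(names):
        if not BLOCK_RE.match(name):
            continue
        metas = [m for m in names
                 if m.startswith(name + "_") and m.endswith(".meta")]
        if len(metas) != 1:
            continue  # never move a block without exactly one .meta file
        block = os.path.join(src_dir, name)
        meta = os.path.join(src_dir, metas[0])
        size = os.path.getsize(block) + os.path.getsize(meta)
        pairs.append((block, meta, size))
    return pairs


def move_blocks(src_dir, dst_dir, bytes_to_move, dry_run=True):
    """Move block/meta pairs until about bytes_to_move bytes are relocated.

    Returns (list_of_block_names, total_bytes). With dry_run=True nothing is
    touched; the return value only reports what would be moved.
    """
    moved = []
    total = 0
    for block, meta, size in find_block_pairs(src_dir):
        if total >= bytes_to_move:
            break
        block_dst = os.path.join(dst_dir, os.path.basename(block))
        meta_dst = os.path.join(dst_dir, os.path.basename(meta))
        if os.path.exists(block_dst) or os.path.exists(meta_dst):
            continue  # do not overwrite anything already on the target disk
        if not dry_run:
            shutil.move(meta, meta_dst)
            shutil.move(block, block_dst)
        moved.append(os.path.basename(block))
        total += size
    return moved, total
```

A cautious way to use it would be a dry run first, for example
move_blocks("/data1/dfs/data/current", "/data2/dfs/data/current", 50 * 1024**3)
with paths replaced by the real dfs.data.dir locations, then a review of the
returned list, and only then a second call with dry_run=False. Keeping file
ownership and permissions unchanged and checking the cluster with fsck after
the datanode restarts are sensible extra precautions when data loss is not
acceptable.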
