hadoop-hdfs-issues mailing list archives

From "Walter Su (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-8501) Erasure Coding: Improve memory efficiency of BlockInfoStriped
Date Sat, 30 May 2015 08:42:17 GMT
Walter Su created HDFS-8501:
-------------------------------

             Summary: Erasure Coding: Improve memory efficiency of BlockInfoStriped
                 Key: HDFS-8501
                 URL: https://issues.apache.org/jira/browse/HDFS-8501
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Walter Su
            Assignee: Walter Su



Assume we have a BlockInfoStriped:
{noformat}
triplets[] = {s0, s1, s2, s3}
indices[] = {0, 1, 2, 3}
{noformat}

When the balancer/mover relocates the replica on s2, the block first becomes:
{noformat}
triplets[] = {s0, s1, s2, s3, s2}
indices[] = {0, 1, 2, 3, 2}
{noformat}
Then the original replica on s2 (slot 2) is removed, and the block finally becomes:
{noformat}
triplets[] = {s0, s1, null, s3, s2}
indices[] = {0, 1, -1, 3, 2}
{noformat}

The worst case is:
{noformat}
triplets[] = {null, null, null, null, s0, s1, s2, s3}
indices[] = {-1, -1, -1, -1, 0, 1, 2, 3}
{noformat}
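
The growth described above can be reproduced with a small model. This is only a sketch under simplifying assumptions, not the actual {{BlockInfoStriped}} code: the class {{NaiveStripedSlots}} and its methods are hypothetical, storages are plain strings, every add is assumed to grow the arrays by one slot, and a removal only clears the slot in place.

```java
import java.util.Arrays;

/** Simplified model of the behavior described above; not the real BlockInfoStriped. */
class NaiveStripedSlots {
    // Parallel arrays as in the description: storage per slot and its block index.
    String[] triplets = new String[0];
    int[] indices = new int[0];

    /** Appends the storage in a newly grown slot (assumed simplification). */
    void add(String storage, int blockIndex) {
        triplets = Arrays.copyOf(triplets, triplets.length + 1);
        indices = Arrays.copyOf(indices, indices.length + 1);
        triplets[triplets.length - 1] = storage;
        indices[indices.length - 1] = blockIndex;
    }

    /** Clears the slot in place, leaving a hole that this model never reuses. */
    void removeAt(int slot) {
        triplets[slot] = null;
        indices[slot] = -1;
    }

    public static void main(String[] args) {
        NaiveStripedSlots b = new NaiveStripedSlots();
        for (int i = 0; i < 4; i++) {
            b.add("s" + i, i);
        }
        // Relocate each internal block once: copy first, then drop the old replica.
        for (int i = 0; i < 4; i++) {
            b.add("s" + i, i);
            b.removeAt(i);
        }
        System.out.println(Arrays.toString(b.triplets));
        System.out.println(Arrays.toString(b.indices));
    }
}
```

Under these assumptions, four relocations leave four holes at the front and the arrays at twice their original length, which is the worst case shown above.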


We should learn from {{BlockInfoContiguous.removeStorage(..)}}: when a storage is removed,
the last item is moved forward into the vacated slot.
With this improvement, the worst case becomes:
{noformat}
triplets[] = {s0, s1, s2, s3, null}
indices[] = {0, 1, 2, 3, -1}
{noformat}
That leaves only one empty slot.
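
A minimal sketch of the move-last-item-forward removal, under stated assumptions: {{CompactingStripedSlots}} and its methods are hypothetical names, storages are plain strings, and {{add}} is assumed to reuse the first empty slot before growing. It is not the {{BlockInfoContiguous}} or {{BlockInfoStriped}} implementation.

```java
import java.util.Arrays;

/** Sketch of the proposed compacting removal; hypothetical names, not HDFS code. */
class CompactingStripedSlots {
    String[] triplets = new String[0];
    int[] indices = new int[0];

    /** Reuses the first empty slot if there is one, otherwise grows by one (assumption). */
    void add(String storage, int blockIndex) {
        int slot = 0;
        while (slot < triplets.length && triplets[slot] != null) {
            slot++;
        }
        if (slot == triplets.length) {
            triplets = Arrays.copyOf(triplets, slot + 1);
            indices = Arrays.copyOf(indices, slot + 1);
        }
        triplets[slot] = storage;
        indices[slot] = blockIndex;
    }

    /** Moves the last occupied entry into the vacated slot, then clears the tail slot. */
    void removeAt(int slot) {
        if (slot < 0 || slot >= triplets.length || triplets[slot] == null) {
            return; // nothing stored in this slot
        }
        int last = triplets.length - 1;
        while (triplets[last] == null) {
            last--; // terminates because triplets[slot] is non-null
        }
        triplets[slot] = triplets[last];
        indices[slot] = indices[last];
        triplets[last] = null;
        indices[last] = -1;
    }

    public static void main(String[] args) {
        CompactingStripedSlots b = new CompactingStripedSlots();
        for (int i = 0; i < 4; i++) {
            b.add("s" + i, i);
        }
        // Relocate each internal block once: copy first, then drop the old replica.
        for (int i = 0; i < 4; i++) {
            b.add("s" + i, i);
            b.removeAt(i);
        }
        System.out.println(Arrays.toString(b.triplets));
        System.out.println(Arrays.toString(b.indices));
    }
}
```

Because {{indices[]}} is moved together with {{triplets[]}}, the slot order no longer has to match the block index order; each entry still carries its own block index.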

Notes:
Assume we first copy all 4 storages and then delete the 4 old ones. Even with the improvement,
the worst case could be:
{noformat}
triplets[] = {s0, s1, s2, s3, null, null, null, null}
indices[] = {0, 1, 2, 3, -1, -1, -1, -1}
{noformat}
But the Balancer strategy won't move the same block/blockGroup twice in a row, so this case is
very rare.



