hadoop-mapreduce-issues mailing list archives

From "Scott Chen (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1831) BlockPlacement policy for RAID
Date Thu, 09 Dec 2010 01:10:04 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Chen updated MAPREDUCE-1831:
----------------------------------

    Description: 
Raid introduces a new dependency between the blocks within a file.
The blocks help decode each other, so we should avoid putting them on the same machine.

The proposed BlockPlacementPolicy does the following:
1. When writing parity blocks, it avoids placing parity blocks and source blocks on the same machine.
2. When reducing the replication factor, it deletes the replicas that sit with other dependent blocks.
3. It does not change the way we write normal files. It only behaves differently when processing raid files.
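
The first two rules above can be sketched in plain Java. This is only an illustration, not the attached patch: the class name RaidPlacementSketch, both method names, and the use of node names as strings are hypothetical, and the sketch does not use the real Hadoop BlockPlacementPolicy API. Normal (non-raid) files are assumed to keep going through the default policy, which is not shown.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and signatures are assumptions, not Hadoop APIs.
public class RaidPlacementSketch {

    /**
     * Rule 1: pick a node for a new parity block, preferring a node that
     * holds no block of the same stripe. Falls back to the first candidate
     * when every candidate already holds a stripe block.
     */
    static String chooseParityTarget(List<String> candidateNodes,
                                     Set<String> nodesWithStripeBlocks) {
        for (String node : candidateNodes) {
            if (!nodesWithStripeBlocks.contains(node)) {
                return node;
            }
        }
        return candidateNodes.isEmpty() ? null : candidateNodes.get(0);
    }

    /**
     * Rule 2: when reducing replication, pick the replica to delete: the one
     * on the node that hosts the most dependent blocks. Ties go to the
     * earliest node in the list.
     */
    static String chooseReplicaToDelete(List<String> replicaNodes,
                                        Map<String, Integer> dependentBlocksPerNode) {
        String worst = null;
        int worstCount = -1;
        for (String node : replicaNodes) {
            int count = dependentBlocksPerNode.getOrDefault(node, 0);
            if (count > worstCount) {
                worst = node;
                worstCount = count;
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        Set<String> stripeNodes = new HashSet<>(Arrays.asList("n1", "n2"));
        System.out.println(
            chooseParityTarget(Arrays.asList("n1", "n2", "n3"), stripeNodes));

        Map<String, Integer> dependents = new HashMap<>();
        dependents.put("n1", 2); // n1 also hosts two other blocks of the stripe
        dependents.put("n4", 0);
        System.out.println(
            chooseReplicaToDelete(Arrays.asList("n4", "n1"), dependents));
    }
}
```

Deleting the replica on the most crowded node first is one possible reading of "deletes the blocks that sit with other dependent blocks"; the actual patch may break ties or weigh racks differently.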

  was:
In raid, it is good to have the blocks of the same stripe located on different machines.
This way, when one machine is down, it does not break two blocks on the stripe.
By doing this, we can decrease the block error probability in raid from O(p^3) to O(p^4), which
can be a huge improvement (where p is the replica missing probability).
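
One way to read the O(p^3) versus O(p^4) claim, as a rough sketch under assumptions that are not stated in this issue: each block keeps two replicas after raiding, the stripe can rebuild one lost block (XOR-style parity), and each machine goes missing independently with probability p. Data is then lost only when two blocks A and B of the same stripe both lose all their replicas.

```latex
% Illustrative assumptions: 2 replicas per block (A_1, A_2 and B_1, B_2),
% stripe tolerates 1 lost block, machines fail independently with probability p.
\begin{align*}
\text{co-located } (A_1, B_1 \text{ share machine } M):\quad
  & P(\text{A and B lost}) = P(M \text{ down}) \cdot P(A_2 \text{ down}) \cdot P(B_2 \text{ down}) = p^3 \\
\text{all on separate machines}:\quad
  & P(\text{A and B lost}) = P(A_1, A_2 \text{ down}) \cdot P(B_1, B_2 \text{ down}) = p^2 \cdot p^2 = p^4
\end{align*}
```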

One way to do this is to add a new BlockPlacementPolicy which deletes the replicas
that are co-located.
So when raiding the file, we can make the remaining replicas live on different machines.

        Summary: BlockPlacement policy for RAID  (was: Delete the co-located replicas when raiding file)

> BlockPlacement policy for RAID
> ------------------------------
>
>                 Key: MAPREDUCE-1831
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1831
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/raid
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1831.20100610.txt, MAPREDUCE-1831.txt, MAPREDUCE-1831.v1.1.txt
>
>
> Raid introduces a new dependency between the blocks within a file.
> The blocks help decode each other, so we should avoid putting them on the same machine.
> The proposed BlockPlacementPolicy does the following:
> 1. When writing parity blocks, it avoids placing parity blocks and source blocks on the same machine.
> 2. When reducing the replication factor, it deletes the replicas that sit with other dependent blocks.
> 3. It does not change the way we write normal files. It only behaves differently when processing raid files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

