hadoop-hdfs-issues mailing list archives

From "Zhe Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-8453) Erasure coding: properly assign start offset for internal blocks in a block group
Date Tue, 26 May 2015 03:57:17 GMT

     [ https://issues.apache.org/jira/browse/HDFS-8453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-8453:
----------------------------
    Status: Patch Available  (was: In Progress)

Actually, it is not possible to assign meaningful start offset values to all internal blocks,
especially parity ones. Consider a block group with 1 byte of data. No matter how we set the
start offsets for parity blocks (negative values, etc.), they will overlap with the next block
group in the file.
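
To make the overlap concrete, here is a minimal sketch that applies the formula from the issue description to a block group holding 1 byte. The cell size (64 KB), the 6 data + 3 parity layout, and all class and method names are assumptions chosen for illustration; none of them is the HDFS API.

```java
// Illustrative only: constants and names are assumptions, not HDFS code.
final class StripedOffsetDemo {
    static final int CELL_SIZE = 64 * 1024;  // assumed cell size
    static final int NUM_DATA_BLOCKS = 6;    // assumed layout: 6 data blocks
    static final int NUM_PARITY_BLOCKS = 3;  // assumed layout: 3 parity blocks

    // Start offset as planned in the issue description:
    // positive for data blocks, negated for parity blocks.
    static long proposedStartOffset(long bgStartOffset, int idxInBlockGroup) {
        long offset = bgStartOffset + (long) idxInBlockGroup * CELL_SIZE;
        return idxInBlockGroup < NUM_DATA_BLOCKS ? offset : -offset;
    }

    // True if the magnitude of the offset lies within the byte range
    // [bgStart, bgStart + bgNumBytes) that the block group covers in the file.
    static boolean fallsInsideGroup(long bgStart, long bgNumBytes, long offset) {
        long magnitude = Math.abs(offset);
        return magnitude >= bgStart && magnitude < bgStart + bgNumBytes;
    }

    public static void main(String[] args) {
        long bgStart = 0L;
        long bgNumBytes = 1L; // a block group with 1 byte of data
        int total = NUM_DATA_BLOCKS + NUM_PARITY_BLOCKS;
        for (int idx = 0; idx < total; idx++) {
            long offset = proposedStartOffset(bgStart, idx);
            System.out.printf("idx=%d offset=%d insideGroup=%b%n",
                idx, offset, fallsInsideGroup(bgStart, bgNumBytes, offset));
        }
    }
}
```

Under these assumptions only internal block 0 gets an offset inside the group's own byte range; every other index, data or parity, lands in the range that belongs to whatever follows the group in the file.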

So this patch takes another approach: refactor {{DFSInputStream}} to add a new {{refreshLocatedBlock}}
method, which is called whenever a located block needs to be refreshed, instead of calling
{{getBlockAt}} directly. The refresh method can then be extended in {{DFSStripedInputStream}} with index handling.
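
The shape of that refactoring can be sketched with toy classes. This is not the code in the attached patch: every type below is a hypothetical stand-in, the {{generation}} field is just a placeholder for whatever information a refresh updates, and the idea that an internal block keeps its group's start offset is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a located block; not the real HDFS class.
final class ToyLocatedBlock {
    final long startOffset;     // offset of the block (or block group) in the file
    final int idxInBlockGroup;  // -1 when the entry is not an internal block
    final long generation;      // placeholder for the data a refresh updates

    ToyLocatedBlock(long startOffset, int idxInBlockGroup, long generation) {
        this.startOffset = startOffset;
        this.idxInBlockGroup = idxInBlockGroup;
        this.generation = generation;
    }
}

// Toy base stream: refreshing is routed through one overridable method.
class ToyInputStream {
    protected final Map<Long, ToyLocatedBlock> current = new HashMap<>();

    // Looks up the latest entry registered at the given start offset.
    protected ToyLocatedBlock getBlockAt(long offset) {
        ToyLocatedBlock lb = current.get(offset);
        if (lb == null) {
            throw new IllegalStateException("no block at offset " + offset);
        }
        return lb;
    }

    // Default refresh: the offset alone identifies the block.
    protected ToyLocatedBlock refreshLocatedBlock(ToyLocatedBlock stale) {
        return getBlockAt(stale.startOffset);
    }
}

// Toy striped stream: the offset finds the group, the index picks the block.
class ToyStripedInputStream extends ToyInputStream {
    @Override
    protected ToyLocatedBlock refreshLocatedBlock(ToyLocatedBlock stale) {
        ToyLocatedBlock group = getBlockAt(stale.startOffset);
        return new ToyLocatedBlock(
            group.startOffset, stale.idxInBlockGroup, group.generation);
    }
}

final class RefreshDemo {
    public static void main(String[] args) {
        ToyStripedInputStream in = new ToyStripedInputStream();
        in.current.put(0L, new ToyLocatedBlock(0L, -1, 2L)); // latest group info
        ToyLocatedBlock stale = new ToyLocatedBlock(0L, 7, 1L); // a parity block
        ToyLocatedBlock fresh = in.refreshLocatedBlock(stale);
        System.out.printf("idx=%d generation=%d%n",
            fresh.idxInBlockGroup, fresh.generation);
    }
}
```

The point of the sketch is that callers only ever invoke the refresh method, so the striped subclass can carry the index across a refresh without the offset having to encode it.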

> Erasure coding: properly assign start offset for internal blocks in a block group
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-8453
>                 URL: https://issues.apache.org/jira/browse/HDFS-8453
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8453-HDFS-7285.00.patch
>
>
> {{LocatedBlock#offset}} should indicate the "offset of the first byte of the block in
> the file". In a striped block group, we should properly assign this {{offset}} for internal
> blocks, so each internal block can be identified from a given offset.
> My current plan is to keep using {{bg.getStartOffset() + idxInBlockGroup * cellSize}}
> as the start offset for data blocks. For parity blocks, use {{-1 * (bg.getStartOffset() +
> idxInBlockGroup * cellSize)}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
