hadoop-hdfs-issues mailing list archives

From "Rakesh R (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-10460) Erasure Coding: Recompute block checksum for a particular range less than file size on the fly by reconstructing missed block
Date Sun, 19 Jun 2016 16:34:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15338600#comment-15338600 ]

Rakesh R edited comment on HDFS-10460 at 6/19/16 4:33 PM:
----------------------------------------------------------

Thanks [~drankye] for the review comments.

bq. 1. Could you explain why we need to add actualNumBytes for this, or ellaborate some bit
in the description for better understanding
I've used the {{actualNumBytes}} parameter so that the block is reconstructed correctly. Initially
I tried using the {{requestLength}} value for the reconstruction, but that produced the exception
below. IIUC this can occur when the requested length conflicts with the target buffer size. You
can probably reproduce the exception by applying my patch, commenting out the line that sets
{{actualNumBytes}}, and then running {{TestFileChecksum#testStripedFileChecksumWithMissedDataBlocksRangeQuery1}}:
{code}
BlockChecksumHelper.java
line no#481

      ExtendedBlock reconBlockGroup = new ExtendedBlock(blockGroup);
      // reconBlockGroup.setNumBytes(actualNumBytes);
{code}
{code}
2016-06-19 21:37:34,583 [DataXceiver for client /127.0.0.1:5882 [Getting checksum for block
groupBP-1490511527-10.252.155.196-1466352430600:blk_-9223372036854775792_1001]] ERROR datanode.DataNode
(DataXceiver.java:run(316)) - 127.0.0.1:5333:DataXceiver error processing BLOCK_GROUP_CHECKSUM
operation  src: /127.0.0.1:5882 dst: /127.0.0.1:5333
org.apache.hadoop.HadoopIllegalArgumentException: No enough valid inputs are provided, not
recoverable
	at org.apache.hadoop.io.erasurecode.rawcoder.ByteBufferDecodingState.checkInputBuffers(ByteBufferDecodingState.java:107)
{code}
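
To make the intent explicit, here is a small sketch (based on the snippet above, not verbatim patch code) of keeping the block group's real length for the reconstruction; the requested range is only applied later, when the checksum is computed:
{code}
// Hedged sketch: give the block group to be reconstructed its actual
// on-disk length so the decoder receives complete input buffers.
ExtendedBlock reconBlockGroup = new ExtendedBlock(blockGroup);
reconBlockGroup.setNumBytes(actualNumBytes);
{code}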

I have taken the following approach to handle a {{requestLength}} smaller than {{cellSize}}:
first reconstruct the buffers using {{actualNumBytes}}, then take a copy of the target buffer
limited to the remaining (requested) length, and finally calculate the checksum over that copied buffer.
{code}
StripedBlockChecksumReconstructor.java
line no#93

if (requestedLen <= toReconstructLen) {
  int remainingLen = (int) requestedLen;
  outputData = Arrays.copyOf(targetBuffer.array(), remainingLen);
{code}
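
For illustration, a self-contained sketch of that idea (the class, method names, and the use of {{java.util.zip.CRC32}} are my assumptions; the datanode uses its own checksum machinery):
{code}
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

public class RangeChecksumSketch {
  /**
   * The reconstructed data sits in targetBuffer (built using the block's
   * actual length); only the requested prefix contributes to the checksum.
   */
  static long checksumRequestedRange(ByteBuffer targetBuffer,
      long requestedLen, int toReconstructLen) {
    // Copy only the bytes that belong to the requested range.
    int copyLen = requestedLen <= toReconstructLen
        ? (int) requestedLen : toReconstructLen;
    byte[] outputData = Arrays.copyOf(targetBuffer.array(), copyLen);

    CRC32 crc = new CRC32(); // illustrative only
    crc.update(outputData, 0, outputData.length);
    return crc.getValue();
  }
}
{code}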


bq. 1) you mean less than bytesPerCRC, but in fact you passed bytesPerCRC as the request length.
2) you could get bytesPerCRC and save it in setup method? So you can use it in other tests.
Yes, I will make these modifications in the next patch.
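
As a rough sketch of that test change (field and method names are illustrative, not the actual {{TestFileChecksum}} code):
{code}
// Hedged sketch: read bytes-per-checksum once during test setup and reuse it
// as the request length in the range-query tests.
private static int bytesPerCRC;

@BeforeClass
public static void setup() {
  Configuration conf = new HdfsConfiguration();
  // dfs.bytes-per-checksum defaults to 512
  bytesPerCRC = conf.getInt("dfs.bytes-per-checksum", 512);
}
{code}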


> Erasure Coding: Recompute block checksum for a particular range less than file size on
the fly by reconstructing missed block
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10460
>                 URL: https://issues.apache.org/jira/browse/HDFS-10460
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-10460-00.patch, HDFS-10460-01.patch
>
>
> This jira is a follow-on task to HDFS-9833, addressing reconstruction of a missed block and then recalculating the block checksum for a particular range query.
> For example,
> {code}
> // create a file 'stripedFile1' with fileSize = cellSize * numDataBlocks = 65536 * 6 = 393216
> FileChecksum stripedFileChecksum = getFileChecksum(stripedFile1, 10, true);
> {code}


