hadoop-hdfs-issues mailing list archives

From "Vinayakumar B (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10625) VolumeScanner to report why a block is found bad
Date Thu, 28 Jul 2016 05:46:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15396985#comment-15396985 ]

Vinayakumar B commented on HDFS-10625:
--------------------------------------

bq. We can add a catch block here to catch the IOException thrown, then include the replica
information and throw a new IO exception, e.g:
One problem here is that callers which expect a specific exception, such as {{ChecksumException}}
or {{FileNotFoundException}}, would instead get an IOException whose cause is the ChecksumException
or FNFE.
So it is better not to change this. Let the original IOException be thrown back. The DN logs
will still be there to capture the replica details.
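
To illustrate the concern, here is a minimal, self-contained sketch that uses only java.io classes.
{{WrappedExceptionDemo}}, {{readOriginal}}, {{readWrapped}} and {{classify}} are hypothetical names,
not Hadoop code, and {{FileNotFoundException}} stands in for any specific IOException subtype:

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class WrappedExceptionDemo {

    // Small functional interface so both read variants can be passed to classify().
    interface Op {
        void run() throws IOException;
    }

    // Throws the specific exception type directly, as the unmodified code would.
    static void readOriginal() throws IOException {
        throw new FileNotFoundException("meta file missing");
    }

    // Hypothetical variant: wraps the original exception to attach replica details.
    static void readWrapped() throws IOException {
        try {
            readOriginal();
        } catch (IOException e) {
            throw new IOException("replica details (hypothetical)", e);
        }
    }

    // A caller that treats FileNotFoundException differently from other IOExceptions.
    static String classify(Op op) {
        try {
            op.run();
            return "ok";
        } catch (FileNotFoundException e) {
            return "handled FileNotFoundException";
        } catch (IOException e) {
            Throwable cause = e.getCause();
            String causeName = (cause == null) ? "none" : cause.getClass().getSimpleName();
            return "generic IOException, cause=" + causeName;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(WrappedExceptionDemo::readOriginal));
        System.out.println(classify(WrappedExceptionDemo::readWrapped));
    }
}
```

With the wrapped variant, the caller's {{catch (FileNotFoundException e)}} branch is skipped and the
generic branch runs, even though the original exception is still reachable through {{getCause()}}.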

bq. Looks like we can make this replica a member of BlockSender instead of a local variable
here, so that we can refer to it when needed, such as for this jira. We probably should make
replicaVisibleLength a member and report it as part of the replica info too, since when the
writing is going on, this value may be changing concurrently.
Making ReplicaInfo a member is good, but making {{replicaVisibleLength}} a member may not
be required, because {{endOffset}} is already present and indicates how much BlockSender
intended to read. So {{endOffset}} can be used whenever that information is required.
As for checksum verification, BlockSender verifies checksums only for finalized blocks, via
VolumeScanner, and not on the read path (for reads, verification happens at the client).
So we can expect the replica to be finalized in this case, with no change in the visible length.

So I feel the changes required for the latest patch are: combine it with HDFS-10626, make replicaInfo
a member, and use it to construct the ChecksumException message.
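
A rough sketch of that shape, under stated assumptions: {{ReplicaSketch}}, {{ChecksumErrorSketch}} and
{{BlockSenderSketch}} are hypothetical stand-ins written for illustration and are not the real
ReplicaInfo, ChecksumException or BlockSender classes; the field names and message format are
likewise assumptions.

```java
import java.io.IOException;

// Hypothetical stand-in for the datanode's replica metadata; not the real ReplicaInfo.
class ReplicaSketch {
    final long blockId;
    final String state;
    final String volume;

    ReplicaSketch(long blockId, String state, String volume) {
        this.blockId = blockId;
        this.state = state;
        this.volume = volume;
    }

    @Override
    public String toString() {
        return "blk_" + blockId + " [state=" + state + ", volume=" + volume + "]";
    }
}

// Hypothetical checksum failure type that records the failing position.
class ChecksumErrorSketch extends IOException {
    final long pos;

    ChecksumErrorSketch(String message, long pos) {
        super(message);
        this.pos = pos;
    }
}

class BlockSenderSketch {
    // The replica is kept as a member (not a constructor-local variable)
    // so that later error paths can refer to it.
    private final ReplicaSketch replica;
    // How far this sender intends to read; reported instead of a separate visible length.
    private final long endOffset;

    BlockSenderSketch(ReplicaSketch replica, long endOffset) {
        this.replica = replica;
        this.endOffset = endOffset;
    }

    // Compares the stored and computed checksum of one chunk and, on mismatch,
    // throws the specific exception type with replica details in its message.
    void verifyChunk(long chunkOffset, long expected, long actual) throws ChecksumErrorSketch {
        if (expected != actual) {
            throw new ChecksumErrorSketch(
                "Checksum mismatch at offset " + chunkOffset + " of " + replica
                    + ", endOffset=" + endOffset,
                chunkOffset);
        }
    }

    public static void main(String[] args) {
        BlockSenderSketch sender =
            new BlockSenderSketch(new ReplicaSketch(42L, "FINALIZED", "/d/dfs/dn"), 1024L);
        try {
            sender.verifyChunk(512L, 0xCAFEL, 0xBEEFL);
        } catch (ChecksumErrorSketch e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the specific exception type is thrown directly rather than wrapped, callers that catch it by
type keep working, and the replica details travel in the message.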

>  VolumeScanner to report why a block is found bad
> -------------------------------------------------
>
>                 Key: HDFS-10625
>                 URL: https://issues.apache.org/jira/browse/HDFS-10625
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, hdfs
>            Reporter: Yongjun Zhang
>            Assignee: Rushabh S Shah
>              Labels: supportability
>         Attachments: HDFS-10625-1.patch, HDFS-10625.patch
>
>
> VolumeScanner may report:
> {code}
> WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad blk_1170125248_96458336 on /d/dfs/dn
> {code}
> It would be helpful to report the reason why the block is bad and, especially when the block
> is corrupt, where the first corrupted chunk in the block is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


