hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7930) commitBlockSynchronization() does not remove locations, which were not confirmed
Date Fri, 13 Mar 2015 19:28:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360974#comment-14360974 ]

Konstantin Shvachko commented on HDFS-7930:

I saw it with truncate in HDFS-7886, [described in this comment|https://issues.apache.org/jira/browse/HDFS-7886?focusedCommentId=14360903&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14360903].
But it can also happen with a regular block recovery, particularly when DNs are restarting
during the recovery.
If recovery is started for an under-construction (UC) block that already has locations from
live DNs, then recovery may succeed for only some of those locations, because the others have,
e.g., a different length.
But {{commitBlockSynchronization()}} will not remove the unconfirmed locations. The locations
will be invalidated by the next block report and then replicated correctly, but until then
reads may see different data or fail.
Will post a failing test once HDFS-7886 is in.
Marked it as a blocker for 2.7.0. Feel free to unmark if it is not.
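
To make the expected behavior concrete, here is a minimal sketch of pruning a block's location list down to the replicas that recovery confirmed. It is illustrative only and is not the NameNode code: the class {{LocationPruneSketch}}, the methods {{prune()}} and {{unconfirmed()}}, and the use of plain strings for DataNode identifiers are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: hypothetical names, not the NameNode implementation. */
class LocationPruneSketch {

  /** Returns the current locations that recovery did NOT confirm. */
  static List<String> unconfirmed(List<String> current, List<String> newTargets) {
    Set<String> confirmed = new LinkedHashSet<>(newTargets);
    List<String> stale = new ArrayList<>();
    for (String dn : current) {
      if (!confirmed.contains(dn)) {
        stale.add(dn); // candidate for removal from the block's locations
      }
    }
    return stale;
  }

  /** Keeps only the confirmed locations, preserving their original order. */
  static List<String> prune(List<String> current, List<String> newTargets) {
    List<String> kept = new ArrayList<>(current);
    kept.retainAll(new LinkedHashSet<>(newTargets));
    return kept;
  }

  public static void main(String[] args) {
    // Assumed scenario: three replicas existed, recovery confirmed two of them.
    List<String> current = Arrays.asList("dn1", "dn2", "dn3");
    List<String> newTargets = Arrays.asList("dn1", "dn3");
    System.out.println("kept:  " + prune(current, newTargets));
    System.out.println("stale: " + unconfirmed(current, newTargets));
  }
}
```

In this scenario {{dn2}} is the unconfirmed location. The comment above says such a location currently stays attached to the block until the next block report invalidates it.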

> commitBlockSynchronization() does not remove locations, which were not confirmed
> --------------------------------------------------------------------------------
>                 Key: HDFS-7930
>                 URL: https://issues.apache.org/jira/browse/HDFS-7930
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.0
>            Reporter: Konstantin Shvachko
>            Priority: Blocker
> When {{commitBlockSynchronization()}} has fewer {{newTargets}} than the original block has locations,
it does not remove the unconfirmed locations. As a result, the block keeps locations
whose replicas have different lengths or genStamps (corrupt).

This message was sent by Atlassian JIRA
