hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-1467) TestPipelines failing on trunk
Date Tue, 16 Nov 2010 01:21:15 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-1467:

    Attachment: hdfs-1467.txt

It turns out there's a really big bug in pipeline construction for append... I'm not sure
why more tests aren't failing more often!

Currently, BlockReceiver's constructor copies the reference to the provided block and then
updates its generation stamp to {{newGs}}. But this is a reference to the same Block object
used in DataXceiver.opWriteBlock. So when the DN tries to set up the append pipeline to its
downstream mirror, it ends up passing the block with the new generation stamp already
applied, and the downstream mirror fails to construct a pipeline.
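
To make the aliasing concrete, here is a minimal sketch in plain Java. It does not use the real HDFS classes: SimpleBlock, AliasingReceiver and stampSentDownstream are hypothetical stand-ins for Block, BlockReceiver and the mirror setup in DataXceiver.opWriteBlock, and the stamp values are made up for illustration.

```java
public class AliasingDemo {
    // Hypothetical stand-in for the HDFS Block class: an id plus a mutable generation stamp.
    static final class SimpleBlock {
        private final long blockId;
        private long generationStamp;

        SimpleBlock(long blockId, long generationStamp) {
            this.blockId = blockId;
            this.generationStamp = generationStamp;
        }

        long getGenerationStamp() { return generationStamp; }
        void setGenerationStamp(long gs) { this.generationStamp = gs; }
    }

    // Hypothetical stand-in for the receiver: it keeps the caller's reference.
    static final class AliasingReceiver {
        private final SimpleBlock block;

        AliasingReceiver(SimpleBlock block, long newGs) {
            this.block = block;                   // same object as the caller's
            this.block.setGenerationStamp(newGs); // mutation is visible to the caller
        }
    }

    // Returns the generation stamp the caller would forward to the downstream
    // mirror after constructing the local receiver.
    static long stampSentDownstream(long oldGs, long newGs) {
        SimpleBlock block = new SimpleBlock(1L, oldGs);
        new AliasingReceiver(block, newGs);
        return block.getGenerationStamp();
    }

    public static void main(String[] args) {
        // The caller still thinks it holds the old block, but the stamp has changed.
        System.out.println(stampSentDownstream(1001L, 1002L)); // prints 1002
    }
}
```

In this sketch the caller never touches the stamp itself, yet the value it would send downstream is already the new one.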

The end result is that we can never successfully construct a pipeline for append with more
than one datanode in it! This test would fail about 2/3 of the time since it would see the
old replica on one of the nodes that didn't make it into the append pipeline.

This patch fixes the BlockReceiver constructor to take a copy of the block, and TestPipelines
seems to pass now. I also added some extra debug logging just to illustrate the problem.
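
The general shape of that kind of fix, a defensive copy in the constructor, can be sketched as follows. This is not the attached patch: SimpleBlock, CopyingReceiver and stampsAfterSetup are hypothetical stand-ins, and the copy constructor is an assumption made for the example.

```java
public class CopyingDemo {
    // Hypothetical stand-in for the HDFS Block class, with a copy constructor.
    static final class SimpleBlock {
        private final long blockId;
        private long generationStamp;

        SimpleBlock(long blockId, long generationStamp) {
            this.blockId = blockId;
            this.generationStamp = generationStamp;
        }

        // Copy constructor: the new object shares no mutable state with the original.
        SimpleBlock(SimpleBlock other) {
            this(other.blockId, other.generationStamp);
        }

        long getGenerationStamp() { return generationStamp; }
        void setGenerationStamp(long gs) { this.generationStamp = gs; }
    }

    // Hypothetical stand-in for the receiver: it works on its own private copy.
    static final class CopyingReceiver {
        private final SimpleBlock block;

        CopyingReceiver(SimpleBlock inBlock, long newGs) {
            this.block = new SimpleBlock(inBlock); // defensive copy
            this.block.setGenerationStamp(newGs);  // only the copy is updated
        }

        long localGenerationStamp() { return block.getGenerationStamp(); }
    }

    // Returns {caller's stamp, receiver's stamp} after the receiver is constructed.
    static long[] stampsAfterSetup(long oldGs, long newGs) {
        SimpleBlock callerBlock = new SimpleBlock(1L, oldGs);
        CopyingReceiver receiver = new CopyingReceiver(callerBlock, newGs);
        return new long[] { callerBlock.getGenerationStamp(), receiver.localGenerationStamp() };
    }

    public static void main(String[] args) {
        long[] stamps = stampsAfterSetup(1001L, 1002L);
        System.out.println("caller=" + stamps[0] + " receiver=" + stamps[1]);
    }
}
```

With the copy in place, the caller's block keeps the old generation stamp while the receiver's private copy carries the new one.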

This probably should not be committed until we can add another test which shows the problem
100% of the time.

> TestPipelines failing on trunk
> ------------------------------
>                 Key: HDFS-1467
>                 URL: https://issues.apache.org/jira/browse/HDFS-1467
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>         Attachments: failed-TestPipelines.txt, hdfs-1467.txt
> TestPipelines appears to be failing on trunk:
> Should be RBW replica after sequence of calls append()/write()/hflush() expected:<RBW> but was:<FINALIZED>
> junit.framework.AssertionFailedError: Should be RBW replica after sequence of calls append()/write()/hflush() expected:<RBW> but was:<FINALIZED>
>         at org.apache.hadoop.hdfs.TestPipelines.pipeline_01(TestPipelines.java:109)

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
