hadoop-hdfs-issues mailing list archives

From "Vinayakumar B (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size
Date Wed, 03 Aug 2016 14:15:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405968#comment-15405968
] 

Vinayakumar B commented on HDFS-8498:
-------------------------------------

We experienced this in one of our test clusters under high load.

*Scenario:*
The error occurred for an HBase RegionServer WAL file.
1. In HBase, multiple threads perform the write, sync and close of the same WAL
file.
2. The actual writer writes the entries, multiple syncers call hsync on the same stream, and a roller
thread rolls the WALs at regular intervals, i.e. it closes the current WAL file and opens another
one for the next entries.
3. During the file close() by the roller, the last block was committed with a smaller size than the one present on
all DNs.
4. All IBRs reported by the DNs have a greater length than the length COMMITTED by the client,
so all those replicas are marked as CORRUPT.
5. We use IBR batching with {{dfs.namenode.file.close.num-committed-allowed=1}}, so the client (HBase-RS)
did not experience any problem: the file was closed successfully without waiting for the correct
IBR for the last block.
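
To make step 4 concrete, here is a minimal sketch of the length comparison as I understand it. It is only a simplified model: {{CommittedLengthCheck}}, {{judge}} and {{countCorrupt}} are hypothetical names, not NameNode code, and the real check also considers generation stamps and block state.

```java
/** Simplified, hypothetical model of judging reported replicas against a committed block. */
final class CommittedLengthCheck {
    enum ReplicaState { OK, CORRUPT }

    /** A reported replica whose length differs from the committed length is treated as corrupt. */
    static ReplicaState judge(long committedLength, long reportedLength) {
        return committedLength == reportedLength ? ReplicaState.OK : ReplicaState.CORRUPT;
    }

    /** Counts how many reported replicas would be marked corrupt. */
    static int countCorrupt(long committedLength, long[] reportedLengths) {
        int corrupt = 0;
        for (long reported : reportedLengths) {
            if (judge(committedLength, reported) == ReplicaState.CORRUPT) {
                corrupt++;
            }
        }
        return corrupt;
    }

    public static void main(String[] args) {
        // Assumed example values: the client commits 100 bytes, all three DNs hold 128 bytes.
        System.out.println(countCorrupt(100L, new long[] {128L, 128L, 128L}));
    }
}
```

With a committed length smaller than what every DN holds, every replica fails the comparison, which matches what we saw: all replicas of the last block marked CORRUPT.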

*Current Analysis:*

HDFS-9289 safeguarded the re-assignment of {{DataStreamer#block}} during pipeline update by
making it volatile, but it did not actually protect the contents of the {{block}}.

*Suspected problem is:*
1. ResponseProcessor updates the block size after receiving every
ack by calling {{ExtendedBlock.setNumBytes()}}, which internally updates the numBytes of the internal
{{block}}; this update is not thread-safe.
2. LogRoller calls close, passing {{DataStreamer#block}} as the last block. At this point, the
GUESS is that {{ExtendedBlock.getNumBytes()}} does not return the latest value written by
ResponseProcessor and instead returns an earlier value, because ExtendedBlock and
its internal block are not thread-safe.
With this smaller size, the block gets COMMITTED at the NameNode and all IBRs get marked
as CORRUPT.

*Possible solution:*
Make ExtendedBlock thread-safe for setNumBytes() and getNumBytes().
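
A rough sketch of what a thread-safe length could look like, assuming the fix is limited to making the numBytes field safely published between threads. {{SharedBlockLength}} is a hypothetical stand-in, not the actual {{ExtendedBlock}} class, and an {{AtomicLong}} is only one option (a volatile long or synchronized accessors would serve the same purpose).

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a block whose length is shared between threads. */
final class SharedBlockLength {
    // AtomicLong guarantees that a read sees the most recent value written by any thread.
    private final AtomicLong numBytes = new AtomicLong();

    /** Called by the ack-processing thread after each acknowledged packet. */
    void setNumBytes(long len) {
        numBytes.set(len);
    }

    /** Called by the closing thread when it reports the length of the last block. */
    long getNumBytes() {
        return numBytes.get();
    }

    /** Simulates an ack thread advancing the length, followed by a closer reading it. */
    static long simulate(long packets, long packetSize) {
        SharedBlockLength block = new SharedBlockLength();
        Thread acker = new Thread(() -> {
            for (long i = 1; i <= packets; i++) {
                block.setNumBytes(i * packetSize);
            }
        });
        acker.start();
        try {
            acker.join(); // the close is issued only after all acks have arrived
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return block.getNumBytes();
    }

    public static void main(String[] args) {
        // Assumed example values: 1000 packets of 64 KB each.
        System.out.println(simulate(1000, 64 * 1024));
    }
}
```

Note that the {{join()}} in this demo already synchronizes the two threads, so it only exercises the accessors and does not reproduce the suspected race. In the real client the closing thread has no such guarantee, which is where the atomic (or volatile) field would matter.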

If the above analysis makes sense, then we can raise one Jira and contribute the fix.

Note:
We hit this issue three times on a 40-core/380GB-RAM machine. We are trying to reproduce it again with more
logging, but have had no luck so far.
It was reproduced once with DEBUG logs enabled, and those logs confirm that the complete() call
is sent only after all ACKs have been received. However, the DEBUG logs had no information about the numBytes
sent during complete(), so we could not actually verify that this would be the fix.

> Blocks can be committed with wrong size
> ---------------------------------------
>
>                 Key: HDFS-8498
>                 URL: https://issues.apache.org/jira/browse/HDFS-8498
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.5.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>
> When an IBR for a UC block arrives, the NN updates the expected location's block and
replica state _only_ if it's on an unexpected storage for an expected DN.  If it's for an
expected storage, only the genstamp is updated.  When the block is committed, and the expected
locations are verified, only the genstamp is checked.  The size is not checked but it wasn't
updated in the expected locations anyway.
> A faulty client may misreport the size when committing the block.  The block is effectively
corrupted.  If the NN issues replications, the received IBR is considered corrupt, the NN
invalidates the block, immediately issues another replication.  The NN eventually realizes
all the original replicas are corrupt after full BRs are received from the original DNs.



