hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDDS-676) Enable Read from open Containers via Standalone Protocol
Date Fri, 19 Oct 2018 21:19:00 GMT

    [ https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657454#comment-16657454 ]

Anu Engineer commented on HDDS-676:

1. message GetBlockRequestProto {
required DatanodeBlockID blockID = 1;
*required* uint64 blockCommitSequenceId = 2;

A question rather than a comment. If we make this a required field, and we get to a datanode
and want to read a block -- say a case where we say "give me any version of the data you
have" -- how does that work? Should we make this optional? It also means that if you have
written a block using 0.2.1, we cannot read it via this protocol. It is an Alpha release, so
we don't have to be strictly backward compatible; just a thought.
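For illustration only, the optional variant suggested above might look like this in the .proto (the default value is my assumption, not something from the patch):

```proto
message GetBlockRequestProto {
  required DatanodeBlockID blockID = 1;
  // optional with a default of 0 would let "give me any version" reads
  // and 0.2.1-era clients omit the field entirely
  optional uint64 blockCommitSequenceId = 2 [default = 0];
}
```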

2. More a question of code style: should we have bcsId inside blockID or not, e.g. in the BlockManager?

3. BlockManagerImpl.java#putBlock 
long blockCommitSequenceId = data.getBlockCommitSequenceId();
long blockCommitSequenceIdValue = getBlockCommitSequenceId(db);

My apologies, but this code gives me a headache. I understand the context and the why, but
a function with the variables *blockCommitSequenceId*, *blockCommitSequenceIdValue* and *blockCommitSequenceIdKey*
causes a stack overflow for me :(

4. BlockManagerImpl.java#getBlock:Line 155
I am asking mostly to make sure that I understand this correctly. We look up the max container
bcsid first, and only later read the block's bcsid. I am guessing we do this because we are
counting on containerBCSId being cached, so the check does not cause real disk I/O and lets
us fail faster?

5. ContainerProtocolCalls.java#readSmallFile:Line 394
// by default, set the bcsId to be 0

This does not match the earlier logic in putBlock. I am presuming the putSmallFile call
was committed via Ratis. Please correct me if I am wrong.

// default blockCommitSequenceId for any block is 0. If the putBlock
// request is not coming via Ratis (for test scenarios), it will be 0.
// In such cases, we should overwrite the block as well.
if (blockCommitSequenceId != 0) {
  if (blockCommitSequenceId <= blockCommitSequenceIdValue) {
    // Since the blockCommitSequenceId stored in the db is greater than or
    // equal to the blockCommitSequenceId to be updated, it means the putBlock
    // transaction is reapplied in the ContainerStateMachine on restart.
    // It also implies that the given block must already exist in the db.
    // Just log and return.
    LOG.warn("blockCommitSequenceId " + blockCommitSequenceIdValue
        + " in the Container Db is greater than the supplied value "
        + blockCommitSequenceId + ". Ignoring it");
    return data.getSize();
  }
}

6. XceiverClientGrpc.java#sendCommand: can we please separate trying different datanodes
from connecting and reading data? That would make this two functions: one that connects
and reads, and another that tries all nodes until we get a successful read.

There is also an issue that is not taken care of in this code path. What happens if a block
is deleted? If a datanode says it does not have this block, do we still try all 3 datanodes?
I am presuming we don't have to deal with it since it is an edge case.
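To make the suggestion in 6 concrete, here is a rough sketch of the split (ReplicaReader, readFromAnyReplica, and readOne are made-up names for illustration, not the actual XceiverClientGrpc API):

```java
import java.util.List;
import java.util.function.Function;

// Illustrative only: one method does a single connect-and-read against one
// datanode, another loops over all replicas until a read succeeds.
final class ReplicaReader {
  static <D, R> R readFromAnyReplica(List<D> datanodes, Function<D, R> readOne) {
    RuntimeException lastFailure = null;
    for (D dn : datanodes) {
      try {
        return readOne.apply(dn);   // single-node connect-and-read
      } catch (RuntimeException e) {
        lastFailure = e;            // remember and fail over to the next node
      }
    }
    throw lastFailure != null
        ? lastFailure
        : new IllegalStateException("no datanodes supplied");
  }
}
```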

7. Nit: the fact that we have to maintain dnIndex and size might be a good indication that
we need an old-school for loop at line 154 instead of a for-each loop.

8. Nit: XceiverClientRatis#watchForCommit - if this is a test-only function, perhaps it
should live in a test helper file?

> Enable Read from open Containers via Standalone Protocol
> --------------------------------------------------------
>                 Key: HDDS-676
>                 URL: https://issues.apache.org/jira/browse/HDDS-676
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>         Attachments: HDDS-676.001.patch, HDDS-676.002.patch, HDDS-676.003.patch, HDDS-676.004.patch
> With BlockCommitSequenceId getting updated per block commit on open containers in OM
> as well as the datanode, Ozone Client reads can go through the Standalone protocol, not
> necessarily requiring Ratis. The client should verify the BCSID of the container which has
> the data block, which should always be greater than or equal to the BCSID of the block to
> be read, and the existing block BCSID should exactly match that of the block to be read.
> As a part of this, the client can try to read from a replica with a supplied BCSID and
> fail over to the next one in case the block does not exist on one replica.
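The verification rule in the issue summary can be sketched as a simple predicate (class, method, and parameter names are mine, for illustration only):

```java
// Illustrative predicate for the read rule above: the replica's container
// BCSID must be >= the requested block BCSID, and the BCSID stored with the
// block must match the requested one exactly.
final class BcsIdCheck {
  static boolean replicaCanServe(long containerBcsId, long storedBlockBcsId,
      long requestedBcsId) {
    return containerBcsId >= requestedBcsId
        && storedBlockBcsId == requestedBcsId;
  }
}
```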

This message was sent by Atlassian JIRA
