hadoop-common-dev mailing list archives

From "B. X." <bxi...@gmail.com>
Subject Re: datanode ack behavior for block receive
Date Wed, 04 Nov 2009 20:36:47 GMT
I guess you are right. It doesn't affect correctness, or even performance. The
only performance question I am not sure about is whether we can safely assume
that all the status messages and seqno ack messages travel in one TCP packet.
Given that there probably won't be long chains of datanodes, the total ack
message size should be well under the system buffer size, so this should be
true. (Here I assume that even though datanodes use non-blocking SocketChannel
writes/reads, each individual SocketChannel.write is still buffered at the
system level before it goes out to the network.)
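
To put a rough number on the size argument above, here is a small
back-of-the-envelope sketch. The field widths (an 8-byte seqno plus one 2-byte
status per datanode) and the 1460-byte TCP payload size are assumptions made
for illustration; they are not stated in this thread and should be checked
against the actual wire format.

```python
# Rough size of one pipeline ack as it reaches the client.
# ASSUMPTION: the ack is an 8-byte seqno followed by one 2-byte status code
# per datanode in the pipeline. These widths are illustrative only.
SEQNO_BYTES = 8
STATUS_BYTES = 2
TYPICAL_MSS = 1460  # ASSUMPTION: common TCP payload size on Ethernet


def ack_size(num_datanodes: int) -> int:
    """Bytes in the ack sent upstream by the first datanode in the chain."""
    return SEQNO_BYTES + STATUS_BYTES * num_datanodes


def fits_in_one_segment(num_datanodes: int, mss: int = TYPICAL_MSS) -> bool:
    """True if the whole ack is no larger than one TCP segment payload."""
    return ack_size(num_datanodes) <= mss


for n in (1, 3, 10):
    print(n, ack_size(n), fits_in_one_segment(n))
```

Under these assumptions a three-node pipeline produces a 14-byte ack, far
below one segment. Note that TCP is a byte stream and does not preserve
message boundaries, so a small ack will usually arrive in one read, but the
receiver should still loop until it has all the bytes it expects.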


From: Dhruba Borthakur <dhruba@gmail.com>
To: common-dev@hadoop.apache.org
Date: Mon, 2 Nov 2009 22:31:03 -0800
Subject: Re: datanode ack behavior for block receive
Hi Bin,

I think that your observation is correct. Sending a SUCCESS status ack could
be avoided by intelligently looking at the seqno. However, my opinion is that
returning the extra bit of information does not impact performance or
correctness at all. Do you agree?


On Mon, Nov 2, 2009 at 12:39 PM, B. X. <bxin33@gmail.com> wrote:

> Hi All,
>  I observed that there are two kinds of ack'ing going on when a
> datanode receives a data block packet: 1. acking by sending the sequence
> number of the received block to the upstream datanode; 2. also sending the
> operation status (e.g. SUCCESS, ERROR).
>  The seqno is chained, that is, a node will not ack the seqno unless
> it has received the same seqno from downstream, or a -2 is sent to
> indicate not receiving anything from downstream datanodes.
> The status is forwarded, with the number of such messages increasing by
> one at each hop upstream.
>   My question is why the seqno ack mechanism alone is not sufficient
> in this case.  Are status acks really needed?
> -Bin
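
The two ack kinds described in the question can be modeled with a short
sketch. This is a toy model of the behavior as described in this thread, not
the datanode code: the function name, the status values, and the choice of
which statuses accompany a -2 are all hypothetical.

```python
from typing import List, Optional, Tuple

SUCCESS, ERROR = 0, 1      # hypothetical status codes
NO_DOWNSTREAM_ACK = -2     # sentinel seqno mentioned in the thread

Ack = Tuple[int, List[int]]  # (seqno, statuses ordered from this node downstream)


def build_upstream_ack(
    received_seqno: int,
    own_status: int,
    has_downstream: bool,
    downstream_ack: Optional[Ack] = None,
) -> Optional[Ack]:
    """Return the ack a node sends upstream, or None if it must keep waiting."""
    if not has_downstream:
        # Last node in the pipeline: ack the seqno with its own status only.
        return received_seqno, [own_status]
    if downstream_ack is None:
        # Nothing arrived from downstream: send -2 instead of the seqno.
        # ASSUMPTION: only this node's status is attached in that case.
        return NO_DOWNSTREAM_ACK, [own_status]
    down_seqno, down_statuses = downstream_ack
    if down_seqno != received_seqno:
        # Chaining rule: do not ack until downstream acks the same seqno.
        return None
    # Forward the downstream statuses and add one, so the list grows per hop.
    return received_seqno, [own_status] + down_statuses


# Three-node pipeline, packet seqno 7, every node succeeds.
dn3 = build_upstream_ack(7, SUCCESS, has_downstream=False)
dn2 = build_upstream_ack(7, SUCCESS, True, dn3)
dn1 = build_upstream_ack(7, SUCCESS, True, dn2)
print(dn1)
```

In this model, when every status is SUCCESS the client learns nothing from the
status list that the matching seqno did not already tell it, which is the
redundancy the question points at. The statuses only add information when
some node reports something other than SUCCESS.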
