hadoop-hdfs-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Giridhar Addepalli <giridhar1...@gmail.com>
Subject Doubt Regarding QJM protocol - example 2.10.6 of Quorum-Journal Design document
Date Sun, 28 Sep 2014 09:17:53 GMT
Hi All,

I am going through Quorum Journal Design document.

It is mentioned in Section 2.8 - In Accept Recovery RPC section
If the current on-disk log is missing, or a *different length *than the
proposed recovery, the JN downloads the log from the provided URI,
replacing any current copy of the log segment.

I can see it that the code follows above design

Source :: Journal.java

  public synchronized void acceptRecovery(RequestInfo reqInfo,
      SegmentStateProto segment, URL fromUrl)
      throws IOException {

      if (currentSegment == null ||
        currentSegment.getEndTxId() != segment.getEndTxId()) {
      } else {
      LOG.info("Skipping download of log " +
          TextFormat.shortDebugString(segment) +
          ": already have up-to-date logs");

My question is what if on-disk log is present and is of *same length *as
the proposed recovery

If JournalNode is skipping download because the logs are of same length,
then we could end up in a situation where finalized log segments contain
different data !

This could happen if we follow example 2.10.6

As per that example we write transactions (151-153 ) on JN1
then when recovery proceeded with only JN2 & JN3 let us assume that we
write again *different transactions* as (151-153) . Then after the crash
when we run recovery , JN1 will skip downloading correct segment from
JN2/JN3 as it thinks it has correct segment( as per the code pasted above).
This will result in a situation where finalized segment ( edits_151-153 )
on JN1 is different from finalized segment edits_151-153 on JN2/JN3.

Please let me know if i have gone wrong some where, and this situation is
taken care of.


View raw message