hadoop-hdfs-user mailing list archives

From Ulul <had...@ulul.org>
Subject Re: Doubt Regarding QJM protocol - example 2.10.6 of Quorum-Journal Design document
Date Sun, 28 Sep 2014 21:09:31 GMT
Hi

A developer should answer this, but a quick look at an edit file with od 
suggests that records are not fixed length. So perhaps the likelihood of 
the situation you describe is low enough that there is no need to check 
more than the file size.

Ulul

On 28/09/2014 11:17, Giridhar Addepalli wrote:
> Hi All,
>
> I am going through Quorum Journal Design document.
>
> It is mentioned in Section 2.8 - In Accept Recovery RPC section
> "
> If the current on-disk log is missing, or a different length than 
> the proposed recovery, the JN downloads the log from the provided URI, 
> replacing any current copy of the log segment.
> "
>
> I can see that the code follows the above design:
>
> Source :: Journal.java
>              ....
>
>       public synchronized void acceptRecovery(RequestInfo reqInfo,
>           SegmentStateProto segment, URL fromUrl)
>           throws IOException {
>
>           ....
>           if (currentSegment == null ||
>             currentSegment.getEndTxId() != segment.getEndTxId()) {
>           ....
>           } else {
>           LOG.info("Skipping download of log " +
>               TextFormat.shortDebugString(segment) +
>               ": already have up-to-date logs");
>           }
>           ....
>       }
>     ....
>
> My question is: what if the on-disk log is present and is of the same 
> length as the proposed recovery?
>
> If the JournalNode skips the download because the logs are of the same 
> length, then we could end up in a situation where finalized log 
> segments contain different data!
>
> This could happen if we follow example 2.10.6
>
> As per that example, we write transactions (151-153) on JN1.
> Then, when recovery proceeds with only JN2 & JN3, let us assume that 
> different transactions are written again as (151-153). After the 
> crash, when we run recovery, JN1 will skip downloading the correct 
> segment from JN2/JN3 because it thinks it already has the correct 
> segment (as per the code pasted above). This will result in a situation 
> where the finalized segment (edits_151-153) on JN1 is different from 
> the finalized segment edits_151-153 on JN2/JN3.
>
> Please let me know if I have gone wrong somewhere, or whether this 
> situation is already taken care of.
>
> Thanks,
> Giridhar.
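
To make the scenario in the question concrete, here is a minimal, self-contained sketch. It is not the actual Journal.java code: the Segment class, needsDownload and sameContent are hypothetical names introduced only for illustration. needsDownload mirrors the end-txid condition quoted above, and sameContent is an assumed stricter comparison that also looks at the bytes. The sketch only shows that two segments covering txids 151-153 with different contents pass an end-txid check; it says nothing about whether other parts of the protocol rule this case out.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SegmentCheckSketch {

    /** Hypothetical stand-in for a log segment: a txid range plus raw bytes. */
    static final class Segment {
        final long startTxId;
        final long endTxId;
        final byte[] data;

        Segment(long startTxId, long endTxId, byte[] data) {
            this.startTxId = startTxId;
            this.endTxId = endTxId;
            this.data = data;
        }
    }

    /**
     * Mirrors the condition in the quoted excerpt: download only when the
     * local segment is missing or its end txid differs from the proposal.
     */
    static boolean needsDownload(Segment current, Segment proposed) {
        return current == null || current.endTxId != proposed.endTxId;
    }

    /** Assumed stricter check (not from the source): compare range and bytes. */
    static boolean sameContent(Segment a, Segment b) {
        return a.startTxId == b.startTxId
            && a.endTxId == b.endTxId
            && Arrays.equals(a.data, b.data);
    }

    public static void main(String[] args) {
        // Same txid range 151-153 on both sides, but different payloads.
        Segment onJn1 = new Segment(151, 153,
            "tx-a tx-b tx-c".getBytes(StandardCharsets.UTF_8));
        Segment proposed = new Segment(151, 153,
            "tx-x tx-y tx-z".getBytes(StandardCharsets.UTF_8));

        // The end-txid check sees nothing to download ...
        System.out.println("needsDownload = " + needsDownload(onJn1, proposed)); // false
        // ... even though the contents differ.
        System.out.println("sameContent   = " + sameContent(onJn1, proposed));   // false
    }
}
```

With a missing local segment (null) or a different end txid, needsDownload returns true, which corresponds to the download branch in the quoted excerpt.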

