hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Two kinds of data inconsistency
Date Tue, 27 May 2014 16:57:19 GMT
Unifying mvcc and seqId would pave the way for solving such
inconsistencies.
See HBASE-8763.

Cheers


On Tue, May 27, 2014 at 3:16 AM, 冯宏华 <fenghonghua@xiaomi.com> wrote:

> I'm not sure whether this has already been discussed; apologies for
> re-raising it if so.
>
> 1. Data inconsistency between master and peer clusters:
>   a). Write a keyvalue KV1 at a specific coordinate (row, cf, col, ts)
> with value V1 to master cluster A. Because a scanner is active during the
> flush, KV1's memstoreTS is not set to 0 in the resulting hfile F1.
>   b). Write KV1 again at the same coordinate (row, cf, col, ts) but with a
> different value V2. No scanner is active during this flush, so KV1's
> memstoreTS is set to 0 in the resulting hfile F2.
>   c). Both versions of KV1 are replicated serially to the peer cluster. No
> scanner is active during the flushes there, and they are flushed to two
> different hfiles, both with memstoreTS=0.
>
>   Now a client reading KV1 from the master cluster gets V1 (since its
> memstoreTS is larger), while a client reading KV1 from the peer cluster
> gets V2 (since the memstoreTS values are equal but the latter's seqID is
> larger).
>
> 2. Data inconsistency at different points in time:
>   a). Write a keyvalue KV1 at a specific coordinate (row, cf, col, ts)
> with value V1 to master cluster A. Because a scanner is active during the
> flush, KV1's memstoreTS is not set to 0 in the resulting hfile F1.
>   b). Write KV1 again at the same coordinate (row, cf, col, ts) but with a
> different value V2. No scanner is active during this flush, so KV1's
> memstoreTS is set to 0 in the resulting hfile F2.
>
>   Reading KV1 at this point returns V1 (since its memstoreTS is
> larger).
>
>   c). After a while, a compaction that includes F1 (but not F2) occurs,
> and KV1's memstoreTS is set to 0 since no scanner is active.
>
>   Reading KV1 at this point returns V2 (since the memstoreTS values are
> equal but the latter's seqID is larger).
>
> Keeping mvcc untouched throughout a keyvalue's whole lifecycle (across
> flush/compaction and failover/HLog replay) would avoid both kinds of
> data inconsistency. Any opinions?
>
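
The two scenarios quoted above can be reproduced with a small toy model.
This is only a sketch under stated assumptions, not HBase code: the names
Cell, flush, compact and read are hypothetical, and the tie-break rule
(larger memstoreTS wins, then larger seqID) is taken from the description
in the quoted message. The same mvcc numbers are reused on the peer purely
for simplicity.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cell:
    value: str
    memstore_ts: int  # mvcc number; 0 once it has been reset
    seq_id: int       # sequence ID of the hfile that holds the cell


def flush(value, mvcc, seq_id, scanner_active, keep_mvcc=False):
    """Model a flush: memstoreTS survives only if a scanner is active
    (behaviour described in the thread) or keep_mvcc is set (the proposal)."""
    kept = mvcc if (scanner_active or keep_mvcc) else 0
    return Cell(value, kept, seq_id)


def compact(cell, scanner_active=False, keep_mvcc=False):
    """Model a compaction rewriting one cell; seq_id is left unchanged."""
    if scanner_active or keep_mvcc:
        return cell
    return replace(cell, memstore_ts=0)


def read(cells):
    """Pick the winner among cells with identical (row, cf, col, ts)."""
    return max(cells, key=lambda c: (c.memstore_ts, c.seq_id)).value


# Scenario 1: master vs. peer cluster.
master = [flush("V1", 5, 1, scanner_active=True),
          flush("V2", 6, 2, scanner_active=False)]
peer = [flush("V1", 5, 1, scanner_active=False),
        flush("V2", 6, 2, scanner_active=False)]
master_read, peer_read = read(master), read(peer)

# Scenario 2: the same master cluster after F1 alone is compacted.
later_read = read([compact(master[0]), master[1]])

# Proposal: mvcc is never reset, so reads agree before and after compaction.
fixed = [flush("V1", 5, 1, scanner_active=False, keep_mvcc=True),
         flush("V2", 6, 2, scanner_active=False, keep_mvcc=True)]
fixed_read = read(fixed)
fixed_later_read = read([compact(fixed[0], keep_mvcc=True), fixed[1]])

print(master_read, peer_read, later_read, fixed_read, fixed_later_read)
```

In this model the master first returns V1 while the peer returns V2, and the
master itself switches to V2 once F1 is compacted. With mvcc preserved, the
later write wins in every case.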
