hbase-dev mailing list archives

From 冯宏华 <fenghong...@xiaomi.com>
Subject Two kinds of data inconsistency
Date Tue, 27 May 2014 10:16:34 GMT
I'm not sure whether this has already been discussed; apologies for raising it again if so.

1. Data inconsistency between master and peer clusters:
  a). Write a KeyValue KV1 at a specific coordinate (row, cf, col, ts) with value V1 to master
cluster A. Because a scanner is active during the flush, KV1's memstoreTS is not set to 0
in the resulting hfile F1.
  b). Write KV1 again at the same coordinate (row, cf, col, ts) but with a different
value V2. No scanner is active during this flush, so KV1's memstoreTS is set to 0 in
the resulting hfile F2.
  c). Both versions of KV1 are replicated to the peer cluster serially. No scanner is active
during the flushes there, so they are flushed to two different hfiles, both with memstoreTS=0.

  Now a client reading KV1 from the master cluster gets V1 (since its memstoreTS
is larger), while a client reading KV1 from the peer cluster gets V2 (since the memstoreTS
values are equal but the latter's seqID is larger).
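
The scenario above can be sketched with a toy model. This is not HBase's actual comparator, only the ordering rule described in this message: for cells at the same coordinate, the larger memstoreTS wins and ties are broken by the larger seqID. The class `StoredCell`, the function `visible_value`, and all numeric values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCell:
    """One persisted version of KV1 at a fixed (row, cf, col, ts) coordinate."""
    value: str
    memstore_ts: int  # mvcc number kept in the hfile (0 if it was reset at flush)
    seq_id: int       # sequence id of the hfile that holds the cell


def visible_value(cells):
    # Ordering assumed from the message: larger memstoreTS first,
    # then larger seqID as the tie-breaker.
    return max(cells, key=lambda c: (c.memstore_ts, c.seq_id)).value


# Master: F1 kept a non-zero memstoreTS (scanner active), F2 was reset to 0.
master = [StoredCell("V1", memstore_ts=5, seq_id=1),
          StoredCell("V2", memstore_ts=0, seq_id=2)]

# Peer: both flushes happened without scanners, so both are reset to 0.
peer = [StoredCell("V1", memstore_ts=0, seq_id=1),
        StoredCell("V2", memstore_ts=0, seq_id=2)]

master_read = visible_value(master)
peer_read = visible_value(peer)
print(master_read, peer_read)
```

Under these assumptions the two clusters disagree on the same coordinate even though they received the same writes in the same order.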

2. Data inconsistency across different points in time:
  a). Write a KeyValue KV1 at a specific coordinate (row, cf, col, ts) with value V1 to master
cluster A. Because a scanner is active during the flush, KV1's memstoreTS is not set to
0 in the resulting hfile F1.
  b). Write KV1 again at the same coordinate (row, cf, col, ts) but with a different
value V2. No scanner is active during this flush, so KV1's memstoreTS is set to 0 in
the resulting hfile F2.

  Reading KV1 now returns V1 (since its memstoreTS is larger).

  c). After a while, a compaction that includes F1 (but not F2) occurs, and KV1's memstoreTS
in the compacted file is set to 0 since no scanner is active.

  Reading KV1 now returns V2 (since the memstoreTS values are equal but the latter's seqID
is larger).
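
The same toy ordering rule (larger memstoreTS wins, then larger seqID) is enough to reproduce this second case on a single cluster. As a simplifying assumption, the compacted file here keeps F1's seqID; `StoredCell`, `read`, `compact_without_scanners`, and the numbers are hypothetical names and values, not HBase code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StoredCell:
    value: str
    memstore_ts: int  # mvcc number kept in the hfile (0 if it was reset)
    seq_id: int       # sequence id of the hfile that holds the cell


def read(cells):
    # Larger memstoreTS wins; equal memstoreTS falls back to larger seqID.
    return max(cells, key=lambda c: (c.memstore_ts, c.seq_id)).value


def compact_without_scanners(cell):
    # Models a compaction with no active scanner: memstoreTS is reset to 0.
    # Assumption: the rewritten cell keeps the seqID of its input file.
    return replace(cell, memstore_ts=0)


f1 = StoredCell("V1", memstore_ts=7, seq_id=10)  # flushed while a scanner was active
f2 = StoredCell("V2", memstore_ts=0, seq_id=11)  # flushed with no scanner

read_before_compaction = read([f1, f2])
read_after_compaction = read([compact_without_scanners(f1), f2])
print(read_before_compaction, read_after_compaction)
```

No write happens between the two reads, yet the visible value flips once the compaction clears F1's memstoreTS.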

Keeping a KeyValue's mvcc untouched for its whole lifecycle (through flush/compaction, and failover/HLog replay)
would avoid both kinds of data inconsistency described above. Any opinions?
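
A rough sketch of why preserving mvcc would help, using the same toy ordering rule as in the scenarios (larger mvcc wins, then larger seqID). It assumes that mvcc numbers grow with write order within a cluster, so the later write of V2 always carries the larger number; `rewrite_preserving_mvcc` is a hypothetical stand-in for a flush or compaction that copies mvcc unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCell:
    value: str
    mvcc: int    # mvcc number, never reset in this model
    seq_id: int  # sequence id of the hfile that holds the cell


def read(cells):
    # Larger mvcc wins; seqID only matters when mvcc values are equal.
    return max(cells, key=lambda c: (c.mvcc, c.seq_id)).value


def rewrite_preserving_mvcc(cell, new_seq_id):
    # Hypothetical flush/compaction step: the cell moves to another file
    # but its mvcc number is copied as-is.
    return StoredCell(cell.value, cell.mvcc, new_seq_id)


# Master: V2 is written after V1, so it is assumed to get the larger mvcc.
v1 = StoredCell("V1", mvcc=5, seq_id=10)
v2 = StoredCell("V2", mvcc=6, seq_id=11)
master_before = read([v1, v2])
master_after = read([rewrite_preserving_mvcc(v1, new_seq_id=12), v2])

# Peer: assigns its own mvcc numbers, in replication (i.e. write) order.
peer_read = read([StoredCell("V1", mvcc=3, seq_id=1),
                  StoredCell("V2", mvcc=4, seq_id=2)])
print(master_before, master_after, peer_read)
```

In this model every read returns V2, regardless of which files have been flushed or compacted and regardless of which cluster serves the read.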