kudu-user mailing list archives

From Ananth G <ananthg.a...@gmail.com>
Subject Time travel reads in Kudu
Date Sat, 17 Jun 2017 20:40:27 GMT
Hello All,

I was wondering if the following is possible as a time travel read in Kudu. 

Assume T stands for the timestamp at which a record is committed. I have one insert
for a given row at T1, followed by three updates at timestamps T2, T3 and T4. Finally, the row
was deleted at T5 (T1 < T2 < T3 < T4 < T5 in terms of timestamps). Representing
the values of this row as V, the row goes through the following states:

T1 -> V1 ( original insert )
T2 -> V2 ( first update )
T3 -> V3 ( second update ) 
T4 -> V4 ( third update )
T5 -> V5 ( Tombstone/delete ) 

Now I want to perform a read scan. I am using the READ_AT_SNAPSHOT read mode and the
setSnapShotMicros() method to perform the read at a given snapshot. I was wondering whether
I can get the following values, given these snapshot times:

1. Can I get value V2 if I set the snapshot time to t2, where T2 < t2 < T3?
2. Can I get value V3 if I set the snapshot time to t3, where T3 < t3 < T4?
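
The two cases can be sketched with a toy Python model of a single row's version history. This is not Kudu code and does not call any Kudu API. The numeric timestamps, the `history` list and the `read_at_snapshot` helper are all hypothetical, and the visibility rule (return the latest version committed at or before the snapshot time) is the usual MVCC snapshot rule, assumed here for illustration rather than confirmed for Kudu.

```python
from bisect import bisect_right

# Hypothetical commit timestamps (in microseconds) standing in for T1..T5.
T1, T2, T3, T4, T5 = 100, 200, 300, 400, 500
TOMBSTONE = None  # stands in for the delete at T5

# Version history of one row, ordered by commit timestamp.
history = [(T1, "V1"), (T2, "V2"), (T3, "V3"), (T4, "V4"), (T5, TOMBSTONE)]


def read_at_snapshot(history, snapshot_ts):
    """Return the value visible at snapshot_ts, or None if the row is absent.

    Assumed rule: the visible version is the last one whose commit
    timestamp is less than or equal to snapshot_ts.
    """
    commit_times = [ts for ts, _ in history]
    # Index of the last version committed at or before snapshot_ts.
    idx = bisect_right(commit_times, snapshot_ts) - 1
    if idx < 0:
        return None  # the snapshot predates the original insert
    return history[idx][1]  # None when that version is the tombstone


print(read_at_snapshot(history, 250))  # case 1: T2 < t2 < T3
print(read_at_snapshot(history, 350))  # case 2: T3 < t3 < T4
```

Under that assumed rule, case 1 resolves to V2 and case 2 to V3, and a snapshot taken after T5 sees no row at all. Whether Kudu retains the older versions needed to answer such reads is exactly what is being asked here.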

It also seems that, for this to work properly, we would need two timestamps as part of the API
call (a lower and an upper bound) to retrieve value V2. Kudu's use of the term MVCC is what
prompted this question.

In other words, when we say Kudu offers MVCC for its data, does that cover all versions
of a mutated row, or only the reconciliation stage? I am assuming it covers only the
last stage of reconciliation (i.e. until reads are fully committed). Since timestamps in
Kudu seem to act as lower-bound markers, the above might not be possible, but I wanted to
check with the community.

If my assumption is wrong and this is possible, does the model still hold after a compaction is performed?
