kudu-user mailing list archives

From Ananth Gundabattula <agundabatt...@gmail.com>
Subject Re: Time travel reads in Kudu
Date Mon, 19 Jun 2017 10:49:07 GMT
Thanks a lot Todd. The gist clarifies the point. 

I think this is a really powerful feature of Kudu that is being undersold :). Some very interesting
data analytics can be performed with these features using Kudu as the store. For example, use cases
of the kind described at http://pachyderm.io/pfs.html are now enabled by Kudu as well.

Thanks for the clarification. I hope to integrate this feature into the Apache Apex read scanner
mechanisms soon.

Regards,
Ananth 
> On 19 Jun 2017, at 7:36 am, Todd Lipcon <todd@cloudera.com> wrote:
> 
> Just to illustrate, I wrote a quick python script that shows the behavior: https://gist.github.com/toddlipcon/385fcf4211f83e4968be3401db3147ba
> 
> The script runs your scenario of insert, update, update, delete, and then scans at each
> of the times between the operations. The output on my machine (running against a local tserver)
> is:
> scan at datetime.datetime(2017, 6, 18, 21, 35, 20, 594427): [(1, 'v1')]
> scan at datetime.datetime(2017, 6, 18, 21, 35, 20, 595743): [(1, 'v2')]
> scan at datetime.datetime(2017, 6, 18, 21, 35, 20, 597093): [(1, 'v3')]
> scan at datetime.datetime(2017, 6, 18, 21, 35, 20, 598470): []
> 
> Note that this example script relies on the local clock instead of the propagated
> timestamps, so it might not work correctly against a cluster (the server side may have clock
> skew relative to the local machine where the script is running). If you need it to work correctly
> in the presence of clock skew, you'll have to use the more advanced APIs to retrieve propagated
> timestamps from the server side after each write.
> 
> -Todd
> 
> 
> On Sun, Jun 18, 2017 at 1:36 PM, Todd Lipcon <todd@cloudera.com> wrote:
> Hi Ananth,
> 
> Answers inline below
> 
> On Sat, Jun 17, 2017 at 1:40 PM, Ananth G <ananthg.apex@gmail.com> wrote:
> Hello All,
> 
> I was wondering if the following is possible as a time travel read in Kudu.
> 
> Assuming T stands for the timestamp at which the record has been committed, I have one
> insert for a given row at T1, followed by 3 updates at timestamps T2, T3 and T4. Finally the
> row was deleted at T5 (T1 < T2 < T3 < T4 < T5 in terms of timestamps). Representing
> the values of this row as V, the following is the state of the row's values over time.
> 
> T1 -> V1 ( original insert )
> T2 -> V2 ( first update )
> T3 -> V3 ( second update )
> T4 -> V4 ( third update )
> T5 -> V5 ( Tombstone/delete )
> 
> Now I want to perform a read scan. I am using the READ_AT_SNAPSHOT mode and the setSnapShotMicros()
> method to perform the read at that snapshot. I was wondering if I would have the flexibility
> to get the following values, provided I use the following snapshot times:
> 
> 1. Can I get value V2 if I set the snapshot time to t2, provided T2 < t2 < T3?
>
> yes
>
> 2. Can I get value V3 if I set the snapshot time to t3, provided T3 < t3 < T4?
>
> yes
>  
> Also, it is obvious that for this to work properly we will need two timestamps as part of
> the API call (a lower and an upper bound) to retrieve value V2. The usage of the word MVCC is
> interesting, hence this question.
> 
> I'm not following what you mean by a lower and upper bound timestamp. The READ_AT_SNAPSHOT
> setting means that you read the state of the table exactly as it was at the provided time.
> So, if you provide a time in between T2 and T3, you will see the value that was most recently
> committed before the specified time (i.e., the value written at T2).
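
The snapshot semantics described above can be illustrated with a small in-memory model. This is only a sketch of the visibility rule, not Kudu's storage format or client API: the `RowHistory` class, its methods, and the integer timestamps are all hypothetical, and treating a snapshot time exactly equal to a commit time as "visible" is a choice made for the sketch only (the thread discusses only snapshot times strictly between commits).

```python
import bisect

_TOMBSTONE = object()  # marker for a delete in the toy history


class RowHistory:
    """Toy version history for a single row (hypothetical, not Kudu's format)."""

    def __init__(self):
        self._commit_ts = []  # commit timestamps, strictly increasing
        self._values = []     # value written at the matching timestamp

    def write(self, ts, value):
        """Record an insert or update committed at ts."""
        if self._commit_ts and ts <= self._commit_ts[-1]:
            raise ValueError("commit timestamps must be strictly increasing")
        self._commit_ts.append(ts)
        self._values.append(value)

    def delete(self, ts):
        """Record a delete committed at ts."""
        self.write(ts, _TOMBSTONE)

    def read_at(self, snapshot_ts):
        """Return the value visible at snapshot_ts, or None if no live row."""
        # Count the commits whose timestamp is <= snapshot_ts.
        idx = bisect.bisect_right(self._commit_ts, snapshot_ts)
        if idx == 0:
            return None  # the snapshot predates the insert
        value = self._values[idx - 1]
        return None if value is _TOMBSTONE else value


# The scenario from the question, with made-up integer timestamps.
T1, T2, T3, T4, T5 = 10, 20, 30, 40, 50
row = RowHistory()
row.write(T1, "V1")
row.write(T2, "V2")
row.write(T3, "V3")
row.write(T4, "V4")
row.delete(T5)

for t in (5, 25, 35, 45, 55):
    print(t, row.read_at(t))
```

A single snapshot timestamp is enough to pick out V2 or V3 here: the read returns the latest version committed at or before that time, so no lower bound is needed.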
> 
> 
>  
> 
> In other words, when we say Kudu has an MVCC style for data, does it apply to all
> versions of the data mutation or just to the reconciliation stage? I am assuming it applies only
> to the last stage of reconciliation (i.e. until reads are fully committed). Since timestamps
> in Kudu seem to be lower bound markers, the above might not be possible, but I wanted
> to check with the community.
> 
> It stores all history for a configurable amount of time (--tablet-history-max-age-sec,
> default 15 minutes). You can bump this to a longer amount of time.
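
As a rough illustration of the retention limit, a client could check whether a requested snapshot time still falls inside the history window before issuing a time travel scan. The helper below is hypothetical and not part of any Kudu API. It assumes the 15 minute default quoted above and uses the local clock, so it is subject to the same clock skew caveat as the script earlier in this thread.

```python
from datetime import datetime, timedelta

# Default retention quoted in this thread; adjust if the server flag was changed.
DEFAULT_HISTORY_MAX_AGE = timedelta(minutes=15)


def snapshot_within_history(snapshot_time, now, max_age=DEFAULT_HISTORY_MAX_AGE):
    """Return True if snapshot_time is no older than max_age relative to now.

    Only the lower bound is checked; snapshot times in the future are not
    considered by this sketch.
    """
    return snapshot_time >= now - max_age


now = datetime(2017, 6, 18, 21, 35, 20)
recent = now - timedelta(minutes=5)   # inside the 15 minute window
stale = now - timedelta(minutes=20)   # older than the window

print(snapshot_within_history(recent, now))
print(snapshot_within_history(stale, now))
```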
>  
> 
> If it is otherwise, does the model hold good after a compaction is performed?
> 
> 
> Yes, as of version 1.2 (I think) the full history is properly retained regardless of
> any compactions, etc., subject to the above-mentioned history limit.
> 
> -Todd
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera
> 
> 
> 
> -- 
> Todd Lipcon
> Software Engineer, Cloudera

