hbase-dev mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Delete client API.
Date Tue, 17 Jan 2012 18:07:39 GMT
Yeah, it's confusing if one expects it to work like in a relational database.
You can even do worse: if you accidentally place a delete in the future, all current inserts
will be hidden until the next major compaction. :)
I got confused about this myself just recently (see my mail on the dev-list).


In the end this is a pretty powerful feature and core to how HBase works (not saying that
it is not confusing, though).


If one keeps the following two points in mind it makes more sense:
1. Delete just sets a tombstone marker at a specific TS (marking everything older as deleted).
2. Everything is versioned; if no version is specified, the current time (at the regionserver)
is used.

In your Example1 below, the delete gets the current time t3 as its TS, and t3 > 6, hence the
insert is hidden.
In Example2 both the delete and the insert have TS 6, hence the insert is hidden.
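
To make the two points concrete, here is a minimal toy model of a single cell. It is only a sketch and not HBase code: the class `TombstoneSketch`, its methods, and the clock value `SERVER_NOW` are all made up for illustration. It assumes a delete marker hides every version at or below its TS, and that a major compaction drops the hidden versions together with the markers.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of ONE cell (same row and column). This is not the HBase client API.
class TombstoneSketch {
    private static final class Version {
        final long ts;
        final String value;
        Version(long ts, String value) { this.ts = ts; this.value = value; }
    }

    private final List<Version> puts = new ArrayList<>();
    private final List<Long> tombstones = new ArrayList<>();

    void put(long ts, String value) { puts.add(new Version(ts, value)); }

    // Point 1: a delete only records a marker at the given timestamp.
    void delete(long ts) { tombstones.add(ts); }

    // A version is hidden if any marker sits at or above its timestamp.
    private boolean hidden(long ts) {
        for (long marker : tombstones) {
            if (ts <= marker) return true;
        }
        return false;
    }

    // Newest visible value, or null when every version is hidden.
    String read() {
        Version newest = null;
        for (Version v : puts) {
            if (!hidden(v.ts) && (newest == null || v.ts >= newest.ts)) newest = v;
        }
        return newest == null ? null : newest.value;
    }

    // Major compaction in this model: drop hidden versions, then drop the markers.
    void majorCompact() {
        puts.removeIf(v -> hidden(v.ts));
        tombstones.clear();
    }

    // Hypothetical regionserver clock in epoch millis; any value far above 6 works.
    static final long SERVER_NOW = 1_326_816_000_000L;

    // Replays Example1 (deleteTs = SERVER_NOW) or Example2 (deleteTs = 6).
    static String replay(long deleteTs, boolean compactBeforeLastPut) {
        TombstoneSketch cell = new TombstoneSketch();
        cell.put(3, "Val3");
        cell.put(5, "Val5");
        cell.delete(deleteTs);
        if (compactBeforeLastPut) cell.majorCompact();
        cell.put(6, "Val6");
        return cell.read();
    }

    public static void main(String[] args) {
        System.out.println("Example1, no compaction:   " + replay(SERVER_NOW, false));
        System.out.println("Example1, with compaction: " + replay(SERVER_NOW, true));
        System.out.println("Example2, no compaction:   " + replay(6, false));
        System.out.println("Example2, with compaction: " + replay(6, true));
    }
}
```

In this model both examples read back nothing (null) when no compaction runs before the last insert, and Val6 when a major compaction has already removed the marker. That is the compaction-dependent behavior described in Example1; the model makes no attempt to reproduce the Val5 result reported for Example2.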

Look at these two examples:

1. insert Val1 at real time t1
2. <del> at real time t2 > t1
3. insert Val2 at real time t3 > t2

1. insert Val1 with TS=1 at real time t1
2. <del> with TS=2 at real time t2 > t1
3. insert Val2 with TS=3 at real time t3 > t2


In both cases Val2 is visible.

If your code sets its own timestamps, you'd better know what you're doing :)
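
A rough sketch of why consistency matters, again as a toy model and not HBase code. The class `VersionedCellSketch`, its fake clock starting at 1000, and the scenario names are assumptions made up for illustration. Operations without a TS take the fake server time; operations with a TS use the client's value.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of one cell with a fake server clock. This is not the HBase client API.
class VersionedCellSketch {
    private long clock = 1_000;                  // hypothetical regionserver time
    private final TreeMap<Long, String> versions = new TreeMap<>();
    private long newestMarker = Long.MIN_VALUE;  // highest tombstone TS seen so far

    private long now() { return clock++; }       // every operation advances real time

    void put(String value)          { versions.put(now(), value); }
    void put(long ts, String value) { now(); versions.put(ts, value); }
    void delete()                   { newestMarker = Math.max(newestMarker, now()); }
    void delete(long ts)            { now(); newestMarker = Math.max(newestMarker, ts); }

    // The newest version is visible only if it is strictly newer than every marker.
    String read() {
        Map.Entry<Long, String> newest = versions.lastEntry();
        return (newest != null && newest.getKey() > newestMarker) ? newest.getValue() : null;
    }

    // First example: no TS anywhere, so the server clock orders everything.
    static String serverTimestamps() {
        VersionedCellSketch cell = new VersionedCellSketch();
        cell.put("Val1");
        cell.delete();
        cell.put("Val2");
        return cell.read();
    }

    // Second example: the client supplies consistent timestamps 1, 2, 3.
    static String clientTimestamps() {
        VersionedCellSketch cell = new VersionedCellSketch();
        cell.put(1, "Val1");
        cell.delete(2);
        cell.put(3, "Val2");
        return cell.read();
    }

    // Mixing the two: the delete takes the server time, which is far above TS=3.
    static String mixedTimestamps() {
        VersionedCellSketch cell = new VersionedCellSketch();
        cell.put(1, "Val1");
        cell.delete();
        cell.put(3, "Val2");
        return cell.read();
    }

    public static void main(String[] args) {
        System.out.println("server timestamps: " + serverTimestamps());
        System.out.println("client timestamps: " + clientTimestamps());
        System.out.println("mixed timestamps:  " + mixedTimestamps());
    }
}
```

The first two scenarios return Val2, matching the two examples. The mixed scenario returns nothing in this model, because the server-assigned delete TS is larger than the client-assigned TS of the later insert.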

Note that my examples below are confusing even if you know how deletion in HBase works.
You have to look at Delete.java to figure out what is happening.
OK, since there were no objections in two days, I will commit my proposed change in HBASE-5205.


-- Lars

________________________________
From: M. C. Srivas <mcsrivas@gmail.com>
To: dev@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com> 
Sent: Tuesday, January 17, 2012 8:13 AM
Subject: Re: Delete client API.


Delete seems to be confusing in general. Here are some examples that make me scratch my head
(key is same in all the examples):

Example1:
----------------
1. insert Val3  with TS=3  at real time t1
2. insert Val5  with TS=5  at real time t2 > t1
3. <del>    at real time t3 > t2
4. insert  Val6  with TS=6  at real time  t4 > t3

What does a read return?  (I would expect  Val6, since it was done last). But depending
upon whether compaction happened or not between steps 3 and 4, I get either Val6 or  nothing.

Example 2:
-----------------
1. insert Val3  with TS=3  at real time t1
2. insert Val5  with TS=5  at real time t2 > t1
3. <del>  TS=6  at real time t3 > t2
4. insert  Val6  with TS=6  at real time  t4 > t3

Note the difference in step 3 is this time a TS was specified by the client.

What does a read return?  Again, I expect Val6 to be returned. But depending upon what's
going on, I seem to get either Val5 or Val6.





On Sun, Jan 15, 2012 at 7:21 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:

>There are some confusing parts about the Delete client API:
>1. calling deleteFamily removes all prior column or columns markers without checking the
>TS.
>2. delete{Column|Columns|Family} do not use the timestamp passed to Delete at construction
>time, but instead default to LATEST_TIMESTAMP.
>
>  Delete d = new Delete(R,T);
>  d.deleteFamily(CF);
>
>Does not do what you expect (won't use T for the family delete, but rather the current
>time).
>
>Neither does
>  d.deleteColumns(CF, C1, T2);
>  d.deleteFamily(CF, T1); // T1 < T2
>
>
>(the columns marker will be removed)
>
>
>#1 prevents Delete from adding a family marker F for time T1 and a column/columns marker
>for columns of F at T2 even if T2 > T1.
>#2 is just unexpected and different from what Put is doing.
>
>In HBASE-5205 I propose a simple patch to fix this.
>
>Since this is a (slight) API change, please provide feedback.
>
>Thanks.
>
>-- Lars
>
> 
