hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8721) Deletes can mask puts that happen after the delete
Date Mon, 17 Jun 2013 15:54:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13685663#comment-13685663 ]

Andrew Purtell commented on HBASE-8721:

bq. A). We all agree this IS a bug:

I don't think that is a correct statement.


bq. It is by nature not an acceptable behaviour. It's counter-common-sense and counter-intuition.
It now seems an 'expected behaviour' JUST because it exists from the very beginning.

Your point of view is well argued, [~fenghh], but this assertion is your opinion. Strong opinions
are fine by me, but they sometimes don't help with building consensus for change. For example,
you rule out the following option, which would address your concerns (at least in part):
C). Lars Hofhansl suggests to introduce a config for a Table/CF to disallow client to set
timestamps when put. As a config, it means client still can create tables/CFs that allow him
to explicitly set timestamps, and for these tables/CFs, bug of A) still exists.


bq. Disabling client set timestamps or limiting timestamp only with 'time' semantic will prohibit
such innovative usage of timestamp.

This may be the crux of the distance between your position and some of the feedback here.
As [~sershe] summarized quite well:
If you are setting explicit timestamps, you are explicitly telling HBase that it should withhold
judgement about versions because you know what happens logically before and after in your
system. If you are using timestamp otherwise for some convenience, you are misusing it. 

If this version semantic is removed, timestamp becomes simply a long tucked onto a KeyValue
and should be removed; after all, we don't have a string or a boolean also added to KeyValue
so that people could use them for their purposes. HBase already has columns and column families
to do that. Timestamp has very explicit semantics and purpose right now. If you want time-based
behavior then don't set timestamps and HBase will use time-based behavior.

Can you say a bit more about what exactly your clients are doing with timestamps?

> Deletes can mask puts that happen after the delete
> --------------------------------------------------
>                 Key: HBASE-8721
>                 URL: https://issues.apache.org/jira/browse/HBASE-8721
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Feng Honghua
>         Attachments: HBASE-8721-0.94-V0.patch
> This fix addresses the bug described in http://hbase.apache.org/book.html:
> "Deletes mask puts, even puts that happened after the delete was entered. Remember that
a delete writes a tombstone, which only disappears after the next major compaction has run.
Suppose you do a delete of everything <= T. After this you do a new put with a timestamp
<= T. This put, even if it happened after the delete, will be masked by the delete tombstone.
Performing the put will not fail, but when you do a get you will notice the put had no
effect. It will start working again after the major compaction has run. These issues should
not be a problem if you use always-increasing versions for new puts to a row. But they can
occur even if you do not care about time: just do delete and put immediately after each other,
and there is some chance they happen within the same millisecond."
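
The masking behaviour quoted above can be illustrated with a small toy model. This is only a sketch and not the HBase client API: the class `ToyCell` and its methods `put`, `delete_up_to`, `get` and `major_compact` are hypothetical names invented for illustration. It also assumes that a major compaction discards the tombstone together with every version the tombstone masks, so the masked put does not reappear afterwards; only new puts become visible again.

```python
class ToyCell:
    """Toy model of a single cell with versions. NOT the HBase API."""

    def __init__(self):
        self.versions = {}        # timestamp -> value
        self.tombstone_ts = None  # delete marker covering all versions with ts <= this

    def put(self, ts, value):
        # A put never fails, even if a tombstone already covers its timestamp.
        self.versions[ts] = value

    def delete_up_to(self, ts):
        # Write a tombstone that masks every version with timestamp <= ts.
        if self.tombstone_ts is None or ts > self.tombstone_ts:
            self.tombstone_ts = ts

    def get(self):
        # Return the newest version not masked by the tombstone, or None.
        visible = [ts for ts in self.versions
                   if self.tombstone_ts is None or ts > self.tombstone_ts]
        return self.versions[max(visible)] if visible else None

    def major_compact(self):
        # Assumption: compaction drops masked versions and the tombstone itself.
        if self.tombstone_ts is not None:
            self.versions = {ts: v for ts, v in self.versions.items()
                             if ts > self.tombstone_ts}
            self.tombstone_ts = None


cell = ToyCell()
cell.put(100, "old")
cell.delete_up_to(200)           # delete everything <= T, with T = 200
cell.put(150, "new")             # issued after the delete, but timestamp <= T
masked = cell.get()              # the later put is masked by the tombstone

cell.major_compact()             # tombstone and masked versions are dropped
cell.put(150, "again")           # the same timestamp now works
after_compaction = cell.get()

cell.delete_up_to(200)
cell.put(201, "newer")           # always-increasing version escapes the tombstone
increasing = cell.get()
```

In this model the order in which operations arrive plays no role; only the timestamps decide visibility, which is the semantic the issue is debating.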

