hbase-issues mailing list archives

From "Andrew Kyle Purtell (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-24440) Prevent temporal misordering on timescales smaller than one clock tick
Date Mon, 01 Jun 2020 18:13:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-24440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17121223#comment-17121223 ]

Andrew Kyle Purtell commented on HBASE-24440:

Correct, [~anoop.hbase]: two versions with two distinct timestamps, instead of duplicate
row keys with only something like an internal-only seqno to differentiate them (which is not

We can also consider removing the implicit sort-delete-before-put rule that can cause temporal
anomalies under some conditions, but that is out of scope for this proposal.

> Prevent temporal misordering on timescales smaller than one clock tick
> ----------------------------------------------------------------------
>                 Key: HBASE-24440
>                 URL: https://issues.apache.org/jira/browse/HBASE-24440
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Andrew Kyle Purtell
>            Priority: Major
> When mutations are sent to the servers without a timestamp explicitly assigned by the
client, the server substitutes the current wall clock time. There are edge cases where
it is at least theoretically possible for more than one mutation to be committed to a given
row within the same clock tick. When this happens we have to track and preserve the ordering
of these mutations in some way other than the timestamp component of the key. Let me bypass
most discussion here by noting that, whether we do this or not, we do not pass such ordering
information in the cross-cluster replication protocol. We also have interesting edge cases
regarding key type precedence when mutations arrive "simultaneously": we sort deletes ahead
of puts. This, especially in the presence of replication, can lead to visible anomalies for
clients able to interact with both source and sink.
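
As an illustration of the delete-before-put precedence described above, here is a minimal
sketch in Java. It is a simplified model, not HBase code: the class SameTickOrdering, its
Cell and Type types, and the read() helper are all hypothetical names invented for this
example. It only shows how a put that arrives after a delete within the same clock tick can
end up masked once deletes sort first at equal timestamps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified model of same-tick ordering; not the HBase implementation.
public class SameTickOrdering {
    // Declared in this order so that DELETE compares lower than PUT.
    public enum Type { DELETE, PUT }

    // Stand-in for a cell key: row, timestamp, and key type, plus a value.
    public static final class Cell {
        final String row;
        final long timestamp;
        final Type type;
        final String value;

        public Cell(String row, long timestamp, Type type, String value) {
            this.row = row;
            this.timestamp = timestamp;
            this.type = type;
            this.value = value;
        }
    }

    // Row ascending, newest timestamp first, then deletes ahead of puts.
    public static final Comparator<Cell> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        c = Long.compare(b.timestamp, a.timestamp);
        if (c != 0) return c;
        return a.type.compareTo(b.type);
    };

    // Returns the visible value for a row, or null if the first cell in sort
    // order is a delete (assumed here to mask everything behind it).
    public static String read(List<Cell> cells, String row) {
        List<Cell> sorted = new ArrayList<>(cells);
        sorted.sort(ORDER);
        for (Cell cell : sorted) {
            if (cell.row.equals(row)) {
                return cell.type == Type.DELETE ? null : cell.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        long tick = 1_000L;
        // Arrival order: the delete first, then the put, both in the same tick.
        List<Cell> cells = Arrays.asList(
            new Cell("r1", tick, Type.DELETE, null),
            new Cell("r1", tick, Type.PUT, "v1"));
        // The later put is not visible because the delete sorts ahead of it.
        System.out.println(read(cells, "r1")); // prints "null"
    }
}
```

In this model the outcome is the same whichever of the two mutations arrived first, which
is the kind of anomaly the description refers to.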
> There is a simple solution that removes the possibility that these edge cases can occur:

> When we are about to commit a mutation to a row, we can detect whether we have already
committed a mutation to this same row in the current clock tick. Occurrences of this condition
will be rare. We are already tracking current time, because we have to know it in order to
assign the timestamp. Where this becomes interesting is how we might track the last commit
time per row. Making the detection of this case efficient for the normal code path is the
bulk of the challenge. One option is to keep track of the last locked time for row locks.
(Todo: How would we track and garbage collect this efficiently and correctly? Not the ideal
option.) We might also do this tracking somehow via the memstore. (At least in this case the
lifetime and distribution of in-memory row state, including the proposed timestamps, would
align.) Assuming we can efficiently know if we are about to commit twice to the same row
within a single clock tick, we would simply sleep/yield the current thread until the clock
ticks over, and then proceed.
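
The detect-and-wait idea above could be sketched roughly as follows. This is only an
illustration under stated assumptions, not a proposed patch: RowTickGuard and
nextTimestamp() are hypothetical names, the per-row state lives in a plain
ConcurrentHashMap rather than in row locks or the memstore, and entries are never evicted,
so the garbage collection question raised in the Todo remains open.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch: never hand out the same timestamp twice for one row.
public class RowTickGuard {
    // Last timestamp issued per row. Assumption: a simple map with no eviction.
    private final ConcurrentHashMap<String, Long> lastCommit = new ConcurrentHashMap<>();
    private final LongSupplier clock;

    public RowTickGuard(LongSupplier clock) {
        this.clock = clock;
    }

    // Returns a timestamp strictly greater than the last one issued for this row,
    // yielding the current thread until the clock ticks over if necessary.
    public long nextTimestamp(String row) {
        // compute() is atomic per key, so two writers to the same row cannot both
        // observe the same "last" value. Waiting inside it is acceptable only for
        // a sketch, since it can also delay updates to other keys.
        return lastCommit.compute(row, (r, last) -> {
            long now = clock.getAsLong();
            // "<=" also covers a clock that has stepped backwards.
            while (last != null && now <= last) {
                Thread.yield();
                now = clock.getAsLong();
            }
            return now;
        });
    }

    public static void main(String[] args) {
        RowTickGuard guard = new RowTickGuard(System::currentTimeMillis);
        long first = guard.nextTimestamp("row1");
        // Waits for the next millisecond if both calls land in the same tick.
        long second = guard.nextTimestamp("row1");
        System.out.println(second > first); // prints "true"
    }
}
```

Commits to different rows are not delayed by the tick check itself; only a second commit to
the same row within one tick pays the wait, which matches the expectation that the condition
is rare.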

