hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-14070) Hybrid Logical Clocks for HBase
Date Fri, 09 Sep 2016 23:57:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478593#comment-15478593
] 

stack edited comment on HBASE-14070 at 9/9/16 11:56 PM:
--------------------------------------------------------

The patch HBASE-14070.master.001.patch contains the current, unfinished changes for the
HLC project. A document describing the current state of the work and the work yet to
be done is attached to HBASE-14070. It is listed in the links section of this JIRA
under the name Current Status of HLC.

h2. Work Done
* Timestamp enum class with all its methods, with tests
* Clock class supporting three clock types, with tests
* HLC clock updates during recovery and replication
* Clock per Region Server notion
* TTL and Time to Purge work well across all clock types
* Most of the time-related tests are parameterized to run against all three clock types.
* Test cases that set timestamps in Put were changed, with the help of the manual
environment edge, so that the timestamp is no longer set explicitly
* The tests were run with HLC as the default clock type for tables, and most of the bugs
have been fixed. Some work remains to get all the tests passing.

h2. Work Yet to Be Done
* Time range should be dealt with as per section 3.3
* A check, for HLC and System Monotonic tables, that disallows clients from setting
timestamps. The check should live on the server side, not the client side (keep the client dumb).
* Some test cases are still flaky and some are failing. These need to be fixed.
* The HLC clock should be updated for more events, such as region open and close, as per the
various use cases. Deciding which events we should track is important.
* Bulk loads need more thought. A mechanism is needed to ensure that bulk-loaded files have
the correct timestamp types. If we can get the highest timestamp of all the cells, we can update
the local clock with it. This is something to think about.
* The test cases written so far are very local. Integration tests are needed that test
the HLC clock properties with respect to recovery, replication and more.
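
The bulk-load idea above (take the highest timestamp of all the cells and update the local clock with it) could look roughly like the sketch below. This is only an illustration: `BulkLoadClockSketch` and `advancePast` are hypothetical names, not classes or methods from the patch, and the clock is modeled as a plain `long` rather than a real HLC timestamp.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.LongStream;

/** Hypothetical sketch: advance a local clock past the timestamps in a bulk-loaded file. */
final class BulkLoadClockSketch {
    // Assumption: the local clock is a single long that must never move backward.
    private final AtomicLong localClock;

    BulkLoadClockSketch(long initial) {
        this.localClock = new AtomicLong(initial);
    }

    long current() {
        return localClock.get();
    }

    /**
     * Finds the highest cell timestamp and moves the local clock up to it
     * if the clock is behind. Returns the clock value after the update.
     */
    long advancePast(long[] cellTimestamps) {
        // An empty file yields Long.MIN_VALUE, which leaves the clock unchanged.
        long highest = LongStream.of(cellTimestamps).max().orElse(Long.MIN_VALUE);
        return localClock.accumulateAndGet(highest, Math::max);
    }
}
```

With a clock starting at 500, loading cells stamped 100, 700 and 300 would move it to 700, and a later load containing only older cells would leave it there.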

Currently, some of the tests have been rewritten using the manual environment edge to get
around the problem of not setting timestamps for HLC tables. Stack and Enis suggested
not using the environment edge in the clocks and instead having a pluggable manual clock
in place of Environment Edge.
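
One way to read the "pluggable manual clock" suggestion is to pass the time source into the clock through its constructor, so tests can substitute a manually driven source. The sketch below assumes that reading. All names in it (`PhysicalTimeSource`, `SystemTimeSource`, `ManualTimeSource`, `MonotonicClockSketch`) are hypothetical and are not taken from the patch or from HBase.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical pluggable time source; not an existing HBase interface. */
interface PhysicalTimeSource {
    long currentTimeMillis();
}

/** Production source backed by the system wall clock. */
final class SystemTimeSource implements PhysicalTimeSource {
    @Override
    public long currentTimeMillis() {
        return System.currentTimeMillis();
    }
}

/** Test source whose time only changes when the test says so. */
final class ManualTimeSource implements PhysicalTimeSource {
    private final AtomicLong millis;

    ManualTimeSource(long start) {
        this.millis = new AtomicLong(start);
    }

    void set(long value) {
        millis.set(value);
    }

    void advance(long delta) {
        millis.addAndGet(delta);
    }

    @Override
    public long currentTimeMillis() {
        return millis.get();
    }
}

/** Monotonic clock that receives its time source as a constructor argument. */
final class MonotonicClockSketch {
    private final PhysicalTimeSource source;
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

    MonotonicClockSketch(PhysicalTimeSource source) {
        this.source = source;
    }

    /** Returns the source time, or the last value returned if the source went backward. */
    long now() {
        return last.accumulateAndGet(source.currentTimeMillis(), Math::max);
    }
}
```

A test would construct `new MonotonicClockSketch(new ManualTimeSource(1000))` and move time explicitly, with no global state to reset afterward.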


was (Author: saitejar):
(Same text as the edited comment above, except that the section labels were plain
"Work Done:" and "Work Yet to done:" rather than h2. headings.)

> Hybrid Logical Clocks for HBase
> -------------------------------
>
>                 Key: HBASE-14070
>                 URL: https://issues.apache.org/jira/browse/HBASE-14070
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Enis Soztutar
>            Assignee: Sai Teja Ranuva
>         Attachments: HBASE-14070.master.001.patch,
>                      HybridLogicalClocksforHBaseandPhoenix.docx,
>                      HybridLogicalClocksforHBaseandPhoenix.pdf
>
>
> HBase and Phoenix use the system's physical clock (PT) to give timestamps to events (reads
> and writes). This mostly works when the system clock is strictly monotonically increasing
> and there is no cross-dependency between servers' clocks. However, we know that leap seconds,
> general clock skew and clock drift are in fact real.
> This JIRA proposes using Hybrid Logical Clocks (HLC), which combine a physical clock
> with a logical clock. HLC offers the best of both worlds: it preserves causality relationships
> as logical clocks do, yet remains compatible with the NTP-based physical system clock. An HLC
> timestamp can be represented in 64 bits.
> A design document is attached and can also be found here:
> https://docs.google.com/document/d/1LL2GAodiYi0waBz5ODGL4LDT4e_bXy8P9h6kWC05Bhw/edit#
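
To make the description above concrete, here is a minimal sketch of a hybrid logical clock packed into a single 64-bit value. It follows the general HLC update rules, not the code in the attached patch. The class name, the method names and the bit layout (48 bits of physical milliseconds, 16 bits of logical counter) are assumptions chosen for illustration, and logical-counter overflow is not handled.

```java
import java.util.function.LongSupplier;

/** Sketch of a hybrid logical clock in one 64-bit long. Layout and names are assumptions. */
final class HybridLogicalClockSketch {
    // Assumed layout: upper 48 bits = physical time in ms, lower 16 bits = logical counter.
    static final int LOGICAL_BITS = 16;
    static final long LOGICAL_MASK = (1L << LOGICAL_BITS) - 1;

    private final LongSupplier physicalMillis; // source of physical time (PT)
    private long physical; // highest physical component seen so far
    private long logical;  // counter that orders events sharing that physical component

    HybridLogicalClockSketch(LongSupplier physicalMillis) {
        this.physicalMillis = physicalMillis;
    }

    static long pack(long physical, long logical) {
        return (physical << LOGICAL_BITS) | (logical & LOGICAL_MASK);
    }

    static long physicalOf(long timestamp) {
        return timestamp >>> LOGICAL_BITS;
    }

    static long logicalOf(long timestamp) {
        return timestamp & LOGICAL_MASK;
    }

    /** Timestamp for a local event: stays increasing even if PT stalls or goes backward. */
    synchronized long now() {
        long pt = physicalMillis.getAsLong();
        if (pt > physical) {
            physical = pt;
            logical = 0;
        } else {
            logical++; // PT did not advance, so order events with the counter
        }
        return pack(physical, logical);
    }

    /** Merges a timestamp received from another node so that causality is preserved. */
    synchronized long update(long received) {
        long pt = physicalMillis.getAsLong();
        long rp = physicalOf(received);
        long rl = logicalOf(received);
        long maxPhysical = Math.max(pt, Math.max(physical, rp));
        if (maxPhysical == physical && maxPhysical == rp) {
            logical = Math.max(logical, rl) + 1;
        } else if (maxPhysical == physical) {
            logical++;
        } else if (maxPhysical == rp) {
            logical = rl + 1;
        } else {
            logical = 0; // local PT is strictly ahead of both
        }
        physical = maxPhysical;
        return pack(physical, logical);
    }
}
```

Because the physical component sits in the high bits, comparing two packed values as plain longs orders them first by physical time and then by logical counter, which is what lets such a timestamp stand in where a millisecond timestamp was used before.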



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
