hive-dev mailing list archives

From Gopal Vijayaraghavan
Subject Re: Branch for "Per Table Write ID" implementation
Date Thu, 11 Jan 2018 06:06:59 GMT


On 1/9/18, 10:55 PM, "Sankar Hariappan" wrote:

    Hi all,
    The “Hive Replication” feature is advancing to support ACID tables (HIVE-18320).
    “Per Table Write ID” is an important requirement to support replication for ACID tables
especially for the use case of “Analytics workload off-loading for scalability”. Details
are available in the design document attached in the JIRA.
    The per-table Write ID implementation involves several changes:
      1.  Add metadata tables to allocate and manage write ID. Also, map it against global
      2.  Handle snapshot isolation for ACID/MM table reads by using ValidWriteIDList instead
of ValidTxnList.
      3.  Modify ORC/Hive row readers to use ValidWriteIDList instead of ValidTxnList to read
valid delta/base directories.
      4.  Update ValidCompactorTxnList to use per-table Write IDs.
      5.  Upgrade from existing Hive versions by migrating the ACID/MM tables to use Write
ID instead of global transaction ID.
      6.  Correct the unit test scripts to use ValidWriteIDList instead of ValidTxnList for
snapshot isolation tests.
      7.  Rename the method/variable names of several classes to use WriteId instead of TxnId.
    As part of HIVE-18192, I have
implemented the first 3 changes in the list, which make ACID reads/writes work with the Write ID
change. However, this feature will be incomplete without the rest of the changes.
    Hence, I would like to create a branch (branch-per-table-writeid) from master to commit
this feature as multiple patches. This branch is expected to be short-lived (2 to 3 weeks).
    I request feedback from the community.
    Best regards
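
For readers following the list of changes above, here is a minimal sketch of the idea behind items 1 and 2: a write ID is allocated per table and mapped to the global transaction ID, and a reader's snapshot is then expressed in terms of that table's write IDs. All class, method, and field names below are hypothetical and chosen for illustration only; this is not the Hive metastore schema or the actual ValidWriteIDList API, and the real design is described in the document attached to the JIRA.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: these names are assumptions, not Hive classes.
class PerTableWriteIdSketch {

    // table name -> last write ID handed out for that table
    private final Map<String, Long> lastWriteId = new HashMap<>();
    // table name -> (global transaction ID -> write ID allocated on that table)
    private final Map<String, Map<Long, Long>> txnToWriteId = new HashMap<>();

    /** Returns the write ID of txnId on the table, allocating the next one if needed. */
    long allocateWriteId(String table, long txnId) {
        Map<Long, Long> perTable =
            txnToWriteId.computeIfAbsent(table, t -> new HashMap<>());
        Long existing = perTable.get(txnId);
        if (existing != null) {
            return existing; // a transaction keeps one write ID per table
        }
        long next = lastWriteId.merge(table, 1L, Long::sum); // 1, 2, 3, ... per table
        perTable.put(txnId, next);
        return next;
    }

    /** Hypothetical per-table snapshot: a high-water mark plus write IDs to ignore. */
    static final class WriteIdSnapshot {
        final long highWaterMark;
        final Set<Long> invalidWriteIds; // write IDs of open or aborted transactions

        WriteIdSnapshot(long highWaterMark, Set<Long> invalidWriteIds) {
            this.highWaterMark = highWaterMark;
            this.invalidWriteIds = invalidWriteIds;
        }

        /** A write ID is readable if it is within the snapshot and not open/aborted. */
        boolean isWriteIdValid(long writeId) {
            return writeId <= highWaterMark && !invalidWriteIds.contains(writeId);
        }
    }

    /** Translates a set of open/aborted global txn IDs into a per-table snapshot. */
    WriteIdSnapshot snapshot(String table, Set<Long> openOrAbortedTxns) {
        long hwm = lastWriteId.getOrDefault(table, 0L);
        Set<Long> invalid = new HashSet<>();
        Map<Long, Long> perTable =
            txnToWriteId.getOrDefault(table, Collections.<Long, Long>emptyMap());
        for (Map.Entry<Long, Long> e : perTable.entrySet()) {
            if (openOrAbortedTxns.contains(e.getKey())) {
                invalid.add(e.getValue());
            }
        }
        return new WriteIdSnapshot(hwm, invalid);
    }

    public static void main(String[] args) {
        PerTableWriteIdSketch s = new PerTableWriteIdSketch();
        long w1 = s.allocateWriteId("db.t1", 100L); // first writer on db.t1
        long w2 = s.allocateWriteId("db.t1", 105L); // second writer on db.t1
        s.allocateWriteId("db.t2", 105L);           // db.t2 has its own sequence

        // Transaction 105 is still open, so its write on db.t1 must not be visible.
        WriteIdSnapshot snap = s.snapshot("db.t1", Collections.singleton(105L));
        System.out.println(snap.isWriteIdValid(w1)); // prints "true"
        System.out.println(snap.isWriteIdValid(w2)); // prints "false"
    }
}
```

In this sketch the write ID sequence of each table is independent of the global transaction ID sequence, so the write IDs on a table depend only on the writes made to that table.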
