phoenix-dev mailing list archives

From "Ohad Shacham (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4484) Write directly to HBase when creating an index for transactional table
Date Mon, 30 Apr 2018 07:38:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16458377#comment-16458377 ]

Ohad Shacham commented on PHOENIX-4484:
---------------------------------------

[~jamestaylor], I think I was wrong in this case: disabling the GC is not required.
A general transaction might miss data if the low watermark exceeds the transaction timestamp
during its run. This is caused by the GC, which removes all versions of a key below the
low watermark except for the last one. During index population, the transaction has the
fence id and writes the data using auto commit (version and commit timestamp are the same),
so it does not need to commit.

It is true that this transaction might miss data if the low watermark exceeds the fence id.
However, if it misses data of a key K, there must exist another record of K with
a version higher than the fence and lower than the low watermark. Because every entry written
after the fence is automatically added to the index (by the incremental mechanism),
that entry of K will be added to the index as well. We do miss the older data, but
every transaction that might be interested in it started below the low watermark and
will be aborted on commit, so we don't really care.

To sum up, the fact that we enable, at the fence, the mechanism that updates the index with
every mutation to the data table removes the need to disable the GC.
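
The argument above can be illustrated with a small toy model. This is only a sketch under
stated assumptions: it is not Phoenix, Omid, or Tephra code, and the names gc, snapshot_read
and build_index are hypothetical. It assumes that the GC keeps only the newest version below
the low watermark, that incremental index maintenance covers every write with a version above
the fence, and that population reads at the fence and writes index rows with the fence id as
their version.

```python
def gc(versions, low_watermark):
    """Drop every version below the low watermark except the newest one."""
    below = [ts for ts in versions if ts < low_watermark]
    newest_below = max(below) if below else None
    return {ts: val for ts, val in versions.items()
            if ts >= low_watermark or ts == newest_below}


def snapshot_read(versions, read_ts):
    """Return the newest value with version <= read_ts, or None if none survives."""
    visible = [ts for ts in versions if ts <= read_ts]
    return versions[max(visible)] if visible else None


def build_index(writes, fence, low_watermark):
    """Toy index build: incremental updates above the fence, then population at the fence.

    writes is a list of (key, version, value) tuples applied to the data table.
    Returns (index, missed): index is a set of (key, value, version) entries,
    missed is the sorted list of keys whose pre-fence data the GC removed.
    """
    table, index = {}, set()
    for key, version, value in writes:
        table.setdefault(key, {})[version] = value
        if version > fence:
            # Incremental mechanism: enabled at the fence, mirrors every later mutation.
            index.add((key, value, version))

    # The GC runs before population, with the low watermark already above the fence.
    table = {key: gc(versions, low_watermark) for key, versions in table.items()}

    missed = []
    for key, versions in table.items():
        value = snapshot_read(versions, fence)
        if value is None:
            missed.append(key)  # the pre-fence version of this key was collected
        else:
            # Auto commit: version and commit timestamp are both the fence id.
            index.add((key, value, fence))
    return index, sorted(missed)
```

For example, with fence 10 and low watermark 15, the writes ("K", 5, "a"), ("K", 12, "b") and
("J", 7, "x") leave population unable to see K at the fence, because the GC removed version 5.
K still reaches the index through its version 12 entry, which the incremental mechanism added.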

 

> Write directly to HBase when creating an index for transactional table
> ----------------------------------------------------------------------
>
>                 Key: PHOENIX-4484
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4484
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: Ohad Shacham
>            Assignee: Ohad Shacham
>            Priority: Major
>
> Today, when creating an index table for a non-empty data table, the writes are performed
using the transaction API, which both consumes client-side memory for storing the write set
and performs conflict analysis upon commit. This is redundant and can be replaced by direct
writes to HBase. For this reason, a new function should be added to the transaction abstraction
layer that writes directly to HBase in the Tephra case and adds shadow cells with the
fence id in the Omid case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
