hbase-dev mailing list archives

From "Peter Somogyi (Jira)" <j...@apache.org>
Subject [jira] [Reopened] (HBASE-24302) Add an "ignoreTimestamps" option (defaulted to false) to HashTable/SyncTable tool
Date Mon, 04 May 2020 11:56:00 GMT

     [ https://issues.apache.org/jira/browse/HBASE-24302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Somogyi reopened HBASE-24302:

Please also commit this to branch-2.3.

> Add an "ignoreTimestamps" option (defaulted to false) to HashTable/SyncTable tool
> ---------------------------------------------------------------------------------
>                 Key: HBASE-24302
>                 URL: https://issues.apache.org/jira/browse/HBASE-24302
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 3.0.0-alpha-1, 2.3.0, 2.2.5
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>             Fix For: 3.0.0-alpha-1, 2.3.0, 2.2.5
> Currently, when hashing and comparing values between a source and a target table, HashTable/SyncTable
> always considers cell timestamp values. However, cell timestamps are not always relevant
> to client applications, so these use cases could benefit from a more flexible comparison logic
> in which timestamps can be ignored.
> In such scenarios, HashTable/SyncTable could also perform better, since cells that differ
> only in their timestamps would not be copied.
> Another case that would benefit from this option is when bulk deletes are wrongly applied
> at the target. At the moment, HashTable/SyncTable on its own is not capable of syncing the
> clusters back, as the source Puts would have an older timestamp (TS) than the delete markers in the target.
> Syncing would instead require the target to complete a major compaction on the whole table
> before HashTable/SyncTable could be run.
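
The comparison described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the HashTable/SyncTable implementation: `SimpleCell`, `cellsMatch`, and the class name are hypothetical stand-ins introduced here, and the real tools work on HBase cells and batch hashes rather than on this simplified type. The only name taken from the issue is the `ignoreTimestamps` option.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class IgnoreTimestampsSketch {

    // Hypothetical, simplified stand-in for an HBase cell (not the real Cell API).
    static final class SimpleCell {
        final String row;
        final String qualifier;
        final long timestamp;
        final byte[] value;

        SimpleCell(String row, String qualifier, long timestamp, byte[] value) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // Returns true when the two cells should be treated as identical.
    // With ignoreTimestamps set, a timestamp difference alone is not a mismatch.
    static boolean cellsMatch(SimpleCell source, SimpleCell target, boolean ignoreTimestamps) {
        boolean sameCoordinates = source.row.equals(target.row)
                && source.qualifier.equals(target.qualifier);
        boolean sameValue = Arrays.equals(source.value, target.value);
        boolean sameTimestamp = ignoreTimestamps || source.timestamp == target.timestamp;
        return sameCoordinates && sameValue && sameTimestamp;
    }

    public static void main(String[] args) {
        byte[] value = "v".getBytes(StandardCharsets.UTF_8);
        SimpleCell source = new SimpleCell("row1", "cf:q", 1000L, value);
        SimpleCell target = new SimpleCell("row1", "cf:q", 2000L, value);

        // Default behaviour: the differing timestamp makes the cells diverge.
        System.out.println(cellsMatch(source, target, false)); // prints "false"
        // With ignoreTimestamps: the cells match, so nothing would need copying.
        System.out.println(cellsMatch(source, target, true));  // prints "true"
    }
}
```

In this sketch the flag defaults to the stricter comparison when passed as `false`, mirroring the "defaulted to false" wording in the issue title.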

This message was sent by Atlassian Jira
