hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16649) Truncate table with splits preserved can cause both data loss and truncated data appeared again
Date Tue, 27 Sep 2016 02:13:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524799#comment-15524799 ]

Hudson commented on HBASE-16649:
--------------------------------

FAILURE: Integrated in Jenkins build HBase-1.2-JDK8 #30 (See [https://builds.apache.org/job/HBase-1.2-JDK8/30/])
HBASE-16649 Truncate table with splits preserved can cause both data (matteo.bertozzi: rev
2733e24d3f2f110ac98d8876ee1de1fd9740b51e)
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestTruncateTableProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteTableProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java


> Truncate table with splits preserved can cause both data loss and truncated data appeared
again
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16649
>                 URL: https://issues.apache.org/jira/browse/HBASE-16649
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.1.3
>            Reporter: Allan Yang
>            Assignee: Matteo Bertozzi
>             Fix For: 2.0.0, 1.3.0, 1.1.7, 0.98.23, 1.2.4
>
>         Attachments: HBASE-16649-v0.patch, HBASE-16649-v1.patch, HBASE-16649-v2.patch
>
>
> Truncating a table with splits preserved deletes the hfiles but reuses the previous regioninfo, which can cause odd behavior:
> - Case 1: *Data appeared after truncate*
> reproduce procedure:
> 1. create a table, let's say 'test'
> 2. write data to 'test', make sure memstore of 'test' is not empty
> 3. truncate 'test' with splits preserved
> 4. kill the regionserver hosting the region(s) of 'test'
> 5. start the regionserver. Now it is time to witness the miracle: the truncated data reappears in table 'test'
> - Case 2: *Data loss*
> reproduce procedure:
> 1. create a table, let's say 'test'
> 2. write some data to 'test', no matter how many
> 3. truncate 'test' with splits preserved
> 4. restart the regionserver to reset the seqid
> 5. write some data, but less than in step 2, since we don't want the seqid to run past the one reached in step 2
> 6. kill the regionserver hosting the region(s) of 'test'
> 7. restart the regionserver. Congratulations! The data written in step 5 is now all lost
> *Why?*
> for case 1
> Since preserving splits in the truncate table procedure does not change the regioninfo, when log replay happens the 'unflushed' data is restored back into the region
> for case 2
> The master stores flushedSequenceIdByRegion in a map keyed by the region's encodedName. Although the table is truncated, the region's name does not change because we chose to preserve the splits. So after the truncate, the region's sequence id is reset on the regionserver but not on the master. When a flush is reported to the master, the master rejects the sequence id update because the new value is smaller than the old one. The same happens in log replay: all the edits written in step 5 are skipped because they have a smaller seqid
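
The two cases described above can be illustrated with a small toy model. This is only a sketch under assumptions: the class TruncateSeqIdSketch and its methods reportFlush and replay are hypothetical names invented for illustration and are not HBase's ServerManager or WAL replay code; only the map name flushedSequenceIdByRegion comes from the description. The model assumes the master keeps one "last flushed" sequence id per encoded region name and that replay skips any edit at or below that id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical stand-in, not actual HBase code.
class TruncateSeqIdSketch {
    // Last flushed sequence id per encoded region name (as the description says the master keeps).
    final Map<String, Long> flushedSequenceIdByRegion = new HashMap<>();

    // Accept a flush report only if it does not move the id backwards.
    boolean reportFlush(String encodedName, long seqId) {
        Long existing = flushedSequenceIdByRegion.get(encodedName);
        if (existing != null && seqId < existing) {
            return false; // looks stale, so the update is rejected
        }
        flushedSequenceIdByRegion.put(encodedName, seqId);
        return true;
    }

    // Replay keeps only the edits whose sequence id is above the recorded flushed id.
    List<Long> replay(String encodedName, List<Long> walSeqIds) {
        long flushed = flushedSequenceIdByRegion.getOrDefault(encodedName, -1L);
        List<Long> replayed = new ArrayList<>();
        for (long id : walSeqIds) {
            if (id > flushed) {
                replayed.add(id);
            }
        }
        return replayed;
    }

    public static void main(String[] args) {
        TruncateSeqIdSketch master = new TruncateSeqIdSketch();

        // Case 1: edits 1..3 were never flushed; truncate kept the region name,
        // so replay after a crash applies them to the "empty" region again.
        String regionA = "regionA"; // hypothetical encoded name
        System.out.println("case1 replayed=" + master.replay(regionA, Arrays.asList(1L, 2L, 3L)));

        // Case 2: a flush at seqid 100 was recorded before the truncate.
        String regionB = "regionB"; // hypothetical encoded name
        master.reportFlush(regionB, 100L);
        // After truncate and restart the regionserver starts counting low again.
        boolean accepted = master.reportFlush(regionB, 3L);
        List<Long> replayed = master.replay(regionB, Arrays.asList(1L, 2L, 3L));
        System.out.println("case2 accepted=" + accepted + " replayed=" + replayed);
    }
}
```

In this model the first line printed shows all three pre-truncate edits being replayed (case 1), and the second shows the post-truncate flush report rejected and every new edit skipped (case 2), because both lookups are keyed by a region name that the truncate did not change.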



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
