hbase-issues mailing list archives

From "Jerry He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11772) Bulk load mvcc and seqId issues with native hfiles
Date Thu, 28 Aug 2014 04:35:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14113327#comment-14113327 ]

Jerry He commented on HBASE-11772:
----------------------------------

Still looking into the TestHRegionServerBulkLoad failure.

This is how the test works.
It has a single test case: testAtomicBulkLoad.
It creates a separate thread that continuously bulk loads hfiles (multiple column families)
into the minicluster.
Each bulk load iteration writes the same value to all rows and columns (e.g. all values
loaded in iteration 1 are 000001).
At the same time, the test creates multiple threads that continuously create table scanners
to scan the same table.
The assertion is that, for each table scanner, the values in all rows and columns are
the same.
The purpose is to test whether the bulk load is atomic.
This makes sense because each scanner should see only the result of a complete bulk load
iteration.
It should not see a partial result from another bulk load iteration.
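
A minimal sketch of the per-scanner check described above, written as plain Java with no HBase dependency. The class and method names are hypothetical and are not taken from TestHRegionServerBulkLoad; the list stands in for the values one scanner returned.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the per-scanner assertion; not the actual test code.
class UniformScanCheck {
    // Returns true when every value seen by one scanner is identical,
    // i.e. the scanner observed exactly one complete bulk load iteration.
    static boolean isUniform(List<String> valuesSeenByOneScanner) {
        if (valuesSeenByOneScanner.isEmpty()) {
            return true; // nothing loaded yet, nothing to contradict
        }
        String first = valuesSeenByOneScanner.get(0);
        for (String value : valuesSeenByOneScanner) {
            if (!first.equals(value)) {
                return false; // values from two iterations were mixed
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // All cells come from iteration 1: the bulk load looked atomic.
        System.out.println(isUniform(Arrays.asList("000001", "000001", "000001")));
        // One cell comes from iteration 2: the scanner saw a partial load.
        System.out.println(isUniform(Arrays.asList("000001", "000002", "000001")));
    }
}
```

The first call prints true and the second prints false, which is the condition the test would report as a failure.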

Atomicity is guarded by the region write lock in startBulkRegionOperation().  There is no
change in that area.

One thing to note is that the test does not add BULKLOAD_TIME_KEY to the bulk load hfiles,
so isBulkLoadResult() is false for all of them.

I tried adding BULKLOAD_TIME_KEY when creating the bulk load files.  The test then failed,
with a failure identical to the one seen after applying this JIRA's patch.

> Bulk load mvcc and seqId issues with native hfiles
> --------------------------------------------------
>
>                 Key: HBASE-11772
>                 URL: https://issues.apache.org/jira/browse/HBASE-11772
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.5
>            Reporter: Jerry He
>            Assignee: Jerry He
>            Priority: Critical
>             Fix For: 0.99.0, 1.0.0, 2.0.0, 0.98.7
>
>         Attachments: HBASE-11772-0.98.patch, HBASE-11772-master-v1.patch
>
>
> There are mvcc and seqId issues when bulk load native hfiles -- meaning hfiles that are direct file copy-out from hbase, not from HFileOutputFormat job.
> There are differences between these two types of hfiles.
> Native hfiles have possible non-zero MAX_MEMSTORE_TS_KEY value and non-zero mvcc values in cells.
> Native hfiles also have MAX_SEQ_ID_KEY.
> Native hfiles do not have BULKLOAD_TIME_KEY.
> Here are a couple of problems I observed when bulk load native hfiles.
> 1.  Cells in newly bulk loaded hfiles can be invisible to scan.
> It is easy to re-create.
> Bulk load a native hfile that has a larger mvcc value in cells, e.g 10
> If the current readpoint when initiating a scan is less than 10, the cells in the new hfile are skipped, thus become invisible.
> We don't reset the readpoint of a region after bulk load.
> 2. The current StoreFile.isBulkLoadResult() is implemented as:
> {code}
> return metadataMap.containsKey(BULKLOAD_TIME_KEY)
> {code}
> which does not detect bulkloaded native hfiles.
> 3. Another observed problem is possible data loss during log recovery. 
> It is similar to HBASE-10958 reported by [~jdcryans]. Borrow the re-create steps from HBASE-10958.
> 1) Create an empty table
> 2) Put one row in it (let's say it gets seqid 1)
> 3) Bulk load one native hfile with large seqId ( e.g. 100). The native hfile can be obtained by copying out from existing table.
> 4) Kill the region server that holds the table's region.
> Scan the table once the region is made available again. The first row, at seqid 1, will be missing since the HFile with seqid 100 makes us believe that everything that came before it was flushed.
> The problem 3 is probably related to 2. We will be ok if we get the appended seqId during bulk load instead of 100 from inside the file.
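
Problems 1 and 3 in the quoted description both come down to a single numeric comparison. The sketch below is a simplified model in plain Java, not HBase code: the class and method names are hypothetical, and the two rules are taken only from the wording of the description (cells with an mvcc value above the scan's readpoint are skipped, and edits with a seqId not newer than the largest store file seqId are treated as already flushed).

```java
// Simplified model of the two comparisons in the description; not HBase's implementation.
class NativeHFileModel {
    // Problem 1: a scan started at readPoint skips cells whose mvcc value is larger.
    static boolean isVisible(long cellMvcc, long readPoint) {
        return cellMvcc <= readPoint;
    }

    // Problem 3: during log recovery, an edit is replayed only if its seqId is
    // newer than the largest seqId claimed by any store file.
    static boolean shouldReplay(long editSeqId, long maxStoreFileSeqId) {
        return editSeqId > maxStoreFileSeqId;
    }

    public static void main(String[] args) {
        // Native hfile cell with mvcc 10, scan readpoint 5: the cell is invisible.
        System.out.println(isVisible(10, 5));     // false
        // Put at seqid 1, bulk-loaded native hfile claims seqid 100: the put is not replayed.
        System.out.println(shouldReplay(1, 100)); // false
    }
}
```

In this model the row written at seqid 1 is lost only because the value 100 is read from inside the copied file.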



--
This message was sent by Atlassian JIRA
(v6.2#6252)
