hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10958) [dataloss] Bulk loading with seqids can prevent some log entries from being replayed
Date Wed, 16 Apr 2014 04:18:21 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13970404#comment-13970404 ]

Andrew Purtell commented on HBASE-10958:
----------------------------------------

We expect READ and WRITE perms to be granted in a fine-grained way to constrain who can do
individual ops that only collectively add up to cluster-impacting events like compactions,
splits, and flushes. For actions that can have a global cluster impact, we'd like ADMIN to be
granted sparingly to admins or delegates. IIRC enable and disable are ADMIN actions also, since
disabling or enabling a 10000-region table has consequences. CREATE is kind of a middle ground
for schema reads and updates, but in terms of schema updates that's splitting hairs, I suppose,
since a schema update of said large table would also have consequences of the same scale.

Bulk load is a special snowflake because it's a series of puts (so, WRITE), yet obviously more
than that: as mentioned, we need to flush, and moving files into place will probably kick off
a compaction. Making bulk load an ADMIN action, or CREATE, makes sense to me also.

> [dataloss] Bulk loading with seqids can prevent some log entries from being replayed
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-10958
>                 URL: https://issues.apache.org/jira/browse/HBASE-10958
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.96.2, 0.98.1, 0.94.18
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.99.0, 0.94.19, 0.98.2, 0.96.3
>
>         Attachments: HBASE-10958-less-intrusive-hack-0.96.patch, HBASE-10958-quick-hack-0.96.patch,
>                      HBASE-10958-v2.patch, HBASE-10958.patch
>
>
> We found an issue with bulk loads causing data loss when assigning sequence ids (HBASE-6630)
> that is triggered when replaying recovered edits. We're nicknaming this issue *Blindspot*.
> The problem is that the sequence id given to a bulk loaded file is higher than those
> of the edits in the region's memstore. When replaying recovered edits, the rule to skip some
> of them is that they have to be _lower than the highest sequence id_. In other words, the
> edits that have a sequence id lower than the highest one in the store files *should* have
> also been flushed. This is not the case with bulk loaded files since we now have an HFile
> with a sequence id higher than unflushed edits.
> The log recovery code takes this into account by simply skipping the bulk loaded files,
> but this "bulk loaded status" is *lost* on compaction. The edits in the logs that have a sequence
> id lower than the bulk loaded file that got compacted are put in a blind spot and are skipped
> during replay.
> Here's the easiest way to recreate this issue:
>  - Create an empty table
>  - Put one row in it (let's say it gets seqid 1)
>  - Bulk load one file (it gets seqid 2). I used ImportTsv and set hbase.mapreduce.bulkload.assign.sequenceNumbers.
>  - Bulk load a second file the same way (it gets seqid 3).
>  - Major compact the table (the new file has seqid 3 and isn't considered bulk loaded).
>  - Kill the region server that holds the table's region.
>  - Scan the table once the region is made available again. The first row, at seqid 1,
>    will be missing since the HFile with seqid 3 makes us believe that everything that came before
>    it was flushed.
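
The reproduction steps above can be modeled with a small Python sketch. This is a toy simulation, not HBase code: the names StoreFile, replay_threshold, replayed_edits, and major_compact are hypothetical, and the skip rule is simplified to "replay only the edits whose seqid is higher than the highest seqid among store files that are not flagged as bulk loaded". It only illustrates how losing the bulk loaded flag on compaction creates the blind spot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreFile:
    """Toy stand-in for an HFile: only the fields the skip rule looks at."""
    max_seq_id: int
    bulk_loaded: bool


def replay_threshold(store_files):
    """Highest seqid among files NOT flagged as bulk loaded (-1 if none).

    Assumption: this mirrors the description's statement that log recovery
    skips bulk loaded files when computing what has already been flushed.
    """
    return max((f.max_seq_id for f in store_files if not f.bulk_loaded), default=-1)


def replayed_edits(wal_seq_ids, store_files):
    """Edits at or below the threshold are assumed flushed and are skipped."""
    threshold = replay_threshold(store_files)
    return [seq_id for seq_id in wal_seq_ids if seq_id > threshold]


def major_compact(store_files):
    """Model of the bug: the merged file keeps the highest seqid but is no
    longer considered bulk loaded."""
    highest = max(f.max_seq_id for f in store_files)
    return [StoreFile(max_seq_id=highest, bulk_loaded=False)]


# One put (seqid 1) that exists only in the memstore and the log.
wal = [1]
# Two bulk loaded files with assigned seqids 2 and 3.
files = [StoreFile(2, bulk_loaded=True), StoreFile(3, bulk_loaded=True)]

before_compaction = replayed_edits(wal, files)
after_compaction = replayed_edits(wal, major_compact(files))

print("replayed before compaction:", before_compaction)
print("replayed after compaction: ", after_compaction)
```

In this model the edit at seqid 1 is replayed while the files still carry the bulk loaded flag, and is skipped once the compacted file with seqid 3 is treated as a regular flush.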



--
This message was sent by Atlassian JIRA
(v6.2#6252)
