hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11368) Multi-column family BulkLoad fails if compactions go on too long
Date Fri, 14 Aug 2015 08:55:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14696724#comment-14696724 ]

Enis Soztutar commented on HBASE-11368:

bq. How? Would the following case be true without the bulk load getting the region write lock?
We do not do this now, but in theory it can be done the same way as regular writes:
 - obtain a new seqId as a write transaction.
 - bulk load all files across CFs with that seqId.
 - advance the mvcc read point only when all bulk loads are complete.
This way, scanners are guaranteed to observe the bulk-loaded data atomically
without the region write lock.
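
The three steps above can be sketched with a toy model. This is not HBase code: the class BulkLoadMvccSketch, its methods, and the staging paths are all hypothetical, and it assumes a single writer so that transactions complete in seqId order. It only illustrates why files loaded under one seqId become visible together once the read point advances.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these class and method names are hypothetical, not HBase APIs.
final class BulkLoadMvccSketch {
    /** One bulk-loaded file, tagged with the seqId of the transaction that added it. */
    static final class LoadedFile {
        final String family;
        final String path;
        final long seqId;

        LoadedFile(String family, String path, long seqId) {
            this.family = family;
            this.path = path;
            this.seqId = seqId;
        }
    }

    private long nextSeqId = 1;  // next sequence id to hand out
    private long readPoint = 0;  // scanners only see files with seqId <= readPoint
    private final List<LoadedFile> files = new ArrayList<>();

    /** Step 1: obtain a new seqId as a write transaction. */
    synchronized long beginWrite() {
        return nextSeqId++;
    }

    /** Step 2: attach a file to a column family under the transaction's seqId. */
    synchronized void addFile(String family, String path, long seqId) {
        files.add(new LoadedFile(family, path, seqId));
    }

    /**
     * Step 3: advance the read point once every file of the transaction is in place.
     * Simplification (assumption): one writer, so transactions complete in seqId order.
     */
    synchronized void completeWrite(long seqId) {
        readPoint = Math.max(readPoint, seqId);
    }

    /** Paths that a scanner opened right now would be allowed to read. */
    synchronized List<String> visiblePaths() {
        List<String> visible = new ArrayList<>();
        for (LoadedFile f : files) {
            if (f.seqId <= readPoint) {
                visible.add(f.path);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        BulkLoadMvccSketch region = new BulkLoadMvccSketch();
        long seqId = region.beginWrite();
        region.addFile("cf1", "/staging/cf1/hfile", seqId);
        System.out.println(region.visiblePaths()); // cf1 file is loaded but not yet visible
        region.addFile("cf2", "/staging/cf2/hfile", seqId);
        region.completeWrite(seqId);
        System.out.println(region.visiblePaths()); // both files become visible together
    }
}
```

In this model a scanner never sees the cf1 file without the cf2 file, because visibility depends only on the read point and not on when each file was attached.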
bq. In the 0.98 code line, we don't have seqid, and the atomicity is still guaranteed there.
Yes. It is not worth changing the 0.98 line.
bq. I think it is being propagated properly to the scanner. Think about the same notifyChangedReadersObservers
is being used at the end of compaction and flushes as well. The reset of the readers should
I am not sure about that. Agreed that the cells at the store level will actually get re-ordered,
but the heap at the region level is never re-ordered. So, after a bulk load, the ordering
of store scanners at the region level might change, but the scanner will miss it if I understand
this correctly. 
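
The general hazard can be shown with a small stand-alone sketch. This is not HBase's scanner heap code: ToyScanner, StaleHeapSketch, and reseek are hypothetical names, and java.util.PriorityQueue stands in for the region-level heap. The point is only that a heap orders elements when they are inserted, so if an element's sort key changes underneath it, the heap keeps the old order until the element is removed and re-added.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Toy illustration only: hypothetical names, not HBase classes.
final class StaleHeapSketch {
    /** Stand-in for a store-level scanner, ordered by the key it currently points at. */
    static final class ToyScanner {
        final String name;
        int currentKey;

        ToyScanner(String name, int currentKey) {
            this.name = name;
            this.currentKey = currentKey;
        }
    }

    /** Stand-in for a region-level heap over store scanners. */
    static PriorityQueue<ToyScanner> newHeap() {
        return new PriorityQueue<>(Comparator.comparingInt((ToyScanner s) -> s.currentKey));
    }

    /** Re-position a scanner and re-insert it so the heap re-evaluates its place. */
    static void reseek(PriorityQueue<ToyScanner> heap, ToyScanner scanner, int newKey) {
        heap.remove(scanner);
        scanner.currentKey = newKey;
        heap.add(scanner);
    }

    public static void main(String[] args) {
        ToyScanner a = new ToyScanner("a", 1);
        ToyScanner b = new ToyScanner("b", 2);
        PriorityQueue<ToyScanner> heap = newHeap();
        heap.add(a);
        heap.add(b);

        a.currentKey = 10;                    // key changes underneath the heap
        System.out.println(heap.peek().name); // head is still the scanner inserted first

        reseek(heap, a, 10);                  // remove and re-add to restore heap order
        System.out.println(heap.peek().name); // head is now the scanner with the smallest key
    }
}
```

Whether the region-level heap in HBase actually behaves this way after a bulk load is exactly the open question in this comment; the sketch only shows why a store-level reset alone would not be enough if it does.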
bq. Atomicity may be a false blanket considering HBASE-4652 is still unresolved.
Very good point. We need a transactional commit for the bulk-loaded (BL) files.

> Multi-column family BulkLoad fails if compactions go on too long
> ----------------------------------------------------------------
>                 Key: HBASE-11368
>                 URL: https://issues.apache.org/jira/browse/HBASE-11368
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Qiang Tian
>         Attachments: hbase-11368-0.98.5.patch, hbase11368-master.patch, key_stacktrace_hbase10882.TXT,
> Compactions take a read lock.  If a multi-column family region, before bulk loading,
> we want to take a write lock on the region.  If the compaction takes too long, the bulk load
> Various recipes include:
> + Making smaller regions (lame)
> + [~victorunique] suggests major compacting just before bulk loading over in HBASE-10882
> as a work around.
> Does the compaction need a read lock for that long?  Does the bulk load need a full write
> lock when multiple column families?  Can we fail more gracefully at least?

This message was sent by Atlassian JIRA