hbase-issues mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6679) RegionServer aborts due to race between compaction and split
Date Wed, 26 Sep 2012 05:54:08 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463548#comment-13463548 ]

Devaraj Das commented on HBASE-6679:
------------------------------------

bq. For sure the regions was not doubly-assigned? Split happened of the region on one server
but on another server, the same region was being compacted? You'd need the master logs to
figure it a dbl-assign

Unfortunately, I didn't save the master logs when the failure happened.

bq. Can you figure a place where we'd be running compactions on a region concurrent w/ our
splitting it? Compacting we take out write lock. Doesnt look like any locks while SplitTransaction
is running (closing parent, it'll need write lock... thats after daughters open though).

I can't figure out a place where this could happen in the natural execution of the regionserver.

bq. Storefiles are an ImmutableList.

Yes, but that could still be exposed to memory-consistency problems when multiple
threads access the object in unsynchronized/non-volatile ways, no?

bq. @Deva

After a long time, someone addressed me by that name :-)

bq. So before this itself the region got closed. I feel the store file list should have been
updated by the time. No ?

Can't say for sure, Ram. There is no guarantee unless the accesses (read/write) are synchronized
or the field is declared volatile.
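
To illustrate the visibility point above, here is a minimal sketch in plain Java. It is not HBase code: the class StoreFileHolder, its field, and its methods are hypothetical names chosen for the example, and it makes no claim about how the actual store file list is declared. It only shows the general Java memory model rule: an immutable list is safe to read once a thread has the reference, but the reference itself has to be published through a volatile field (or under a common lock) for another thread to be guaranteed to see the replacement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical holder used only for illustration; not an HBase class.
class StoreFileHolder {
    // volatile: a write to this field happens-before every later read of it,
    // so a reader that sees the new reference also sees the list contents.
    // Without volatile (or a shared lock) a reader may keep seeing the old list.
    private volatile List<String> storefiles = Collections.emptyList();

    List<String> getStorefiles() {
        return storefiles;
    }

    // Swap in a new immutable snapshot, as a compaction-like step might.
    void replaceStorefiles(List<String> newFiles) {
        storefiles = Collections.unmodifiableList(new ArrayList<>(newFiles));
    }

    public static void main(String[] args) throws InterruptedException {
        StoreFileHolder holder = new StoreFileHolder();

        // Reader spins until it observes the replacement list.
        // The volatile read is what guarantees this loop terminates.
        Thread reader = new Thread(() -> {
            while (holder.getStorefiles().isEmpty()) {
                Thread.yield();
            }
            System.out.println("reader sees " + holder.getStorefiles());
        });
        reader.start();

        holder.replaceStorefiles(Arrays.asList("compacted-1"));
        reader.join();
    }
}
```

If the field were a plain (non-volatile) reference and the accesses were unsynchronized, the same program would have no guarantee that the reader ever observes the new list, which is the concern raised in this comment.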

                
> RegionServer aborts due to race between compaction and split
> ------------------------------------------------------------
>
>                 Key: HBASE-6679
>                 URL: https://issues.apache.org/jira/browse/HBASE-6679
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.92.3
>
>         Attachments: rs-crash-parallel-compact-split.log
>
>
> In our nightlies, we have seen RS aborts due to compaction and split racing. Original
> parent file gets deleted after the compaction, and hence, the daughters don't find the parent
> data file. The RS kills itself when this happens. Will attach a snippet of the relevant RS
> logs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
