hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17712) Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound
Date Wed, 01 Mar 2017 11:15:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15889990#comment-15889990 ]

stack commented on HBASE-17712:
-------------------------------

bq. What happens if there is a flush ongoing at the same time?

I see. Looks like cruft built on top of cruft. It's been a while since I was in here. Replacement
of the current set of hfiles was always a little awkward. We didn't want every access going across
a synchronization just to check for the extremely rare case of a change in the store file
set. I'd have to do some archeology to see if the retry on FNFE (FileNotFoundException) was a
compromise so we could do w/o a sync check. Would be coolio if we could purge having to handle FNFE.
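
To make the trade-off above concrete, here is a minimal sketch of the "no sync check, retry on FNFE" pattern. It is not HBase code: the class `OptimisticStoreFileRead`, the `onDisk` and `storeFiles` fields, and the methods `open`, `refreshStoreFiles` and `scan` are all hypothetical names invented for illustration. Readers take an unlocked snapshot of the file set and only refresh it when opening a file fails.

```java
import java.io.FileNotFoundException;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these names are HBase APIs.
class OptimisticStoreFileRead {
    // Stand-in for the files that actually exist on the filesystem.
    static final Set<String> onDisk = ConcurrentHashMap.newKeySet();
    // Reader-visible store file set, replaced wholesale; readers take no lock.
    static final AtomicReference<Set<String>> storeFiles =
        new AtomicReference<Set<String>>(new HashSet<String>());

    // Fails the way a read of an already-removed file would.
    static void open(String file) throws FileNotFoundException {
        if (!onDisk.contains(file)) {
            throw new FileNotFoundException(file);
        }
    }

    // The refresh step: rebuild the visible set from what is on "disk".
    static void refreshStoreFiles() {
        storeFiles.set(new HashSet<String>(onDisk));
    }

    // Scan the current snapshot; on FNFE, refresh and retry up to maxRetries times.
    // Returns the number of files in the snapshot that was scanned successfully.
    static int scan(int maxRetries) {
        for (int attempt = 0; ; attempt++) {
            Set<String> snapshot = storeFiles.get();
            try {
                for (String file : snapshot) {
                    open(file);
                }
                return snapshot.size();
            } catch (FileNotFoundException e) {
                if (attempt >= maxRetries) {
                    throw new UncheckedIOException(e);
                }
                refreshStoreFiles();
            }
        }
    }
}
```

The common path pays only for one volatile read of the reference; the cost is that a stale snapshot is detected late, by an exception, which is the handling this issue proposes to remove or simplify.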

I don't follow the comment on why the call to dropMemstoreContents was added to doDelta by:

{code}
tree 11b5d28bb22d95bd5c6276346f3055412b2d6902
parent dda8f67b2cc9f6ef4ab434beea2a47d461a20a1f
author tedyu <yuzhihong@gmail.com> Wed Aug 24 09:04:47 2016 -0700
committer tedyu <yuzhihong@gmail.com> Wed Aug 24 09:04:47 2016 -0700

HBASE-16304 HRegion#RegionScannerImpl#handleFileNotFoundException may lead to deadlock when
trying to obtain write lock on updatesLock

{code}

Looking at my review of HBASE-16304, my last remark was: "I'm not sure I follow the dropMemstoreContents();
bits. Some more commentary on interrelation might help" ... to which the response was that
there was an explanation (I don't see it...). Ram also asks what it is about later on. It doesn't
look like he got a straight response.

Can you help here [~tedyu]?

> Remove/Simplify the logic of RegionScannerImpl.handleFileNotFound
> -----------------------------------------------------------------
>
>                 Key: HBASE-17712
>                 URL: https://issues.apache.org/jira/browse/HBASE-17712
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Duo Zhang
>             Fix For: 2.0.0, 1.4.0
>
>
> It was introduced in HBASE-13651, and the logic became much more complicated after HBASE-16304
> because of a deadlock issue. It is really tough because sequence ids are involved and the method
> we call was originally used to serve secondary replicas, which do not handle writes.
> In fact, in the 1.x releases, the problem described in HBASE-13651 is gone. We now write
> a compaction marker to the WAL before deleting the compacted files. An RS can only be considered
> dead after its WAL files are all closed, so if the region has already been reassigned, the
> compaction will fail because we cannot write out the compaction marker.
> So theoretically, if we still hit a FileNotFound exception, it should be a critical bug,
> which means we may lose data. I do not think it is a good idea to just eat the exception and
> refresh the store files. Or, even if we want to do this, we can just refresh the store files
> without dropping the memstore contents. This will also simplify the logic a lot.
> Suggestions are welcomed.
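
The ordering argument in the description (marker first, delete second) can be illustrated with a small sketch. This is an assumption-laden toy model, not the HBase implementation: `CompactionMarkerSketch`, its nested `Wal` class and `completeCompaction` are hypothetical names, and a closed WAL is modelled simply as a flag that makes `append` throw.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; Wal and completeCompaction are illustrative names, not HBase APIs.
class CompactionMarkerSketch {
    // Toy write-ahead log: appends fail once the log has been closed.
    static class Wal {
        boolean closed;
        final List<String> entries = new ArrayList<String>();

        void append(String entry) {
            if (closed) {
                throw new IllegalStateException("WAL is closed");
            }
            entries.add(entry);
        }
    }

    // Write the compaction marker first; only swap the files if the append succeeded.
    // Returns false, leaving storeFiles untouched, when the marker cannot be written.
    static boolean completeCompaction(Wal wal, Set<String> storeFiles,
                                      List<String> compacted, String output) {
        try {
            wal.append("COMPACTION " + compacted + " -> " + output);
        } catch (IllegalStateException e) {
            return false;
        }
        storeFiles.removeAll(compacted);
        storeFiles.add(output);
        return true;
    }
}
```

Under this model, a server whose WAL has been closed can never remove files out from under the region's new owner, which is why a FileNotFound exception that still shows up would point at a real bug rather than an expected race.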



