hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-18397) StoreFile accounting issues on branch-1.3 and branch-1
Date Thu, 05 Oct 2017 17:36:00 GMT

     [ https://issues.apache.org/jira/browse/HBASE-18397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-18397:
--------------------------
    Description: 
This jira is an umbrella for a set of issues around store file accounting on branch-1.3 and
branch-1 (I believe).

At this point I believe that many or most of those issues are related to the backport of HBASE-13082
done a long time ago. A number of related issues were identified and fixed previously, but
some have yet to be debugged and fixed. I think that this class of problems prevents us from
releasing 1.3.2 and moving the stable pointer to branch-1.3 at this point, so marking as critical.

Below is an overview by Andrew Purtell from the dev list (Subject: _Re: Branch 1.4 update_):
{quote}
Let me provide some context.

The root issue was fallout from a locking change introduced just prior to
release of 1.3. That change was HBASE-13082. Lars H proposed a change. It
was committed to trunk but quickly reverted. After the revert Lars decided
to drop the work rather than fix it for reapplication. However, the work
was picked up by others and eventually found its way into branch-1, then
branch-1.3, then 1.3.x releases. There were unintended side effects,
causing bugs. The umbrella issue HBASE-18397 tracks a bunch of fix work the
community has done since. The last known bug fix was HBASE-18771, found and
fixed by our Abhishek. The last known change I know of was work I did on
HBASE-18786 to remove some dodgy exception handling (prefer aborts to
silent data corruption). Is this enough to move the stable pointer?
According to our testing at Salesforce, yes, so far. We have yet to run in
full production. Give us a few months of that and my answer will be
unconditional one way or another. According to some offline conversation
with Mikhail and Gary, the answer is in fact no, they still have one hairy
use case causing occasional problems that look like more of this, but that
feedback predates HBASE-18771.{quote}

  was:
This jira is an umbrella for a set of issues around store file accounting on branch-1.3 and
branch-1 (I believe).

At this point I do believe that many / most of those issues are related to backport of HBASE-13082
done  long time ago. A number of related issues were identified and fixed previously, but
some still yet to be debugged and fixed. I think that this class of problems prevents us from
releasing 1.3.2 and moving stable pointer to branch 1.3 at this point, so marking as critical.

Below is a synopsis by Andrew Purtell from dev list: 
Let me provide some context.

The root issue was fallout from a locking change introduced just prior to
release of 1.3. That change was HBASE-13082. Lars H proposed a change. It
was committed to trunk but quickly reverted. After the revert Lars decided
to drop the work rather than fix it for reapplication. However, the work
was picked up by others and eventually found its way into branch-1, then
branch-1.3, then 1.3.x releases. There were unintended side effects,
causing bugs. The umbrella issue HBASE-18397 tracks a bunch of fix work the
community has done since. The last known bug fix was HBASE-18771, found and
fixed by our Abhishek. The last known change I know of was work I did on
HBASE-18786 to remove some dodgy exception handling (prefer aborts to
silent data corruption). Is this enough to move the stable pointer?
According to our testing at Salesforce, yes, so far. We have yet to run in
full production. Give us a few months of that and my answer will be
unconditional one way or another. According to some offline conversation
with Mikhail and Gary, the answer is in fact no, they still have one hairy
use case causing occasional problems that look like more of this, but that
feedback predates HBASE-18771.


> StoreFile accounting issues on branch-1.3 and branch-1
> ------------------------------------------------------
>
>                 Key: HBASE-18397
>                 URL: https://issues.apache.org/jira/browse/HBASE-18397
>             Project: HBase
>          Issue Type: Umbrella
>          Components: regionserver, Scanners
>    Affects Versions: 1.3.0
>            Reporter: Mikhail Antonov
>            Priority: Critical
>             Fix For: 2.0.0, 1.3.2
>
>
> This jira is an umbrella for a set of issues around store file accounting on branch-1.3
> and branch-1 (I believe).
> At this point I believe that many or most of those issues are related to the backport of
> HBASE-13082 done a long time ago. A number of related issues were identified and fixed previously,
> but some have yet to be debugged and fixed. I think that this class of problems prevents
> us from releasing 1.3.2 and moving the stable pointer to branch-1.3 at this point, so marking
> as critical.
> Below is an overview by Andrew Purtell from the dev list (Subject: _Re: Branch 1.4 update_):
> {quote}
> Let me provide some context.
> The root issue was fallout from a locking change introduced just prior to
> release of 1.3. That change was HBASE-13082. Lars H proposed a change. It
> was committed to trunk but quickly reverted. After the revert Lars decided
> to drop the work rather than fix it for reapplication. However, the work
> was picked up by others and eventually found its way into branch-1, then
> branch-1.3, then 1.3.x releases. There were unintended side effects,
> causing bugs. The umbrella issue HBASE-18397 tracks a bunch of fix work the
> community has done since. The last known bug fix was HBASE-18771, found and
> fixed by our Abhishek. The last known change I know of was work I did on
> HBASE-18786 to remove some dodgy exception handling (prefer aborts to
> silent data corruption). Is this enough to move the stable pointer?
> According to our testing at Salesforce, yes, so far. We have yet to run in
> full production. Give us a few months of that and my answer will be
> unconditional one way or another. According to some offline conversation
> with Mikhail and Gary, the answer is in fact no, they still have one hairy
> use case causing occasional problems that look like more of this, but that
> feedback predates HBASE-18771.{quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
