hbase-issues mailing list archives

From "Mikhail Antonov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families
Date Sun, 03 Jul 2016 10:08:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360492#comment-15360492 ]

Mikhail Antonov commented on HBASE-16074:
-----------------------------------------

[~stack]

So I gave up on the idea of reproducing this on a minicluster. It is pretty unreliable,
and given the number of iterations required and the time it takes to run a reasonable number
of local ITBLL iterations, it's not really that much faster than doing a manual bisect and
redeploying every time on a real cluster.

So I started off with 17b39763c96aaca45208cba0ce9ce4fb931eb959 (where branch-1.3 was cut from
branch-1; this one is good), and after 7-8 steps it looks like the commit that causes the
problem is 4b69faa1903303419dfcf027a2268524816c7a35, HBASE-15650. The previous commit doesn't
seem to lose any data on the verify step over multiple iterations; this one lost data in about
3 out of 4 consecutive runs. Looking into why things break.
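
For reference, the manual bisect described above can be sketched roughly as follows. This is
only an illustration, not the procedure actually used on the cluster: `first_bad`,
`flaky_check`, `steps_needed` and the `run_once` callback are hypothetical names, and
`run_once` stands in for "deploy this commit, run ITBLL, report whether verify lost data".
Because the failure is intermittent, each commit is tried several times before it is
treated as good.

```python
import math
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], looks_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which looks_bad() is true.

    Assumes commits[0] is known good, commits[-1] is known bad, and every
    commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if looks_bad(commits[mid]):
            hi = mid  # first bad commit is at mid or earlier
        else:
            lo = mid  # first bad commit is after mid
    return commits[hi]


def flaky_check(run_once: Callable[[str], bool], runs: int = 4) -> Callable[[str], bool]:
    """Treat a commit as bad if any of `runs` attempts reports lost data.

    run_once(commit) is a hypothetical callback that returns True when a
    single verify run on that commit loses data.
    """
    def looks_bad(commit: str) -> bool:
        # any() stops at the first failing run, so bad commits are cheap to confirm.
        return any(run_once(commit) for _ in range(runs))
    return looks_bad


def steps_needed(span: int) -> int:
    """Upper bound on bisect steps when good and bad are `span` commits apart."""
    return math.ceil(math.log2(span))
```

Since each step halves the range, 7-8 steps corresponds to a good-to-bad distance on the
order of 2^7 to 2^8 commits. The repeated runs matter: a commit that fails only about 3 out
of 4 times can pass a single run and send the bisect in the wrong direction, and several
passes in a row are needed before calling a commit good.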



> ITBLL fails, reports lost big or tiny families
> ----------------------------------------------
>
>                 Key: HBASE-16074
>                 URL: https://issues.apache.org/jira/browse/HBASE-16074
>             Project: HBase
>          Issue Type: Bug
>          Components: integration tests
>    Affects Versions: 1.3.0
>            Reporter: Mikhail Antonov
>            Assignee: Mikhail Antonov
>            Priority: Blocker
>             Fix For: 1.3.0
>
>         Attachments: changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue or an env setup issue, but I need to figure it out. Opening this to raise awareness and see if anyone has seen this recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
