hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16074) ITBLL fails, reports lost big or tiny families
Date Mon, 04 Jul 2016 02:35:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360763#comment-15360763 ]

stack commented on HBASE-16074:
-------------------------------

Do you see the output of Log.warn("BAD: MINIMUM IS " + INITIAL_MIN_TIMESTAMP + " " + this) in the logs?

There are two possible issues: 1.) that we'd write a TimeRange that got corrupted because
the setTimeRange wasn't seen by the writer, and 2.) that somehow the read TimeRange
got incorrectly initialized such that all Cells in a File were outside the time range. The
patch should address the first issue but, for the second, it only logs a message.
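
To make the two failure modes concrete, here is a minimal sketch. It is not the HBase code: the class TrackerSketch, its methods, and the sentinel values (Long.MAX_VALUE for "no minimum yet", -1 for "no maximum yet") are assumptions for illustration. The updates are synchronized so that a change made on one thread is visible to the thread that later reads the range, and a tracker still at its sentinels excludes every Cell timestamp.

```java
// Illustrative sketch only. Class, method names and sentinel values are
// hypothetical and are not taken from the HBase source.
final class TrackerSketch {
    // Assumed sentinels meaning "no timestamp has been seen yet".
    static final long INITIAL_MIN = Long.MAX_VALUE;
    static final long INITIAL_MAX = -1L;

    private long min = INITIAL_MIN;
    private long max = INITIAL_MAX;

    // synchronized so an update made by one thread is visible to the
    // thread that later reads or persists the range (issue 1).
    synchronized void include(long ts) {
        if (ts < min) {
            min = ts;
        }
        if (ts > max) {
            max = ts;
        }
    }

    synchronized long getMin() {
        return min;
    }

    synchronized long getMax() {
        return max;
    }

    // True while the tracker still holds both sentinels.
    synchronized boolean isUntouched() {
        return min == INITIAL_MIN && max == INITIAL_MAX;
    }

    // True if ts falls inside the tracked [min, max] range. With the
    // sentinels still in place this is false for every ts (issue 2).
    synchronized boolean mayContain(long ts) {
        return ts >= min && ts <= max;
    }
}
```

A range left at the sentinels rejects any timestamp, which is how a wrongly initialized read-side range could hide every Cell in a file.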

On...

bq. Hm, you deleted ...

No. Just had it call through to another constructor. As is, the constructor could set allTime
to false when it could be true (harmless; it would just slow us down).
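
A rough sketch of the call-through idea, under stated assumptions: RangeSketch, its fields, and the half-open [min, max) check are hypothetical and not copied from the HBase TimeRange class. The no-arg constructor delegates to the two-arg one, which derives allTime from the bounds in a single place, so the flag cannot disagree with them.

```java
// Illustrative sketch only. Names and range semantics are assumptions,
// not the actual HBase TimeRange implementation.
final class RangeSketch {
    private final long min;
    private final long max;
    private final boolean allTime;

    // Default range covers all time; delegates so the flag is set in one place.
    RangeSketch() {
        this(0L, Long.MAX_VALUE);
    }

    RangeSketch(long min, long max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("bad range: " + min + ", " + max);
        }
        this.min = min;
        this.max = max;
        // Derived from the bounds rather than passed in by the caller.
        this.allTime = (min == 0L && max == Long.MAX_VALUE);
    }

    boolean isAllTime() {
        return allTime;
    }

    // Half-open check [min, max). The allTime flag is only a shortcut that
    // skips the two comparisons below.
    boolean withinRange(long ts) {
        if (allTime) {
            return true;
        }
        return ts >= min && ts < max;
    }
}
```

In this sketch, a flag that is false when it could have been true still gives the same answer for any timestamp below max; it just pays for the two comparisons.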

bq. ...TRT shouldn't used on the read path now

Right, but 1. and 2. above could be making it so a Reader has a messed-up TimeRange on it
such that we'd not return Cells from it.

Let me look more at the patch to see if there are other possible holes.



> ITBLL fails, reports lost big or tiny families
> ----------------------------------------------
>
>                 Key: HBASE-16074
>                 URL: https://issues.apache.org/jira/browse/HBASE-16074
>             Project: HBase
>          Issue Type: Bug
>          Components: integration tests
>    Affects Versions: 1.3.0
>            Reporter: Mikhail Antonov
>            Assignee: Mikhail Antonov
>            Priority: Blocker
>             Fix For: 1.3.0
>
>         Attachments: 16074.test.branch-1.3.patch, 16074.test.patch, changes_to_stress_ITBLL.patch, changes_to_stress_ITBLL__a_bit_relaxed_.patch, itbll log with failure, itbll log with success
>
>
> Underlying MR jobs succeed but I'm seeing the following in the logs (mid-size distributed test cluster):
> ERROR test.IntegrationTestBigLinkedList$Verify: Found nodes which lost big or tiny families, count=164
> I do not know exactly yet whether it's a bug, a test issue or an env setup issue, but I need to figure it out. Opening this to raise awareness and see if someone saw that recently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
