accumulo-dev mailing list archives

From "Eric Newton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-681) close consistency check failure
Date Tue, 10 Jul 2012 13:43:33 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410338#comment-13410338 ]

Eric Newton commented on ACCUMULO-681:
--------------------------------------

||The files on disk:||
|/root_tablet/A0002pyh.rf|
|/root_tablet/F0002pu6.rf| 
|/root_tablet/F0002qg7.rf| 
|/root_tablet/F0002qg8.rf| 
|/root_tablet/F0002ql1.rf|

||The files in memory:||
|/root_tablet/A0002mqs.rf|
|/root_tablet/F0002mrf.rf|
|/root_tablet/F0002n8x.rf|
|/root_tablet/F0002oda.rf|
|/root_tablet/F0002orh.rf|
|/root_tablet/F0002p03.rf|
|/root_tablet/F0002p3m.rf|
|/root_tablet/F0002p85.rf|
|/root_tablet/F0002p87.rf|
|/root_tablet/F0002pb8.rf|
|/root_tablet/F0002pi1.rf|
|/root_tablet/F0002pu2.rf|
|/root_tablet/F0002pu6.rf|
|/root_tablet/F0002qg7.rf|
|/root_tablet/F0002qg8.rf|
|/root_tablet/F0002ql1.rf|
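
A quick way to see how the two lists differ is a set difference. The sketch below only restates the two tables above as Python sets; the variable names are illustrative and not taken from Accumulo.

```python
# File lists copied from the two tables above.
on_disk = {
    "/root_tablet/A0002pyh.rf",
    "/root_tablet/F0002pu6.rf",
    "/root_tablet/F0002qg7.rf",
    "/root_tablet/F0002qg8.rf",
    "/root_tablet/F0002ql1.rf",
}
in_memory = {
    "/root_tablet/A0002mqs.rf",
    "/root_tablet/F0002mrf.rf",
    "/root_tablet/F0002n8x.rf",
    "/root_tablet/F0002oda.rf",
    "/root_tablet/F0002orh.rf",
    "/root_tablet/F0002p03.rf",
    "/root_tablet/F0002p3m.rf",
    "/root_tablet/F0002p85.rf",
    "/root_tablet/F0002p87.rf",
    "/root_tablet/F0002pb8.rf",
    "/root_tablet/F0002pi1.rf",
    "/root_tablet/F0002pu2.rf",
    "/root_tablet/F0002pu6.rf",
    "/root_tablet/F0002qg7.rf",
    "/root_tablet/F0002qg8.rf",
    "/root_tablet/F0002ql1.rf",
}

only_on_disk = sorted(on_disk - in_memory)    # files the server does not know about
only_in_memory = sorted(in_memory - on_disk)  # files the server still expects
shared = sorted(on_disk & in_memory)          # files both views agree on

print("only on disk:  ", only_on_disk)
print("only in memory:", only_in_memory)
print("shared:        ", shared)
```

With these inputs, one file (A0002pyh) exists only on disk, twelve files exist only in memory, and four files appear in both lists.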

So, what happened to A0002mqs, for example?

It was created by node3 at 11:58:11:
{noformat}
11:58:11,159 [tabletserver.Tablet] TABLET_HIST: !0;!0<< MajC [/root_tablet/A0002l9z.rf, /root_tablet/F0002la1.rf] --> /root_tablet/A0002mqs.rf
{noformat}

It was major compacted at 12:00:11 by node4:
{noformat}
12:00:11,786 [tabletserver.Tablet] TABLET_HIST: !0;!0<< MajC [/root_tablet/A0002mqs.rf, /root_tablet/F0002mrf.rf, /root_tablet/F0002n8x.rf, /root_tablet/F0002oda.rf, /root_tablet/F0002orh.rf, /root_tablet/F0002p03.rf, /root_tablet/F0002p3m.rf, /root_tablet/F0002p85.rf, /root_tablet/F0002p87.rf, /root_tablet/F0002pb8.rf] --> /root_tablet/C0002pyf.rf
{noformat}
{noformat}
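
Tracing a file through the MajC entries by hand gets tedious, so here is a rough sketch of how it could be scripted. It assumes the TABLET_HIST line layout shown in the two excerpts above; the regex and the `compaction_edges` helper are hypothetical and not part of Accumulo.

```python
import re

# Assumed layout: "TABLET_HIST: <extent> MajC [<inputs>] --> <output>"
MAJC = re.compile(r"TABLET_HIST: (\S+) MajC \[(.*?)\] --> (\S+)")


def compaction_edges(log_text):
    """Map each input file to the output of the MajC entry that consumed it."""
    # Collapse whitespace so entries wrapped across lines are rejoined.
    flat = " ".join(log_text.split())
    consumed_by = {}
    for _extent, inputs, output in MAJC.findall(flat):
        for path in inputs.split(","):
            consumed_by[path.strip()] = output
    return consumed_by


# The two MajC entries quoted above, including the line wrapping.
log_text = """
11:58:11,159 [tabletserver.Tablet] TABLET_HIST: !0;!0<< MajC [/root_tablet/A0002l9z.rf,
/root_tablet/F0002la1.rf] --> /root_tablet/A0002mqs.rf
12:00:11,786 [tabletserver.Tablet] TABLET_HIST: !0;!0<< MajC [/root_tablet/A0002mqs.rf,
/root_tablet/F0002mrf.rf, /root_tablet/F0002n8x.rf, /root_tablet/F0002oda.rf, /root_tablet/F0002orh.rf,
/root_tablet/F0002p03.rf, /root_tablet/F0002p3m.rf, /root_tablet/F0002p85.rf, /root_tablet/F0002p87.rf,
/root_tablet/F0002pb8.rf] --> /root_tablet/C0002pyf.rf
"""

edges = compaction_edges(log_text)
print(edges["/root_tablet/A0002mqs.rf"])  # output file of the compaction that consumed A0002mqs
```

For the two entries above, the lookup shows that A0002mqs.rf was consumed by the compaction that wrote C0002pyf.rf.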

Let's see where the tablet lived between 11:58 and 12:00:

{noformat}
10 11:56:24,664 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node6
10 11:57:04,971 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node6
10 11:57:04,986 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node9
10 11:57:06,991 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node9
10 11:57:07,074 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node12
10 11:57:50,187 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node12
10 11:57:50,301 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node3
10 11:59:04,506 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node3
10 11:59:04,527 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node9
10 11:59:15,736 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node9
10 11:59:15,758 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node10
10 11:59:20,652 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node10
10 11:59:20,713 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node11
10 11:59:24,687 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node11
10 11:59:24,743 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node9
10 11:59:35,354 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node9
10 11:59:35,530 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node11
10 11:59:39,069 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node11
10 11:59:39,117 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node12
10 11:59:39,169 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node12
10 11:59:39,325 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node10
10 11:59:41,532 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node10
10 11:59:41,696 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node9
10 11:59:43,803 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node9
10 11:59:43,903 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node6
10 11:59:58,634 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node6
10 11:59:58,730 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node7
10 12:00:00,676 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node7
10 12:00:00,823 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node10
10 12:00:07,604 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node10
10 12:00:07,668 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node10
10 12:00:07,681 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node10
10 12:00:07,765 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node4
10 12:00:07,783 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node10  <---
{noformat}

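Well, the tablet certainly moves around. The double-open looks suspicious: node4 opened the tablet at 12:00:07,765 and node10 opened it again at 12:00:07,783, with no close logged in between.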
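
A check for this kind of overlap can be sketched in a few lines. The code below assumes the merged line layout of the excerpt above, where the trailing node name is an annotation added while combining the per-node logs; the regex and the `find_overlapping_opens` helper are hypothetical.

```python
import re

# Assumed layout: "<day> <time> [tabletserver.Tablet] TABLET_HIST: <extent> opened|closed <node>"
EVENT = re.compile(
    r"^\d+ (\d{2}:\d{2}:\d{2},\d{3}) \[tabletserver\.Tablet\] "
    r"TABLET_HIST: (\S+) (opened|closed)\s+(\S+)"
)


def find_overlapping_opens(lines):
    """Return (time, node, current_holder) for each open of a tablet that is still open."""
    holder = {}  # tablet extent -> node that currently has it open
    suspicious = []
    for line in lines:
        match = EVENT.match(line.strip())
        if not match:
            continue
        time, extent, action, node = match.groups()
        if action == "opened":
            if extent in holder:
                suspicious.append((time, node, holder[extent]))
            holder[extent] = node
        elif holder.get(extent) == node:
            # Only a close from the current holder releases the tablet.
            del holder[extent]
    return suspicious


# The last five events from the excerpt above.
sample = [
    "10 12:00:07,604 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node10",
    "10 12:00:07,668 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node10",
    "10 12:00:07,681 [tabletserver.Tablet] TABLET_HIST: !0;!0<< closed   node10",
    "10 12:00:07,765 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node4",
    "10 12:00:07,783 [tabletserver.Tablet] TABLET_HIST: !0;!0<< opened    node10  <---",
]

overlaps = find_overlapping_opens(sample)
print(overlaps)  # [('12:00:07,783', 'node10', 'node4')]
```

The sketch tracks a single holder per tablet, which is enough to flag the node10 open while node4 still holds the root tablet. It says nothing about why the second assignment happened.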

                
> close consistency check failure
> -------------------------------
>
>                 Key: ACCUMULO-681
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-681
>             Project: Accumulo
>          Issue Type: Bug
>    Affects Versions: 1.5.0-SNAPSHOT
>         Environment: 10 node test cluster running randomwalk Concurrent graph
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> I recently added tablet server admin shutdown to the Concurrent randomwalk test.
> After an hour, the following table problem appeared:
> {noformat}
> 10 12:00:21,755 [tabletserver.Tablet] ERROR: Data file in !METADATA differ from in memory data !0;!0<<
> {noformat}
> Note that this is the root tablet, so some other process was updating the files behind this server's back; I suspect inconsistent assignment.


        
