accumulo-notifications mailing list archives

From "Mike Drob (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1940) Data file in !METADATA differs from in memory data
Date Wed, 11 Dec 2013 18:10:07 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845594#comment-13845594 ]

Mike Drob commented on ACCUMULO-1940:
-------------------------------------

[~ecn]/[~elserj], is this resolved?

> Data file in !METADATA differs from in memory data
> --------------------------------------------------
>
>                 Key: ACCUMULO-1940
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1940
>             Project: Accumulo
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.5.0
>            Reporter: Josh Elser
>             Fix For: 1.4.5, 1.5.1, 1.6.0
>
>
> Found during CI run with agitation.
> Got the first two error messages 5 times (presumably inside a retry-on-failure block):
> {noformat}
> Failed to do close consistency check for tablet c;79d0ab;7870a
> 	java.lang.RuntimeException: Data file in !METADATA differ from in memory data c;79d0ab;7870a  {/t-0005h1j/A0005n8k.rf=797350457 19198312, /t-0005h1j/C0005skm.rf=798078368 19322025, /t-0005h1j/C0005tet.rf=89783168 2196349, /t-0005h1j/C0005u20.rf=90979448 2227972, /t-0005h1j/F0005u0v.rf=23410023 582233, /t-0005h1j/F0005u2p.rf=21958551 547159, /t-0005h1j/F0005u3g.rf=14395121 358893}  {/t-0005h1j/A0005n8k.rf=797350457 19198312, /t-0005h1j/C0005skm.rf=798078368 19322025, /t-0005h1j/C0005tet.rf=89783168 2196349, /t-0005h1j/C0005u20.rf=90979448 2227972, /t-0005h1j/F0005u2p.rf=21958551 547159, /t-0005h1j/F0005u3g.rf=14395121 358893}
> 		at org.apache.accumulo.server.tabletserver.Tablet.closeConsistencyCheck(Tablet.java:2847)
> 		at org.apache.accumulo.server.tabletserver.Tablet.completeClose(Tablet.java:2780)
> 		at org.apache.accumulo.server.tabletserver.Tablet.close(Tablet.java:2658)
> 		at org.apache.accumulo.server.tabletserver.TabletServer$UnloadTabletHandler.run(TabletServer.java:2357)
> 		at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
> 		at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> 		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 		at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> 		at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
> 		at java.lang.Thread.run(Thread.java:744)
> {noformat}
> Then we logged that the consistency check had failed:
> {noformat}
> Consistency check fails, retrying java.lang.RuntimeException: Failed to do close consistency check for tablet c;79d0ab;7870a
> {noformat}
> In the end, we gave up and closed it anyway:
> {noformat}
> Tablet closed consistency check has failed for c;79d0ab;7870a giving up and closing
> {noformat}
> Before all of this happened, we tried to bring this tablet online after a failure on
> a new tserver. During the minor compaction (minc) that is part of the recovery process,
> we failed to get the lease on the .rf_tmp file we tried to create. This failed a couple
> of times, but we eventually got the tmp file we needed, the recovery process completed,
> and we could bring the tablet online. The difference between the in-memory version and
> the !METADATA version was the one flushed rfile that we created during this recovery
> process.
> The problem eventually fixed itself because the tablet was migrated to a different server
> and we just took what was (correctly) in the !METADATA table.
> It is still unknown how the flushed RFile went missing from the DatafileManager's
> copy.
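
To see exactly which file the two maps in the RuntimeException message disagree on, the logged file sets can be diffed by name. The following is only a minimal sketch: the class DatafileDiff and its methods are hypothetical helpers, not Accumulo code, and the values are the two logged numbers per file kept as opaque strings. Which map is the !METADATA copy and which is the in-memory copy is not stated in the log itself; the sketch only labels them by the order in which they were printed.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for reading the log above; not part of Accumulo.
class DatafileDiff {

    // File names present in `left` but absent from `right`.
    static Set<String> missingFrom(Map<String, String> left, Map<String, String> right) {
        Set<String> diff = new TreeSet<>(left.keySet());
        diff.removeAll(right.keySet());
        return diff;
    }

    // Second map printed in the exception message (six files).
    static Map<String, String> secondLogged() {
        Map<String, String> m = new TreeMap<>();
        m.put("/t-0005h1j/A0005n8k.rf", "797350457 19198312");
        m.put("/t-0005h1j/C0005skm.rf", "798078368 19322025");
        m.put("/t-0005h1j/C0005tet.rf", "89783168 2196349");
        m.put("/t-0005h1j/C0005u20.rf", "90979448 2227972");
        m.put("/t-0005h1j/F0005u2p.rf", "21958551 547159");
        m.put("/t-0005h1j/F0005u3g.rf", "14395121 358893");
        return m;
    }

    // First map printed in the exception message: the same six files plus one more.
    static Map<String, String> firstLogged() {
        Map<String, String> m = new TreeMap<>(secondLogged());
        m.put("/t-0005h1j/F0005u0v.rf", "23410023 582233");
        return m;
    }

    public static void main(String[] args) {
        // Files only in the first logged map: prints [/t-0005h1j/F0005u0v.rf]
        System.out.println(missingFrom(firstLogged(), secondLogged()));
        // Files only in the second logged map: prints []
        System.out.println(missingFrom(secondLogged(), firstLogged()));
    }
}
```

The only difference is /t-0005h1j/F0005u0v.rf, which matches the description's statement that the two views differ by a single flushed rfile.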



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
