zookeeper-user mailing list archives

From Benjamin Jaton <benjamin.ja...@gmail.com>
Subject Re: CRC check failed
Date Fri, 14 Aug 2015 21:13:38 GMT
OK, so those transactions are probably noise then.

I changed the LogFormatter to not stop when it detects CRC errors so that I
could see what nodes were in error.
It turns out that those errors appear on larger nodes (>5MB).
We do override jute.maxbuffer to more than the default 1MB.
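
For anyone who wants to reproduce that kind of scan, here is a minimal sketch of the idea. It is not the actual LogFormatter patch and it does not read real ZooKeeper transaction logs: the record layout (checksum, length, payload) is a simplified assumption, the class and method names are made up for illustration, and Adler32 is used because, to my understanding, that is the checksum FileTxnLog computes. The point is only that a mismatch gets recorded and the loop continues instead of throwing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Adler32;

// Hypothetical helper, not part of ZooKeeper.
class CrcScanSketch {

    /** Appends one record: checksum (long), length (int), payload bytes. */
    static void writeRecord(DataOutputStream out, byte[] payload) throws IOException {
        Adler32 crc = new Adler32();
        crc.update(payload, 0, payload.length);
        out.writeLong(crc.getValue());
        out.writeInt(payload.length);
        out.write(payload);
    }

    /** Returns the indices of records whose stored checksum does not match. */
    static List<Integer> scan(byte[] log) throws IOException {
        List<Integer> bad = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
        int index = 0;
        while (in.available() > 0) {
            long stored = in.readLong();
            int len = in.readInt();
            byte[] payload = new byte[len];
            in.readFully(payload);
            Adler32 crc = new Adler32();
            crc.update(payload, 0, len);
            if (crc.getValue() != stored) {
                bad.add(index); // note the mismatch and keep reading
            }
            index++;
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        for (String s : new String[] {"alpha", "beta", "gamma"}) {
            writeRecord(out, s.getBytes(StandardCharsets.UTF_8));
        }
        byte[] log = buf.toByteArray();
        // Record 0 is 8 + 4 + 5 = 17 bytes, so record 1's payload starts at 17 + 12.
        log[17 + 12] ^= 0x01; // flip one bit in the second record's payload
        System.out.println("records with bad checksum: " + scan(log));
    }
}
```

Continuing past a bad record only works here because the length field itself is intact. If the length is what got corrupted, everything after that record is misaligned and will look bad too.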

Could there be some unexpected behavior with larger nodes?
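
To make the size question concrete, here is a rough sketch of how a length guard driven by jute.maxbuffer behaves. This is an illustration under assumptions, not ZooKeeper's own code: the class and method names are hypothetical, and the only facts relied on are that jute.maxbuffer is a Java system property and that its documented default is 0xfffff (just under 1 MB).

```java
// Hypothetical helper, not part of ZooKeeper.
class MaxBufferSketch {
    // Documented default for jute.maxbuffer, just under 1 MB.
    static final int DEFAULT_MAX_BUFFER = 0xfffff;

    /** Reads the limit from the system property, falling back to the default. */
    static int maxBuffer() {
        return Integer.getInteger("jute.maxbuffer", DEFAULT_MAX_BUFFER);
    }

    /** True if a payload of the given length would pass the guard. */
    static boolean fits(int payloadLength) {
        return payloadLength >= 0 && payloadLength <= maxBuffer();
    }

    public static void main(String[] args) {
        int fiveMb = 5 * 1024 * 1024;
        System.out.println("limit=" + maxBuffer() + " fits(5MB)=" + fits(fiveMb));
    }
}
```

Running it with `-Djute.maxbuffer=8388608` raises the limit so a 5 MB payload passes. Because the value is read per JVM, any process that reads the data with a smaller limit than the one that wrote it (another server, a client, or an offline tool such as LogFormatter) would apply a different guard, so it is worth checking that the override is set everywhere.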

On Fri, Aug 14, 2015 at 11:40 AM, Flavio Junqueira <
fpjunqueira@yahoo.com.invalid> wrote:

> Note that the zxid is different in each of the lines.
>
> -Flavio
>
> > On 14 Aug 2015, at 19:27, Benjamin Jaton <benjamin.jaton@gmail.com> wrote:
> >
> > I see a lot of those:
> >
> > *8/11/15 11:57:57 AM PDT session 0x14efe8c1ba2000b cxid 0x1ba004 zxid
> > 0x78b5a setData '/path/to/node,(....),60221*
> > immediately followed by
> > *8/11/15 11:57:57 AM PDT session 0x14efe8c1ba2000b cxid 0x1ba00a zxid
> > 0x78b5b error -110*
> >
> > Is -110 a NodeExists exception? A setData should not be in error if the
> > node already exists, no?
> >
> >
> > On Fri, Aug 14, 2015 at 9:50 AM, Benjamin Jaton <benjamin.jaton@gmail.com> wrote:
> >
> >> This is what I have when I run the LogFormatter on the last tlog:
> >>
> >> *Exception in thread "main" java.io.IOException: CRC doesn't match
> >> 1863394799 vs 480060806*
> >>
> >> Do you have any idea on what could cause this to happen?
> >>
> >> On Thu, Aug 13, 2015 at 4:16 PM, Flavio Junqueira <fpj@apache.org> wrote:
> >>
> >>> Hi Benjamin,
> >>>
> >>> Have you tried LogFormatter to check the file? I guess it won't work if
> >>> your file is really corrupt, but it might be worth a shot. Unfortunately,
> >>> I'm not aware of a way around it other than deleting the file and
> >>> losing at least part of the transactions in the log.
> >>>
> >>> -Flavio
> >>>
> >>>> On 13 Aug 2015, at 17:27, Benjamin Jaton <benjamin.jaton@gmail.com> wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> Does anybody know how those "CRC check failed" errors can occur?
> >>>>
> >>>> Apparently the data has been corrupted, what would be the most likely
> >>>> scenario for this to happen?
> >>>>
> >>>> java.io.IOException: CRC check failed
> >>>>     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:612)
> >>>>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:157)
> >>>>     at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
> >>>>     at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:272)
> >>>>     at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:399)
> >>>>     at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:122)
> >>>>     at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:113)
> >>>>     at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> >>>>     at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
> >>>>
> >>>>
> >>>> As a result my ZK can't be started, is it recoverable?
> >>>>
> >>>> Thanks,
> >>>> Ben
> >>>>
> >>>> ---------- Forwarded message ----------
> >>>> From: Apache Jenkins Server <jenkins@builds.apache.org>
> >>>> Date: Sat, Apr 26, 2014 at 11:35 PM
> >>>> Subject: ZooKeeper_branch33_solaris - Build # 867 - Still Failing
> >>>> To: dev@zookeeper.apache.org
> >>>>
> >>>>
> >>>> See https://builds.apache.org/job/ZooKeeper_branch33_solaris/867/
> >>>>
> >>>>
> >>>> ###################################################################################
> >>>> ########################## LAST 60 LINES OF THE CONSOLE ###########################
> >>>> [...truncated 301 lines...]
> >>>>   [junit] java.io.IOException: CRC check failed
> >>>>   [junit]     at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:611)
> >>>>   [junit]     at org.apache.zookeeper.server.CRCTest.testChecksums(CRCTest.java:165)
> >>>>   [junit]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>   [junit]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>>>   [junit]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>   [junit]     at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>   [junit]     at junit.framework.TestCase.runTest(TestCase.java:168)
> >>>>   [junit]     at junit.framework.TestCase.runBare(TestCase.java:134)
> >>>>   [junit]     at junit.framework.TestResult$1.protect(TestResult.java:110)
> >>>>   [junit]     at junit.framework.TestResult.runProtected(TestResult.java:128)
> >>>>   [junit]     at junit.framework.TestResult.run(TestResult.java:113)
> >>>>   [junit]     at junit.framework.TestCase.run(TestCase.java:124)
> >>>>   [junit]     at junit.framework.TestSuite.runTest(TestSuite.java:232)
> >>>>   [junit]     at junit.framework.TestSuite.run(TestSuite.java:227)
> >>>>   [junit]     at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> >>>>   [junit]     at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
> >>>>   [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:421)
> >>>>   [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:912)
> >>>>   [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:766)
> >>>>   [junit] 2014-04-27 06:35:01,089 - INFO  [main:CRCTest@68] - FINISHED testChecksums
> >>>>   [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 54.234 sec
> >>>>   [junit] java.io.FileNotFoundException: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/junitvmwatcher1383325470.properties (No such file or directory)
> >>>>   [junit]     at java.io.FileInputStream.open(Native Method)
> >>>>   [junit]     at java.io.FileInputStream.<init>(FileInputStream.java:120)
> >>>>   [junit]     at java.io.FileReader.<init>(FileReader.java:55)
> >>>>   [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.executeAsForked(JUnitTask.java:1028)
> >>>>   [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.execute(JUnitTask.java:817)
> >>>>   [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.executeOrQueue(JUnitTask.java:1657)
> >>>>   [junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTask.execute(JUnitTask.java:764)
> >>>>   [junit]     at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
> >>>>   [junit]     at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> >>>>   [junit]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>>   [junit]     at java.lang.reflect.Method.invoke(Method.java:597)
> >>>>   [junit]     at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:105)
> >>>>   [junit]     at org.apache.tools.ant.Task.perform(Task.java:348)
> >>>>   [junit]     at org.apache.tools.ant.Target.execute(Target.java:357)
> >>>>   [junit]     at org.apache.tools.ant.Target.performTasks(Target.java:385)
> >>>>   [junit]     at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1329)
> >>>>   [junit]     at org.apache.tools.ant.Project.executeTarget(Project.java:1298)
> >>>>   [junit]     at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
> >>>>   [junit]     at org.apache.tools.ant.Project.executeTargets(Project.java:1181)
> >>>>   [junit]     at org.apache.tools.ant.Main.runBuild(Main.java:698)
> >>>>   [junit]     at org.apache.tools.ant.Main.startAnt(Main.java:199)
> >>>>   [junit]     at org.apache.tools.ant.launch.Launcher.run(Launcher.java:257)
> >>>>   [junit]     at org.apache.tools.ant.launch.Launcher.main(Launcher.java:104)
> >>>>   [junit] Running org.apache.zookeeper.server.DataTreeUnitTest
> >>>>   [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
> >>>>
> >>>> BUILD FAILED
> >>>> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build.xml:814: Process fork failed.
> >>>>
> >>>> Total time: 2 minutes 54 seconds
> >>>> Build step 'Invoke Ant' marked build as failure
> >>>> [locks-and-latches] Releasing all the locks
> >>>> [locks-and-latches] All the locks released
> >>>> Recording test results
> >>>> Email was triggered for: Failure
> >>>> Sending email for trigger: Failure
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> ###################################################################################
> >>>> ############################## FAILED TESTS (if any) ##############################
> >>>> 1 tests failed.
> >>>> FAILED:  org.apache.zookeeper.server.DataTreeUnitTest.unknown
> >>>>
> >>>> Error Message:
> >>>> Forked Java VM exited abnormally. Please note the time in the report does
> >>>> not reflect the time until the VM exit.
> >>>>
> >>>> Stack Trace:
> >>>> junit.framework.AssertionFailedError: Forked Java VM exited abnormally.
> >>>> Please note the time in the report does not reflect the time until the VM
> >>>> exit.
> >>>
> >>>
> >>
>
>
