Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm
Precedence: bulk
Reply-To: <derby-dev@db.apache.org>
Message-ID: <922664468.1242995565669.JavaMail.jira@brutus>
Date: Fri, 22 May 2009 05:32:45 -0700 (PDT)
From: "Kathey Marsden (JIRA)" <jira@apache.org>
To: derby-dev@db.apache.org
Subject: [jira] Updated: (DERBY-4239) corruption on z/OS with storerecovery
 oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
In-Reply-To: <947613656.1242862665627.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/DERBY-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kathey Marsden updated DERBY-4239:
----------------------------------

    Attachment: badlogsizes.txt
                goodlogsizes.txt
                wombat_keeplog_notcorrupt.zip

attached is a database from when we are able to reconnect and run CheckTables successfully (wombat_keeplog_notcorrupt.zip)  and also two text files showing the log file sizes on 4 good (able to reconnect) with checktables and 4 bad (corrupt) databases, created  with the attached reproduction and derby.storage.keepTransactionLog=true
along with badlogsizes.txt and goodlogsizes.txt showing the log file sizes.

I notice the good ones all have 7 log files and the bad ones all have 6, but even between good databases the log sizes vary somewhat.   Why the difference?

Also I looked at first corrupted database that I posted  and found the corruption in the  index TEST1_IDX_INDCOL2.

To map the index to the container number  1073, I doctored up the database so I could connect to it and then ran.
SELECT  C.CONGLOMERATENUMBER, C.CONGLOMERATENAME  FROM SYS.SYSCONGLOMERATES C WHERE CONGLOMERATENUMBER=1073;

I will check other corrupt databases and see if it is corruption of the same index and  see if the reproduction works without the index.


> corruption on z/OS with storerecovery oc_rec? tests.  ERROR XSLA7: Cannot redo operation null in the log.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4239
>                 URL: https://issues.apache.org/jira/browse/DERBY-4239
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.5.1.1
>         Environment: z/OS z10 processor. 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr4-20090219_01(SR4))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160-20090215_29883 (JIT enabled, AOT enabled)
> J9VM - 20090215_029883_bHdSMr
> JIT  - r9_20090213_2028
> GC   - 20090213_AA)
> JCL  - 20090218_01
> also 
> java version "1.6.0"
> Java(TM) SE Runtime Environment (build pmz3160sr2ifix-20081021_01(SR2+IZ32776+IZ33456))
> IBM J9 VM (build 2.4, J2RE 1.6.0 IBM J9 2.4 z/OS s390-31 jvmmz3160ifx-20081010_24288 (JIT enabled, AOT enabled)
> J9VM - 20081009_024288_bHdSMr
> JIT  - r9_20080721_1330ifx2
> GC   - 20080724_AA)
> JCL  - 20080808_02
>            Reporter: Kathey Marsden
>            Priority: Critical
>         Attachments: badlogsizes.txt, derby.log, derby.log, goodlogsizes.txt, reproDerby4239.zip, wombat_keeplog_notcorrupt.zip, wombat_with_keeplog.zip
>
>
> I saw corruption on z/OS with the storerecovery tests and 10.5.1.1.  The failure comes in oc_rec3 trying to connect to the database, but the actual problem seems to have occurred with the prior test oc_rec2.  The problem is somewhat intermittent, happening approximately 1/4 times.  I extracted the case from the harness and will attach the reproduction and run the script repro.ksh.  The script will loop up to 50 times until it gets the failure which looks like.
> ERROR XSLA7: Cannot redo operation null in the log.
> 	at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
> 	at org.apache.derby.impl.store.raw.log.FileLogger.redo(Unknown Source)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.recover(Unknown Source)
> 	at org.apache.derby.impl.store.raw.RawStore.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
> 	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
> 	at org.apache.derby.impl.store.access.RAMAccessManager.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.startModule(Unknown Source)
> 	at org.apache.derby.iapi.services.monitor.Monitor.bootServiceModule(Unknown Source)
> 	at org.apache.derby.impl.db.BasicDatabase.bootStore(Unknown Source)
> 	at org.apache.derby.impl.db.BasicDatabase.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.boot(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.TopService.bootModule(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.startProviderService(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.findProviderAndStartService(Unknown Source)
> 	at org.apache.derby.impl.services.monitor.BaseMonitor.startPersistentService(Unknown Source)
> 	at org.apache.derby.iapi.services.monitor.Monitor.startPersistentService(Unknown Source)
> 	at org.apache.derby.impl.jdbc.EmbedConnection.bootDatabase(Unknown Source)
> 	at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
> 	at org.apache.derby.jdbc.Driver40.getNewEmbedConnection(Unknown Source)
> 	at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
> 	at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
> 	at java.sql.DriverManager.getConnection(DriverManager.java:311)
> 	at java.sql.DriverManager.getConnection(DriverManager.java:268)
> 	at CheckTables.main(CheckTables.java:8)
> Caused by: ERROR XSDBB: Unknown page format at page Page(16,Container(0, 1073)), page dump follows: Hex dump:
> 00000000: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> <snip lots of 000's>
> I ran it with 10.3 and it completed all 50 iterations, so whether JVM or Derby issue it seems new since 10.3. (I haven't tried with 10.4).  Oddly I have run tests many times before on this machine using in the 10.5.1.1 release and the same jvm and have never seen this failure, so am looking into whether maybe something changed on the machine or environment.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.