db-derby-dev mailing list archives

From "Mike Matrigali (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-3607) Invalid checksum error in Derby 10.3.2.1
Date Fri, 13 Jun 2008 17:55:45 GMT

     [ https://issues.apache.org/jira/browse/DERBY-3607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Matrigali updated DERBY-3607:
----------------------------------


Thanks for the info.  Without a repro I have been inspecting the code, and anything that you
can tell me about the app helps direct that effort.  Of course, if you can come up with a
repro that can be run, it is much more likely that someone will be able to find the issue.
If you can't provide a repro, I am going to continue to ask questions about the app in your
environment, and maybe something will come of that.

How big is a zipped copy of the db?  Getting a copy of it may help, as it would be interesting
to look at the state of page 0 in the 2 corrupted tables.  JIRA will allow a zipped attachment
of up to 10 MB.  There may be other places at Apache where we could load a bigger file; I am
just not sure right now.  Depending on when you enabled online backup, the log may include a
complete update history of the db.  Looking at it, or more likely comparing it with a few
other examples of the bug, may show what kinds of things lead to the problem.

My current assumption is that the problem is caused by some I/O interaction similar to DERBY-3347.
Could you describe the concurrency in the simplest case in which you have been able to
reproduce this problem?  Basically, how many threads are involved, and do they run concurrently?
For example, are the startup/shutdown of the 2 dbs done independently on different threads?
While running, how many threads/connections are doing work in the 2 dbs?

Can you describe when and how often you execute the command that "enables" online backup?
It is at this point that Derby copies a number of database files from the original db to your
backup location, and it does have some code that ensures that page 0 is up to date before
the copy.  Is it possible to run your test without online backup ever being called, just to
see if the bug still reproduces?
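
One way to run the same test with and without the backup step is to guard the call with a
switch.  The following is only a sketch: the class name, the `test.skipBackup` property, and
the choice of `SYSCS_UTIL.SYSCS_BACKUP_DATABASE` are assumptions for illustration, so
substitute whichever backup procedure the application actually calls.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

public class BackupToggle {
    // Hypothetical system property; run with -Dtest.skipBackup=true to disable backups.
    static final String SKIP_PROPERTY = "test.skipBackup";

    /** Returns true unless the skip property is set to "true". */
    static boolean backupEnabled() {
        return !Boolean.getBoolean(SKIP_PROPERTY);
    }

    /** Runs an online backup into backupDir, or does nothing when backups are skipped. */
    static void maybeBackup(Connection conn, String backupDir) throws SQLException {
        if (!backupEnabled()) {
            return; // backup never called, so the test exercises only normal I/O
        }
        CallableStatement cs = conn.prepareCall("CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?)");
        try {
            cs.setString(1, backupDir);
            cs.execute();
        } finally {
            cs.close();
        }
    }
}
```

If the checksum error still shows up with the property set, the backup path can be ruled out.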

You mentioned "jvm hook".  Does this mean you are also shutting down the JVM during the test
run?  If so, can you describe how often, i.e., for every shutdown in the derby.log is there
also a JVM shutdown?  I think this is what you mean by your services comment.  Is the
following what is going on:
o You start a service, which starts up a JVM and opens the 2 databases, and some set of work
is somehow done in both dbs (is this work different or the same for each try?).  Then you
stop the service sometime later, which shuts down the JVM, and as part of the JVM hooks your
specific shutdown stuff happens.
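
For reference, a shutdown hook of the kind described above might look like the sketch below.
This is not the reporter's code; the class and method names are made up for illustration.
It relies on two documented Derby behaviours: the `jdbc:derby:;shutdown=true` URL shuts down
the whole Derby system, and a successful shutdown is reported as an SQLException (SQLState
XJ015 for the system, 08006 for a single database).  Logging a timestamp makes it possible to
match each JVM shutdown against the shutdowns recorded in derby.log.

```java
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Date;

public class DerbyShutdownHook {
    /** True for the SQLStates Derby raises when a shutdown succeeded. */
    static boolean isCleanShutdown(SQLException e) {
        String state = e.getSQLState();
        // XJ015: whole Derby system shut down; 08006: a single database shut down.
        return "XJ015".equals(state) || "08006".equals(state);
    }

    /** Registers a JVM shutdown hook that shuts Derby down and logs the outcome. */
    static Thread install() {
        Thread hook = new Thread(new Runnable() {
            public void run() {
                try {
                    DriverManager.getConnection("jdbc:derby:;shutdown=true");
                } catch (SQLException e) {
                    String outcome = isCleanShutdown(e)
                            ? "clean"
                            : "FAILED (" + e.getSQLState() + ")";
                    System.err.println(new Date() + " derby shutdown " + outcome);
                }
            }
        });
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```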

How are clients shut down?  Is it possible that in-progress clients are shut down by "killing"
them somehow?  I am not familiar with Hibernate, so this could be a part of "the session
factories (hibernate level) shutdown"?

Does any of the work done during the tests include deletes, updates to fields included in
any index/constraint, inserts which fail due to duplicate key errors, or DDL after the initial
creation of the database?  This is interesting because it may queue background work to reclaim
deleted space and thus add another point of concurrency to the work being done.

Does each startup/shutdown phase always access the same tables?  I am looking to verify that,
when you get an error on a particular table, the table is guaranteed to have been working in
the previous startup/shutdown test phase.

Another thing I am looking at is the possibility that exhaustion of resources is coming into
play, which would explain why it takes multiple dbs.  Do you set any Derby properties as part
of your application?  If so, could you post all the ones that you change?  Can you estimate
about how many different Derby tables might be accessed during one of the startup/shutdown
phases of your test?  I know this is hard, as it should also include how many indexes may be
referenced.  The 2 resources I am mostly thinking about are the page cache and the open
container cache.  The page cache defaults to 1000 and the open container cache defaults to
100.  Depending on your application (basically concurrent user threads and the background
thread used for post commit and checkpoints) we may have multiple open "channels" on each
container in the open container cache - I am not exactly sure what resource this maps to on
Windows.
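
A quick way to collect the properties asked about above is to dump every JVM system property
whose name starts with "derby." (for example derby.storage.pageCacheSize, which sizes the page
cache).  The class below is a hypothetical helper, not part of Derby, and it only sees
properties set on the JVM; anything set in a derby.properties file or stored in the database
itself would have to be reported separately.

```java
import java.util.Enumeration;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

public class DerbyPropertyDump {
    /** Collects every property named "derby.*" from props, sorted by name. */
    static Map<String, String> derbyProperties(Properties props) {
        Map<String, String> result = new TreeMap<String, String>();
        Enumeration<?> names = props.propertyNames();
        while (names.hasMoreElements()) {
            String name = String.valueOf(names.nextElement());
            String value = props.getProperty(name);
            if (name.startsWith("derby.") && value != null) {
                result.put(name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print the Derby-related system properties of the running JVM.
        for (Map.Entry<String, String> e : derbyProperties(System.getProperties()).entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Calling `DerbyPropertyDump.main` from the application's startup code, after it has set its own
properties, prints the values in effect for that run.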

> Invalid checksum error in Derby 10.3.2.1
> ----------------------------------------
>
>                 Key: DERBY-3607
>                 URL: https://issues.apache.org/jira/browse/DERBY-3607
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.3.2.1, 10.4.1.3
>         Environment: OS-WIN XP SP2, 1.86GHz, 2GB, JVM 1.5, disk caching disabled, Hibernate 3.1.1.RC3, c3p0
>            Reporter: Shahbaz
>            Priority: Critical
>         Attachments: DB_10.4logs.zip, derby.log, derby.log, hibernate.cfg.xml, hibernate.cfg.xml, hibernate.cfg.xml
>
>
> I am getting this exception whenever I try to restart my application
> java.sql.SQLException: Invalid checksum on Page Page(0,Container(0, 2033)), expected=2,731,401,932, on-disk version=2,375,776,513, page dump follows: Hex dump:
> 00000000: 0076 0000 0001 0000 0000 0000 0002 0000  .v..............
> 00000010: 0000 0006 0000 0000 0000 0000 0000 0000  ................
> 00000020: 0000 0000 0001 0000 0000 0000 0000 0000  ................
> 00000030: 0000 0000 0000 0000 0000 0000 ffff ffff  ................
> 00000040: ffff ffff 0000 0000 0000 0000 0000 0000  ................
> 00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 00000060: 0000 0000 0000 0000 0000 0000 5000 0000  ............P...
> at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
> 	at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
> 	at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
> 	at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
> 	at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
> 	at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
> 	at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
> 	at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source)
> 	at org.apache.derby.impl.jdbc.EmbedCallableStatement.executeStatement(Unknown Source)
> 	at org.apache.derby.impl.jdbc.EmbedPreparedStatement.execute(Unknown Source)
> 	at com.mchange.v2.c3p0.impl.NewProxyCallableStatement.execute(NewProxyCallableStatement.java:3044)
> 	at ae.sphere.arena.database.management.backup.BackupStategy.createBackup(BackupStategy.java:56)
> 	at ae.sphere.arena.database.management.backup.BackupStategy.doSchedulerJob(BackupStategy.java:41)
> 	at ae.sphere.arena.common.jobscheduler.Scheduler$1.run(Scheduler.java:49)
> 	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:58)
> 00000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 00000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000000a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
> 000000b0: 0000 0

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

