db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rick Hillegas (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DERBY-2254) Assert during log file switch: log file position exceeded max log file size
Date Wed, 11 May 2011 13:23:47 GMT

     [ https://issues.apache.org/jira/browse/DERBY-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Rick Hillegas updated DERBY-2254:
---------------------------------

    Attachment: derby-2254-01-ab-accountForMoreOverhead.diff

Attaching derby-2254-01-ab-accountForMoreOverhead.diff, a candidate fix for this bug. After
applying this patch, I can't reproduce the bug using the attached repro. I have run the repro
for more than 1500 iterations.

This is my understanding of what is going on here:

A) The original assertion is supposed to check that you are not creating an illegal log instant.
A log instant consists of a log file number and an offset into that file. A log instant is
illegal if the file number or offset are too big. In this particular case, the assertion fails
because the offset is too big.

B) The offset is the actual end of the current log file. That is, the log file has already
spilled past its legal end.

C) The assertion is triggered after the log writer notices that the log file needs to be switched.
The log writer ought to detect this condition and switch the log file BEFORE writing a log
record which spills over the end. Unfortunately, the log writer only notices this condition
AFTER a log record has been written which straddles the legal end of file.

D) We check for whether we should switch by asking whether the remaining bytes we need to
write will spill past the legal end of the file. To compute the number of bytes we expect
to write, we add the length of the log wrapper overhead to the length of the current log record.
However, it turns out that may need to write more bytes than that. The calculation does not
include an extra four bytes needed to write a terminating 0 to the log. The calculation also
does not include the bytes needed to write a checksum record.

E) If we are already very close to the legal end of the log file, then we could spill past
the end after writing the wrapper, the log record, the terminating 0, and the checksum record.
I suppose that we get here if the checkpointing thread can't get in often enough to switch
log files before we encroach on the end of the current file.

F) The fix is to require that the log accomodate that extra overhead (the terminating 0 and
the checksum record). Now we force a log switch if there is not enough room to write the current
log record, its wrapper, the terminating 0, and the checksum record.


I have made 3 changes:

1) The actual change to LogToFile to increase the size of the overhead space we require to
avoid a log switch.

2) All of the places in LogToFile which update the remembered endPostion have been reworked
to call a brief method which sanity checks whether we have spilled past the end of the file
already. I put in this instrumentation in order to find where the extra bytes were being written.
I think it is a useful sanity check but I can remove it if people think that it will slow
down log processing significantly.

3) I added a little more information to the original assertion and its friends in LogCounter.


Touches the following files:

----------

M      java/engine/org/apache/derby/impl/store/raw/log/LogAccessFile.java

Exposed the length of the checksum record so that LogToFile can accurately calculate its overhead.

----------

M      java/engine/org/apache/derby/impl/store/raw/log/LogCounter.java

Items (1) and (2) as described above.

----------

M      java/engine/org/apache/derby/impl/store/raw/log/LogToFile.java

Item (3) above.


> Assert during log file switch: log file position exceeded max log file size
> ---------------------------------------------------------------------------
>
>                 Key: DERBY-2254
>                 URL: https://issues.apache.org/jira/browse/DERBY-2254
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.3.1.4
>         Environment: Solaris 10, Java SE 6 build 104 
>            Reporter: Olav Sandstaa
>            Assignee: Rick Hillegas
>              Labels: derby_triage10_5_2
>         Attachments: d2254.zip, derby-2254-01-ab-accountForMoreOverhead.diff
>
>
> When running simple tpc-b like transactions against a embedded Derby based on a SANE
build of trunk the following assertion occurs for the background thread and all user threads:
>    org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED log file position
exceeded max log file size
> This seems to occur during a switch to a new log file.
> derby.log contains the following call stack for the background thread:
> Exception trace: 
> org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED log file position
exceeded max log file size
> 	at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogCounter.makeLogInstantAsLong(LogCounter.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.switchLogFile(LogToFile.java:1900)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.appendLogRecord(LogToFile.java:3530)
> 	at org.apache.derby.impl.store.raw.log.FileLogger.logAndDo(FileLogger.java:345)
> 	at org.apache.derby.impl.store.raw.xact.Xact.logAndDo(Xact.java:1185)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.checkpointWithTran(LogToFile.java:1540)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.checkpoint(LogToFile.java:1357)
> 	at org.apache.derby.impl.store.raw.RawStore.checkpoint(RawStore.java:439)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.performWork(LogToFile.java:3416)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(BasicDaemon.java:331)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.work(BasicDaemon.java:668)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.run(BasicDaemon.java:394)
> 	at java.lang.Thread.run(Thread.java:619)
> 2007-01-17 23:09:48.638 GMT Thread[derby.rawStoreDaemon,5,derby.daemons] Cleanup action
starting
> org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED log file position
exceeded max log file size
> 	at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogCounter.makeLogInstantAsLong(LogCounter.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.switchLogFile(LogToFile.java:1900)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.appendLogRecord(LogToFile.java:3530)
> 	at org.apache.derby.impl.store.raw.log.FileLogger.logAndDo(FileLogger.java:345)
> 	at org.apache.derby.impl.store.raw.xact.Xact.logAndDo(Xact.java:1185)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.checkpointWithTran(LogToFile.java:1540)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.checkpoint(LogToFile.java:1357)
> 	at org.apache.derby.impl.store.raw.RawStore.checkpoint(RawStore.java:439)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.performWork(LogToFile.java:3416)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.serviceClient(BasicDaemon.java:331)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.work(BasicDaemon.java:668)
> 	at org.apache.derby.impl.services.daemon.BasicDaemon.run(BasicDaemon.java:394)
> 	at java.lang.Thread.run(Thread.java:619)
> Cleanup action completed
> For my user threads the call stack is similar:
> Database Class Loader started - derby.database.classpath=''
> 2007-01-17 23:09:36.401 GMT Thread[Thread-51,5,main] (XID = 12632406), (SESSIONID = 51),
(DATABASE = /export/home/tmp/derby-db), (DRDAID = null), Cleanup action starting
> 2007-01-17 23:09:36.401 GMT Thread[Thread-51,5,main] (XID = 12632406), (SESSIONID = 51),
(DATABASE = /export/home/tmp/derby-db), (DRDAID = null), Failed Statement is: UPDATE accounts
SET abal = abal + ? WHERE aid = ? AND bid = ?
> org.apache.derby.shared.common.sanity.AssertFailure: ASSERT FAILED log file position
exceeded max log file size
> 	at org.apache.derby.shared.common.sanity.SanityManager.ASSERT(SanityManager.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogCounter.makeLogInstantAsLong(LogCounter.java:120)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.switchLogFile(LogToFile.java:1900)
> 	at org.apache.derby.impl.store.raw.log.LogToFile.appendLogRecord(LogToFile.java:3530)
> 	at org.apache.derby.impl.store.raw.log.FileLogger.logAndDo(FileLogger.java:345)
> 	at org.apache.derby.impl.store.raw.xact.Xact.logAndDo(Xact.java:1185)
> 	at org.apache.derby.impl.store.raw.data.LoggableActions.doAction(LoggableActions.java:221)
> 	at org.apache.derby.impl.store.raw.data.LoggableActions.actionUpdate(LoggableActions.java:85)
> 	at org.apache.derby.impl.store.raw.data.StoredPage.doUpdateAtSlot(StoredPage.java:8521)
> 	at org.apache.derby.impl.store.raw.data.BasePage.updateAtSlot(BasePage.java:1108)
> 	at org.apache.derby.impl.store.access.conglomerate.GenericConglomerateController.replace(GenericConglomerateController.java:479)
> 	at org.apache.derby.impl.sql.execute.RowChangerImpl.updateRow(RowChangerImpl.java:523)
> 	at org.apache.derby.impl.sql.execute.UpdateResultSet.collectAffectedRows(UpdateResultSet.java:566)
> 	at org.apache.derby.impl.sql.execute.UpdateResultSet.open(UpdateResultSet.java:260)
> 	at org.apache.derby.impl.sql.GenericPreparedStatement.execute(GenericPreparedStatement.java:358)
> 	at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(EmbedStatement.java:1182)
> 	at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(EmbedPreparedStatement.java:1652)
> 	at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(EmbedPreparedStatement.java:299)
> 	at com.sun.derby.perf.clients.tpcb.DBConnection.performTransaction(DBConnection.java:595)
> 	at com.sun.derby.perf.clients.tpcb.Client.run(Client.java:218)
> After this it seems like no user threads are able to connect to the database (not surpricing).
> This happend using a SANE build and I am unsure by just looking at the assertion in the
code whether this situation would have been fatal also if I had been using an INSANE build.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message