db-derby-dev mailing list archives

From Bryan Pendleton <bpendleton.de...@gmail.com>
Subject Re: Derby received an error "ERROR XSDG0: Page Page(1325564,Container(0, 30832)) could not be read from disk."
Date Fri, 04 Sep 2015 00:33:04 GMT
> ERROR XSDG0: Page Page(1325564,Container(0, 30832)) could not be read from disk.
> Caused by: java.io.EOFException: Reached end of file while attempting to read a whole

Does the derby.log have any more detail about this specific exception?

Note that you can use the system tables (SYSCONGLOMERATES, I believe)
to figure out which table corresponds to conglomerate 30832, and you
can also multiply 1325564 by the pagesize of your table to figure out
what the file size was at the instant that this happened.
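A minimal JDBC sketch of that lookup, assuming you already have an open connection to the database. The class and method names are made up for illustration; the catalog tables and columns (SYS.SYSCONGLOMERATES.CONGLOMERATENUMBER, SYS.SYSTABLES, SYS.SYSSCHEMAS) are the standard Derby ones, but I have not run this against your database:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper: maps a conglomerate number from an error message
// back to the schema, table, and conglomerate (heap or index) it belongs to.
class ConglomerateLookup {
    static final String LOOKUP_SQL =
        "SELECT s.SCHEMANAME, t.TABLENAME, c.CONGLOMERATENAME, c.ISINDEX "
      + "FROM SYS.SYSCONGLOMERATES c "
      + "JOIN SYS.SYSTABLES t ON c.TABLEID = t.TABLEID "
      + "JOIN SYS.SYSSCHEMAS s ON t.SCHEMAID = s.SCHEMAID "
      + "WHERE c.CONGLOMERATENUMBER = ?";

    // Returns e.g. "APP.ORDERS (heap ...)", or null if no row matches.
    static String describe(Connection conn, long conglomerateNumber) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(LOOKUP_SQL)) {
            ps.setLong(1, conglomerateNumber);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null;
                }
                String kind = rs.getBoolean("ISINDEX") ? "index" : "heap";
                return rs.getString("SCHEMANAME") + "." + rs.getString("TABLENAME")
                     + " (" + kind + " " + rs.getString("CONGLOMERATENAME") + ")";
            }
        }
    }
}
```

For the error above you would call describe(conn, 30832L). Knowing whether the conglomerate is an index or the base table matters, since a damaged index can be dropped and rebuilt.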

Assuming your page size was 4096, 1325564 * 4096 is 5,429,510,144, so
that conglomerate should be about 5.4 GB in size.
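The arithmetic can be sketched as below. It assumes pages are laid out back to back, so page N starts at byte N * pageSize, and that the container file follows the usual seg0 naming of "c" plus the conglomerate number in hex plus ".dat". The class and method names are only for illustration, and 4096 is still just my guess at your page size:

```java
// Hypothetical helper for turning the numbers in the XSDG0 message
// into a file name and a byte offset you can compare against the file size.
class PageOffset {
    // Byte offset at which the given page starts.
    static long byteOffset(long pageNumber, int pageSize) {
        return Math.multiplyExact(pageNumber, (long) pageSize);
    }

    // Smallest file size that would let the whole page be read.
    static long minFileSize(long pageNumber, int pageSize) {
        return Math.addExact(byteOffset(pageNumber, pageSize), (long) pageSize);
    }

    // Container file name under seg0, assuming the "c<hex>.dat" convention.
    static String containerFileName(long conglomerateNumber) {
        return "c" + Long.toHexString(conglomerateNumber) + ".dat";
    }

    public static void main(String[] args) {
        System.out.println(containerFileName(30832L));     // c7870.dat
        System.out.println(byteOffset(1325564L, 4096));    // 5429510144
        System.out.println(minFileSize(1325564L, 4096));   // 5429514240
    }
}
```

If the file on disk is shorter than the last number, a read of that page would run off the end of the file.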

> derby the reported errors like:
> org.apache.derby.iapi.error.ShutdownException:

This is normal, I believe.

> java.lang.NullPointerException
>          at org.apache.derby.impl.drda.DRDAConnThread.writePBSD(Unknown Source)
>          at org.apache.derby.impl.drda.DRDAConnThread.processCommands(Unknown Source)
>          at org.apache.derby.impl.drda.DRDAConnThread.run(Unknown Source)

This is scary, but it appears to have happened AFTER the shutdown, and hence
may be some secondary, unrelated bug in the network server code related to
not handling a shutdown correctly. It seems worth investigating separately.

> The system is an Oracle M5000 Enterprise server with what I believe is a 15TB Sun ZFS
> Storage 7320 external ZFS storage array connected by Fibre Channel. This is the first time
> in over 8 years we have seen any I/O error like this.
> What I am trying to confirm is that this is really low-level Derby code, and that if it reports
> a “java.io.EOFException” like it did, it really did have an I/O error somewhere in reading
> the page from the container file. Things like performance, Java heap
> space, etc., can pretty much be ruled out as causing such an error. My gut feeling is
> that maybe something in the connection to this storage array had a hiccup. This setup is
> at the customer site and I cannot directly access system logs, nor do I have
> knowledge of how this storage array works or how to look at such logs, but just having confirmation
> that an I/O error really did occur would help.

This is good information to have.

My feeling is that you should do a more thorough investigation of the
specific conglomerate in question, to check for errors that might
not be showing up using your regular application access patterns.
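One way to do that is Derby's built-in consistency check, SYSCS_UTIL.SYSCS_CHECK_TABLE, which reads the base table and its indexes and raises an error if it finds a problem. A rough JDBC sketch follows; the class and method names are invented for illustration, and I have not tried passing the names as ? parameters on your release, so treat that part as an assumption and fall back to string literals if it is rejected:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical wrapper around Derby's SYSCS_UTIL.SYSCS_CHECK_TABLE function.
class CheckTable {
    // Assumption: the function arguments can be bound as ? parameters.
    static final String CHECK_SQL = "VALUES SYSCS_UTIL.SYSCS_CHECK_TABLE(?, ?)";

    // Returns true if the check reports the table as consistent.
    // An inconsistency normally surfaces as an SQLException instead.
    static boolean isConsistent(Connection conn, String schema, String table)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(CHECK_SQL)) {
            ps.setString(1, schema);   // e.g. "APP"; names are case-sensitive
            ps.setString(2, table);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() && rs.getInt(1) == 1;
            }
        }
    }
}
```

Because the check reads every page of the table, it will touch parts of a 5 GB conglomerate that your normal queries may never visit, so expect it to take a while.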

Also, if you can find any more information in the derby log, it would
be nice to know.

Thanks for sharing the information that you do have; it is quite
interesting to know what your experience is!


P.S. I believe this is the code that threw the java.io.EOFException:

     /**
      * Attempts to fill buf completely from start until it's full.
      * <p/>
      * FileChannel has no readFull() method, so we roll our own.
      * <p/>
      * @param dstBuffer buffer to read into
      * @param srcChannel channel to read from
      * @param position file position from where to read
      * @throws IOException if an I/O error occurs while reading
      * @throws StandardException If thread is interrupted.
      */
     private void readFull(ByteBuffer dstBuffer,
                           FileChannel srcChannel,
                           long position)
             throws IOException, StandardException
     {
         while(dstBuffer.remaining() > 0) {
             if (srcChannel.read(dstBuffer,
                                     position + dstBuffer.position()) == -1) {
                 throw new EOFException(
                     "Reached end of file while attempting to read a "
                     + "whole page.");
             }

             // (**) Sun Java NIO is weird: it can close the channel due to an
             // interrupt without throwing if bytes got transferred. Compensate,
             // so we can clean up.  Bug 6979009,
             // http://bugs.sun.com/view_bug.do?bug_id=6979009
             if (Thread.currentThread().isInterrupted() &&
                     !srcChannel.isOpen()) {
                 throw new ClosedByInterruptException();
             }
         }
     }
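
To show what that loop does when the file is shorter than the page being requested, here is a small standalone sketch. It is not Derby code: it copies only the EOF branch of the loop above, and the class name, helper method, and temp file are invented for the demonstration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Standalone illustration only; mirrors the EOF handling of the loop above.
class ReadFullDemo {
    static void readFull(ByteBuffer dst, FileChannel src, long position)
            throws IOException {
        while (dst.remaining() > 0) {
            // FileChannel.read returns -1 once the position is at or past EOF.
            if (src.read(dst, position + dst.position()) == -1) {
                throw new EOFException(
                    "Reached end of file while attempting to read a whole page.");
            }
        }
    }

    // Writes a file of fileBytes zero bytes, then tries to read one full page.
    // Returns true if the read ran off the end of the file.
    static boolean hitsEof(int fileBytes, int pageSize, long pageNumber)
            throws IOException {
        Path tmp = Files.createTempFile("container", ".dat");
        try {
            Files.write(tmp, new byte[fileBytes]);
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                readFull(ByteBuffer.allocate(pageSize), ch, pageNumber * pageSize);
                return false;
            } catch (EOFException e) {
                return true;
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

With a 4096-byte page, hitsEof(8192, 4096, 1) returns false because page 1 is fully present, while hitsEof(6000, 4096, 1) returns true because the file ends partway through that page. So the EOFException by itself only says that the file, as the JVM saw it at that moment, was shorter than the page Derby expected to find.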
