tomcat-users mailing list archives

From "Martin Gainty" <>
Subject Re: Severe problems
Date Sat, 02 Sep 2006 21:26:36 GMT
Good Afternoon Peter
Sounds as if your indexes got corrupted somehow
Take a look at
for (re)building indexes
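As a minimal sketch of what rebuilding indexes can look like (database and table names here are illustrative, not from your setup; on 7.4, reindexing shared system catalogs may additionally require a standalone backend):

```sql
-- Rebuild all indexes in the named database (run as a superuser):
REINDEX DATABASE yourdb;

-- Or rebuild the indexes of a single table, e.g. a system catalog:
REINDEX TABLE pg_class;
```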
Martin --
This email message and any files transmitted with it contain confidential
information intended only for the person(s) to whom this email message is
addressed.  If you have received this email message in error, please notify
the sender immediately by telephone or email and destroy the original
message without making a copy.  Thank you.

----- Original Message ----- 
From: "Peter Bauer" <>
To: <>; <>
Sent: Saturday, September 02, 2006 3:19 PM
Subject: Severe problems

> Hi all,
> as always, time is short and we have a very critical situation here,
> so please help to save the day once again. I have already posted this
> on a PostgreSQL mailing list, but maybe it is somehow related to
> Tomcat.
> Installation facts:
> - 4 High Availability Clusters consisting of 2 Dual Xeon Server
> machines (Dell PowerEdge 2850) using Heartbeat1. Each server has only
> one harddisk, no RAID configuration is used.
> - Each Cluster has 4 drbd Devices, one for the PostgreSQL data
> - Two of these clusters are using the same PostgreSQL installation to
> share the data, the database can be moved from one cluster to the
> other in case of failure
> - OS: Debian Sarge with postgresql-7.4.7-6sarge1
> - Two cronjobs are configured to perform a "vacuumdb --all --full"
> every 10 minutes and a "vacuumdb --all --full --analyze" every 23
> minutes
> - There are 3 applications using the PostgreSQL installation, each
> with their own database.
> - The main application is based on Tomcat 4.1-30: partly a
> Web-Application, partly a terminal login protocol based on HTTP and
> XML, so the database access is of course done using JDBC
> - A cronjob is configured to perform a pg_dump of the main database
> every 4 hours
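For reference, cronjobs like those described might look roughly like this in /etc/crontab (user, paths, and the database name "maindb" are assumptions). Note that VACUUM FULL takes exclusive table locks, so these jobs can overlap with each other and with application traffic:

```shell
# every 10 minutes: full vacuum of all databases (hypothetical user/paths)
*/10 * * * *  postgres  /usr/bin/vacuumdb --all --full --quiet
# every 23 minutes: full vacuum plus statistics update
*/23 * * * *  postgres  /usr/bin/vacuumdb --all --full --analyze --quiet
# every 4 hours: dump of the main database ("maindb" is an assumption)
0 */4 * * *   postgres  /usr/bin/pg_dump maindb > /var/backups/maindb.sql
```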
> Usage:
> - There are 3 tables which are heavily used, with some hundred inserts
> and deletes per second; the other tables are rarely used by the
> Web-Application. One of these 3 tables is the most heavily used and
> also involves locking mechanisms.
> - There are some pgsql functions which use locking mechanisms (select
> for update)
> - When a user logs in to the Web-Application, about 20 tables are
> created which contain the data used for the Web-Application. The last
> 10 versions of these 20 tables (about 200 tables in total) are kept as
> old versions which can be used to restore previous configuration data
> (a pretty bad solution). These tables are never used after their
> creation; only the oldest version is deleted once there are more than 10
> - Unfortunately we are not using Database connection pooling
> - There are some threads which have a permanent database connection
> and perform some actions either on a regular basis or on demand.
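A pgsql function using row locking as described above might look like the following sketch (function, table, and column names are purely illustrative; the locked row is held until the surrounding transaction commits or rolls back):

```sql
-- Hypothetical example of the "select for update" pattern
-- (7.4-style quoted function body, PL/pgSQL):
CREATE FUNCTION claim_next_job() RETURNS integer AS '
DECLARE
    job_id integer;
BEGIN
    SELECT id INTO job_id
    FROM jobs
    WHERE state = ''pending''
    FOR UPDATE;

    UPDATE jobs SET state = ''taken'' WHERE id = job_id;
    RETURN job_id;
END;
' LANGUAGE plpgsql;
```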
> Problems:
> 1. The first problem we faced was that the pg_dump which is performed
> every 4 hours began to fail about 1-2 times a week because of the
> following error:
> pg_dump: attempt to lock table "tableblabla" failed: ERROR:  catalog
> is missing 5 attribute(s) for relid 44798701
> The table for which the errors occur is always one of these 200 unused
> tables containing previous configuration data. The recovery procedure
> always consists of deleting all entries with relname tableblablabla
> from the pg_class, dumping and restoring the database.
> 2. Happens about once a month: Starting with select and delete queries
> regarding the most heavily used database table, the database stops
> responding to various queries, which leads to an unusable system. If the
> database is in such a state, it is not even possible to connect to the
> database using psql on the commandline. The recovery procedure was to
> just restart the PostgreSQL service.
> 3. There is a table which contains some configuration values which are
> never changed. A thread of the application with a permanent database
> connection reads a value from this table every 10 seconds and then
> performs some actions. Sometimes, the value cannot be read from this
> table, which means that the same SELECT statement executed every 10
> seconds delivers no result, although the data is never modified. The
> table is never subject to locking or anything like that. After some
> queries failed, suddenly the correct value is returned again.
> 4. Once, the following error prevents creation of some tables (postgresql.log):
> 2006-08-30 17:50:39 [11189] ERROR:  23505: duplicate key violates
> unique constraint "pg_class_oid_index"
> followed by
> 2006-08-30 18:00:11 [18296] ERROR:  XX000: catalog is missing 10
> attribute(s) for relid 17586461
> for every run of the vacuum cronjob. These errors occurred together
> with the freeze described in problem 3.
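The "catalog is missing N attribute(s) for relid X" errors suggest that pg_attribute rows have been lost for an existing pg_class entry. A query along these lines (a sketch, not a supported tool) can list affected relations before pg_dump trips over them:

```sql
-- Compare the attribute count recorded in pg_class (relnatts)
-- with the user-column rows actually present in pg_attribute.
SELECT c.oid, c.relname, c.relnatts, count(a.attnum) AS atts_present
FROM pg_class c
LEFT JOIN pg_attribute a
       ON a.attrelid = c.oid AND a.attnum > 0
GROUP BY c.oid, c.relname, c.relnatts
HAVING c.relnatts <> count(a.attnum);
```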
> Are there any known bugs in the versions of PostgreSQL, Tomcat, or
> JDBC we use which could cause such problems? All these problems occur
> independently on both cluster systems, so I think there is no
> hardware error.
> If you have any suggestions or ideas for debugging or what the reason
> for these problems could be, please reply as soon as possible.
> br,
> Peter
> ---------------------------------------------------------------------
> To start a new topic, e-mail:
> To unsubscribe, e-mail:
> For additional commands, e-mail: