To: derby-user@db.apache.org
From: Ture Munter
Subject: Re: Inserting data into a database on a Derby network server
Date: Mon, 8 Oct 2007 14:35:20 +0000 (UTC)
References: <1191836883.19938.22.camel@hall.fysik.dtu.dk>
 <470A109F.9090800@Sun.com>

Kristian Waagan writes:

> Ture Munter wrote:
> > I'm using embedded Derby in a Java SE application, but I run Derby as a
> > network server while building the database, and that is where I ran into
> > problems. I have a dataset with 65000 rows that I wanted to add to a
> > Derby database (total size before the insert: around 10000 records,
> > 30 MB). Each row consists of some strings and some floats (6 fields in
> > total), and the operation requires two inserts and one select. While
> > inserting, I run Derby as a network server; the problem is that no
> > matter what I do, I get OutOfMemory exceptions (Derby runs out of heap
> > space). Setting the maximum heap size of the JVM to 1 GB allowed me to
> > add around 17000 rows, but no more than that.
>
> Hello Ture,

Hello Kristian,

Thanks for the quick response, I'm very impressed :-)

> Is this an out-of-the-box configuration, or have you configured Derby by
> specifying properties? If you have tuned it, have you set the Derby page
> cache size (derby.system.pageCacheSize)? The page size? Is there only one
> connection inserting data?

I use Derby in the out-of-the-box configuration; it worked fine and
performance was good until this problem showed up. I am going to use the
database in read-only mode and in the embedded configuration. I forgot to
mention that I am using Derby 10.3.1.4 and Java SE JDK 1.5.0_12.

> Sounds to me as if you have a program generating the data. Is this
> something you could share with the community?
> That would be great, then people could run it and see if they see the
> same behavior. Also, if there's a bug it will be easier to find if we
> have a running repro.

All the data are read from a text file, parsed, and inserted into the
database. As the data are published scientific data, I can share both my
source code and the data; everything is available at
www.fysik.dtu.dk/~munter/derby/

There are a few comments in the source code that explain what happens,
including the SQL to create the needed tables. Everything should be pretty
self-explanatory, as the program simply reads the text file line by line
and inserts the rows into the database.

To get it to work, I constantly close the connection and reconnect to the
network server; this is done every 300 rows, around line 215 of the source.
If this closing/reopening of the connection is disabled and Derby runs as a
network server (with an empty database), it runs out of memory after 6956
rows (on my machine with 2 GB of memory). With closing/reopening enabled,
it runs out of memory only after 16198 rows (when restarting with the
existing database of 6956 rows).

Maybe it would work better if I used the methods to get the last
auto-generated key - but it shouldn't be necessary to be this careful just
to avoid "breaking" anything :-)

> If that is not possible, can you run with Java SE 6 and tell it to
> dump the heap on OOME? Another possibility is to use jmap to get a
> histogram of the heap contents (add the option to only get live objects
> as well); this would be easy to post.

I haven't tried any of those things yet; maybe I can find time for that
tomorrow, but for now have fun with the program I uploaded :-)

> > The problem was "solved" by running Derby embedded in the program that
> > inserted data into the database and committing only every 50th row.
> > When running Derby embedded I didn't run out of memory.
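In outline, the reconnect-every-300-rows workaround looks like the sketch
below (a simplified version of what my program does; the JDBC URL, table
name and column names here are invented for illustration and are not the
ones from my actual program):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class ReconnectingInserter {

    // Workaround: drop and re-establish the network-server connection
    // after every full batch of this many rows.
    static final int ROWS_PER_CONNECTION = 300;

    // Invented URL for illustration; the real database name differs.
    static final String URL = "jdbc:derby://localhost:1527/refdb";

    // Pure helper so the reconnect boundary is easy to test.
    static boolean shouldReconnect(int rowsInserted) {
        return rowsInserted > 0 && rowsInserted % ROWS_PER_CONNECTION == 0;
    }

    static void insertAll(List<String[]> rows) throws SQLException {
        Connection conn = DriverManager.getConnection(URL);
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO measurements(name, value) VALUES (?, ?)");
        int count = 0;
        for (String[] row : rows) {
            ps.setString(1, row[0]);
            ps.setString(2, row[1]);
            ps.executeUpdate();
            count++;
            if (shouldReconnect(count)) {
                // Close everything and reconnect; without this the
                // network server eventually runs out of heap.
                ps.close();
                conn.close();
                conn = DriverManager.getConnection(URL);
                ps = conn.prepareStatement(
                        "INSERT INTO measurements(name, value) VALUES (?, ?)");
            }
        }
        ps.close();
        conn.close();
    }
}
```

It shouldn't be necessary to recycle connections like this, of course -
that's exactly the point I'm trying to make.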
> > After adding all 65000 records I can still run Derby as a network
> > server and do queries, but there are problems when trying to insert
> > data. There are no problems when running Derby embedded. Has anybody
> > experienced the same problem/features?
> >
> > I also experience a decrease in performance after having inserted many
> > rows, but the slow-down is not as bad as described in the thread
> > "exponential increase in insert time".
>
> Just out of curiosity, are you using Statement.executeUpdate, prepared
> statements or batching when inserting the data?

A little executeUpdate and some PreparedStatements, committing only after
every 50 rows have been inserted.

Thanks,
Ture Munter
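PS: in case it helps, the commit-every-50-rows insert loop is essentially
the sketch below (simplified; the table and column names are invented for
illustration, not the real ones):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchCommitLoader {

    // Commit after every 50th inserted row instead of per-row autocommit.
    static final int ROWS_PER_COMMIT = 50;

    // Pure helper so the commit boundary is easy to test.
    static boolean shouldCommit(int rowsInserted) {
        return rowsInserted > 0 && rowsInserted % ROWS_PER_COMMIT == 0;
    }

    static void load(Connection conn, List<String[]> rows)
            throws SQLException {
        conn.setAutoCommit(false); // we commit manually, in batches
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO measurements(name, value) VALUES (?, ?)");
        try {
            int n = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.executeUpdate();
                n++;
                if (shouldCommit(n)) {
                    conn.commit();
                }
            }
            conn.commit(); // flush the final partial batch
        } finally {
            ps.close();
        }
    }
}
```

Using addBatch()/executeBatch() would probably cut down the round trips to
the network server as well, but the per-row executeUpdate above matches
what my program currently does.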