db-derby-user mailing list archives

From hcadavid <hectorcada...@yahoo.com>
Subject Derby problem: 13GB of space with 200000 records!
Date Thu, 11 Sep 2008 12:39:38 GMT

Dear friends,

I'm using Derby to record word frequencies from a large text corpus with a Java
program. It works nicely with standard statements like "INSERT INTO WORDS
VALUES('"+word+"',1)" (it takes 50MB to store 400000 words), but when I
switched to prepared statements and inner statements (in order to improve
performance) and repeated the process, after a few hours of processing (200MB
of plain text) the database's disk consumption reached an absurd size: 13GB!
I mean, 13GB of disk space to store 400000 words (of standard length) and
their frequencies!! What may be the problem??
The biggest file is seg0\c3c0.dat (13GB); there is no problem with the log files.

Here is how I'm making insertions and updates:

	Connection con = EmbeddedDBMSConnectionBroker.getConnection();
	try {
		PreparedStatement st = con.prepareStatement("INSERT INTO WORDS VALUES(?,1)");
		st.setString(1, word);
		st.executeUpdate();
	} catch (SQLIntegrityConstraintViolationException e) {
		PreparedStatement ps = con.prepareStatement(
			"update words set frequency=((select frequency from words where word=?)+1) where word=?");
		ps.setString(1, word);
		ps.setString(2, word);
		ps.executeUpdate();
	}
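For reference, the same try-insert / on-conflict-update flow can be sketched self-contained. This is a transposition to SQLite (Python stdlib) purely so it runs standalone; it does not reproduce any Derby-specific storage or locking behavior, and the table/column names just mirror the ones in my snippet above.

```python
import sqlite3

# In-memory database standing in for the Derby WORDS table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE words (word TEXT PRIMARY KEY, frequency INTEGER)")

def count_word(word):
    """Insert the word with frequency 1; on a duplicate key, increment instead."""
    try:
        con.execute("INSERT INTO words VALUES (?, 1)", (word,))
    except sqlite3.IntegrityError:
        # Equivalent single-statement increment; the subselect used in the
        # original UPDATE is not needed to add 1 to the existing row.
        con.execute("UPDATE words SET frequency = frequency + 1 WHERE word = ?",
                    (word,))

for w in "to be or not to be".split():
    count_word(w)

print(dict(con.execute("SELECT word, frequency FROM words")))
```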

This method is used concurrently by 100 threads. Please, does anyone know the
cause of this strange Derby behavior?? (Using GBs of disk space just to store
a few words isn't reasonable!)

Thanks in advance

View this message in context: http://www.nabble.com/Derby-problem%3A-13GB-of-space-with-200000-records%21-tp19433858p19433858.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.
