db-derby-user mailing list archives

From Michael Segel <mse...@segel.com>
Subject Re: order by
Date Fri, 03 Mar 2006 13:24:48 GMT
On Friday 03 March 2006 5:04 am, Thomas Vatter wrote:
> I have shutdown derby, dropped the database by removing its folder,
> started derby (no large memory consuming, stays low), recreated
> the database, imported 1300 data and showed them in my application.
> Memory stays low, but the ordering is not ok. This is a mystery for me.


Since you mentioned Linux/Unix, try looking at the data files using od.
od (octal dump) is a great tool for inspecting low-level data.
You can look up the man page on it, but I'd suggest using the -a option so
that each byte is printed as a named ASCII character.

There's definitely a problem with your insert/load code.
You should not be consuming that much memory.

At the start of your load program, when you establish your connection,
you should instantiate a single prepared statement to perform the insert.
Then open the file as an input stream so you can read it one line at a
time (for example with BufferedReader.readLine()).
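
To make the shape of that concrete, here is a minimal sketch of such a loop,
not a definitive implementation. The table name, the column names, the class
name and the parseLine parameter are all made up for illustration; only the
JDBC and java.io calls are real.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.function.Function;

class SingleStatementLoader {
    // Hypothetical table and columns; replace with the real schema.
    static final String INSERT_SQL =
        "INSERT INTO person (last_name, first_name) VALUES (?, ?)";

    /**
     * Reads the input one line at a time and reuses ONE PreparedStatement
     * for every row. Returns the number of rows sent to the database.
     * parseLine is a caller-supplied routine that turns a line into fields.
     */
    static int load(Connection conn, BufferedReader in,
                    Function<String, String[]> parseLine)
            throws SQLException, IOException {
        int rows = 0;
        // The statement is prepared once, outside the loop.
        try (PreparedStatement insert = conn.prepareStatement(INSERT_SQL)) {
            String line;
            // Only the current line is held in memory at any time.
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = parseLine.apply(line);
                // Assumes the field count matches the number of '?' markers.
                for (int i = 0; i < fields.length; i++) {
                    insert.setString(i + 1, fields[i]); // JDBC indexes start at 1
                }
                insert.executeUpdate();
                rows++;
            }
        }
        return rows;
    }
}
```

The point of the sketch is that nothing accumulates across iterations: one
statement, one line buffer, one small array of fields per row.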

Then you'll have to write your own parsing routine, since the split()
method of String doesn't take into consideration whether a comma occurs
within a quoted value. "Smith, John" will split into '"Smith' and
' John"'.
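
A rough sketch of such a routine follows. It assumes comma-separated input
with double-quoted values, and it assumes a doubled quote ("") inside a
quoted value stands for a literal quote; that convention is my assumption,
not something from your data. The class and method names are made up.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedCsvSplitter {
    /** Splits one line on commas, ignoring commas inside double quotes. */
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote -> one literal quote
                    i++;                 // skip the second quote of the pair
                } else {
                    inQuotes = !inQuotes; // opening or closing quote, not kept
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // field boundary
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field has no trailing comma
        return fields;
    }
}
```

With this, split("\"Smith, John\",42") yields the two fields Smith, John
and 42, where String.split(",") would have produced three pieces.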

The other thing that you need to do is to remove any leading/trailing white
space. (This may remove any non-printable garbage too, but I'm not sure. Of
course, when in doubt, write your own...)
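
For what it's worth, String.trim() is documented to strip leading and
trailing characters whose code is at or below U+0020, so it does take ASCII
control characters with it, but it leaves things like a non-breaking space
(U+00A0) alone. A tiny sketch, with a made-up class name:

```java
class FieldCleaner {
    /**
     * Strips leading/trailing characters <= U+0020 (space, tab, CR, LF and
     * other ASCII control characters). Characters above U+0020, such as the
     * non-breaking space U+00A0, are NOT removed. Null is passed through.
     */
    static String clean(String raw) {
        return raw == null ? null : raw.trim();
    }
}
```

If the garbage in the data turns out to be something above U+0020, this
won't catch it and you are back to writing your own.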

I would also suggest that you create your own container class and store each
data element in a vector as you parse the row. (Note: you could just push
the string onto the vector, but depending on the complexity of the load
program, you may want to capture additional metadata like mapping position,
data types, etc.)
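
Something along these lines is what I have in mind; treat it as a sketch.
The class names and the choice of metadata (position plus a java.sql.Types
code) are just illustrative, and I've used ArrayList here where
java.util.Vector would work the same way.

```java
import java.util.ArrayList;
import java.util.List;

/** One parsed data element plus the metadata needed to bind it later. */
class ParsedField {
    final int position; // zero-based column position in the input row
    final String value; // text of the field, already trimmed
    final int sqlType;  // a java.sql.Types constant, e.g. Types.VARCHAR

    ParsedField(int position, String value, int sqlType) {
        this.position = position;
        this.value = value;
        this.sqlType = sqlType;
    }
}

class RowCollector {
    /**
     * Wraps each raw field of one row in a ParsedField.
     * sqlTypes is assumed to hold one type code per column.
     */
    static List<ParsedField> collect(List<String> rawFields, int[] sqlTypes) {
        List<ParsedField> row = new ArrayList<>(rawFields.size());
        for (int i = 0; i < rawFields.size(); i++) {
            row.add(new ParsedField(i, rawFields.get(i).trim(), sqlTypes[i]));
        }
        return row;
    }
}
```

The list only ever holds one row, so it can be discarded (or cleared and
reused) as soon as that row has been inserted.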

The bottom line: a load program should be extremely memory efficient.


