lucene-java-user mailing list archives

From "Bennett, Tony" <>
Subject What kind of System Resources are required to index 625 million row table...???
Date Mon, 15 Aug 2011 18:39:26 GMT
We are examining the possibility of using Lucene to provide Text Search 
capabilities for a 625 million row DB2 table.

The table has 6 fields, all of which must be stored in the Lucene Index.  
The largest column is 229 characters, the others are 8, 12, 30, and 1....
...with an additional column that is an 8 byte integer (i.e. a 'C' long long).

We have written a test app on a development system (AIX 6.1),
and have successfully Indexed 625 million rows...
...which took about 22 hours.

When writing the "search" application... we find a simple version works; however,
if we add a Filter or a "sort" to it... we get an "out of memory" exception.

Before continuing our research, we'd like to find a way to determine 
what system resources are required to run this kind of application...???
In other words, how do we calculate the memory needs...???

Have others created a similar sized Index to run on a single "shared" server...???

Current Environment:

	Lucene Version:  3.2
	Java Version:    J2RE 6.0 IBM J9 2.4 AIX ppc64-64 build jvmap6460-20090215_29883
	                 (i.e. 64 bit Java 6)
	OS:              AIX 6.1
	Platform:        PPC  (IBM P520)
	cores:           2
	Memory:          8 GB
	jvm memory:      -Xms4072m -Xmx4072m

Any guidance would be greatly appreciated.
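
A rough way to approach the "how do we calculate the memory needs" question is a back-of-the-envelope estimate of the per-document structures a sort or filter may hold on the heap. The sketch below is only that, and it rests on assumptions that are not stated in the message: that sorting in Lucene 3.x fills a FieldCache array with one entry per document (8 bytes each for a long field, a 4-byte ordinal each for a string field, with the term strings themselves costing extra), and that a cached filter result costs about one bit per document. The class and method names are hypothetical; only the row count and the -Xmx value come from the message.

```java
public class HeapEstimate {

    // Bytes for an array holding one fixed-width value per document.
    static long perDocArrayBytes(long maxDoc, int bytesPerValue) {
        return maxDoc * bytesPerValue;
    }

    // Bytes for a set holding one bit per document, rounded up to whole bytes.
    static long bitSetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    public static void main(String[] args) {
        long maxDoc = 625000000L;              // rows in the table (from the message)
        long heapBytes = 4072L * 1024 * 1024;  // -Xmx4072m (from the message)

        // Assumption: a sort on the 8-byte integer column needs one long per document.
        long longSort = perDocArrayBytes(maxDoc, 8);
        // Assumption: a sort on a string column needs one int ordinal per document,
        // not counting the unique term strings.
        long stringOrdinals = perDocArrayBytes(maxDoc, 4);
        // Assumption: a cached filter result needs one bit per document.
        long filterBits = bitSetBytes(maxDoc);

        System.out.println("heap limit:            " + heapBytes + " bytes");
        System.out.println("sort on long field:    " + longSort + " bytes");
        System.out.println("string sort ordinals:  " + stringOrdinals + " bytes");
        System.out.println("one cached filter:     " + filterBits + " bytes");
        System.out.println("long sort fits heap:   " + (longSort < heapBytes));
    }
}
```

Under these assumptions, a sort on the 8-byte integer column alone would need about 5 GB, which is more than the 4072 MB heap, while a single bit-set filter would need roughly 78 MB. Actual usage depends on which Filter and Sort are used and on what else is held on the heap, so the numbers are a lower bound to check against measurement rather than a sizing guarantee.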

