lucene-java-user mailing list archives

From "Jeff Liang" <>
Subject RE: best strategy to deal with large index file
Date Sat, 17 Dec 2005 03:34:35 GMT
Thanks for the reply.
I'm indexing emails.  The fields are the common attributes of an email:
subject, content, attachment, message size, date, sender, recipients,
etc.  The index is a few GB.  Is there a good practice for keeping the
index size below a certain level?
When I run a search on the date field that should retrieve a lot of
records, it normally throws the out-of-memory exception.
I will look at MultiSearcher.  Do you think splitting the index by the
date field is a good choice?  I suspect it would take a lot of coding
to create many indexes based on the date field.
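
For what it's worth, the routing part of a date-based split can stay fairly small.  The following is only a rough sketch, assuming one index directory per calendar month; the class name DatePartitions, the "index-YYYY-MM" naming scheme and both method names are made up for illustration and are not part of Lucene.  It only decides which directories a message date or a date-range search maps to.  Opening a searcher per directory and combining them through MultiSearcher is left out.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not a Lucene class): maps message dates to
// per-month index directory names such as "index-2005-12".
final class DatePartitions {
    private DatePartitions() {}

    // Directory name for the month that contains the given message date.
    static String partitionFor(LocalDate date) {
        YearMonth ym = YearMonth.from(date);
        return String.format(Locale.ROOT, "index-%04d-%02d",
                ym.getYear(), ym.getMonthValue());
    }

    // Every partition a search over [from, to] (inclusive) has to consult,
    // oldest first. Only these directories would need to be opened.
    static List<String> partitionsFor(LocalDate from, LocalDate to) {
        if (to.isBefore(from)) {
            throw new IllegalArgumentException("'to' must not be before 'from'");
        }
        List<String> names = new ArrayList<>();
        YearMonth last = YearMonth.from(to);
        for (YearMonth ym = YearMonth.from(from); !ym.isAfter(last); ym = ym.plusMonths(1)) {
            names.add(partitionFor(ym.atDay(1)));
        }
        return names;
    }
}
```

With a layout like this, new mail would only ever be written to the current month's directory, so older directories stay unchanged between backups, and a date-restricted search only touches the months it overlaps.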

	-----Original Message----- 
	From: Dan Funk 
	Sent: Fri 12/16/2005 7:12 PM 
	Subject: Re: best strategy to deal with large index file

	Are there specific queries that cause the out of memory problem?
	Or will any query do it?
	How large is the index?
	MultiSearcher allows you to search over multiple indexes, and is
	supported throughout the API.  How you split your indexes depends
	on what you want to achieve.  There are many here on the list
	developing indexes for large data sets.  Please be a little more
	specific about what you are indexing and how you are searching.
	On 12/16/05, Jeff Liang <> wrote:
	> Hi all,
	> my index file is huge because of large set of data.  when I do
	> search, I get outofmemory exception sometime.  I don't know what's
	> usually causing the outofmemory exception. Is it during the search
	> because of the index file is too big?  or because there are too
	> many hits?  memory exception just happens when I call
	> it's also bad for backup because I can't do incremental backup
	> adding new documents since I only have one big index file.
	> What's the best strategy to deal with large index file?  what's a
	> good way to split the index file?
	> I start jvm with 800MB.
	> thanks,
	> Jeff
