lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aviran Mordo" <amo...@infosciences.com>
Subject RE: Using Lucene in an multiple index/large io scenario
Date Mon, 30 Jun 2003 19:36:00 GMT
You'll probably need to optimize the index more often. This will reduce
the number of files lucene open. Also if you can merge several fields
into one, it will also reduce the number of files.

Aviran

-----Original Message-----
From: tstich@uni-mannheim.de [mailto:tstich@uni-mannheim.de] 
Sent: Monday, June 30, 2003 3:54 PM
To: lucene-user@jakarta.apache.org
Subject: Using Lucene in an multiple index/large io scenario


Hello,
i am ProjectManager from the columba.sourceforge.net java
mailclient-project and we integrated Lucene as the search-backend half a
year ago. It is now working for small scale mailtraffic but with
increasing mailtraffic Lucene throws OutOfMemory and
TooManyFilesOpen-Exceptions. I am now wondering if Lucene is capable of
doing the job for us (like Otis Gospodnetic suggested) and would
appreciate any help and knowledge you can share on this topic.

I think the problem arises from following issues:
- Lucene is designed to create an index once in a while and not to
update an index frequently.
   We need it to add and delete documents very often *and* search the
index eventualy after
   every operation. Has anyone experiences running Lucene in such an
environment or do you
   think it is impossible? 

- Do you have an suggestion on how to use Lucene in such an environment
because it is not 
   very nice code if you have to create a new IndexReader/Writer after
every operation?

- We introduced a RAMIndex that is merged to the FileIndex after N
operations to reduce the
  load and to not merge documents that are removed directly after they
are added (with filters
  on the mailboxes that is happening very often). Any ideas if that was
wise or if there is a
  better solution?

- Does Lucene have problems with many indices in the same virtual
machine? We have an index
   for every mailfolder and get TooManyFilesOpen-Excpetions when having
>10 indices open.
   Maybe we should try to have only a single index that holds all
messages? 

If you like to look at sourcecode, how we implememted all this look at
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/columba/columba/src/mail/
core/org/columba/mail/folder/search/LuceneSearchEngine.java?rev=1.7&cont
ent-type=text/vnd.viewcvs-markup

Its not nice to just give you the plain code and not the relevant
snippets, but these are more general design issues that i think are
better explained in words than in code.

I would really like to see Lucene integrated in Columba, but i had to
learn that it is no easy task, maybe an impossible one. Based on the
responses i willl decide if we continue to work with Lucene or sadly
have to drop it.

Thanks in advance
Timo Stich
tstich@users.sourceforge.net


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message