lucenenet-user mailing list archives

From "Philip Withington"
Subject Limitations of Lucene with large documents
Date Mon, 31 Jul 2006 08:43:59 GMT
Hello All

I am looking for some information on the limitations of Lucene.Net.  I have
been investigating the viability of using Lucene as a search engine on a
collection of large documents.  There are about 30,000 documents and they
can be anything up to 5MB in size (plain text).  Although no errors occurred
while indexing the documents, the generated index did not appear to be
searchable.  I then used the Luke tool to see if I could find out why and
although the index seemed to be browseable, when trying to search I got Java
heap exceptions.

I guess the limits are going to depend a lot on the hardware being used to
host the index but does anyone have any experience or tips on getting Lucene
to work with large documents?  Also, is there any documentation on sensible
limits on what can be achieved with Lucene, or any rules of thumb as to
what you can and can't do?


