lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "m.harig" <>
Subject Re: Read large size index
Date Tue, 30 Jun 2009 12:30:41 GMT

Hi there,

On Tue, Jun 30, 2009 at 12:41 PM, m.harig<> wrote:
> Thanks Simon ,
>          Its working now , thanks a lot , i've a doubt
>       i've got 30,000 pdf files indexed ,  but if i use the code which you
> sent , returns only 200 results , because am setting   TopDocs topDocs =
>,200);  as i said if use Integer.MAX_VALUE , it
> returns
> java heap space error , even i can't use 300 ,
The Integer.MAX_VALUE was my fault. Internally lucene allocates an
array of the size n (,n)) even if your query only
returns 1 document. This causes the OOM. Only get as many results as
you need!

In turn is iterating and loading of all those documents necessary?

no need to iterate all documents , i set,10000) , am
getting the results , 

What is your usecase of lucene where you have to load 30k of
documents? You have to be aware of that if you load 30k docs you need
enough memory for them in you JVM. I have no idea how you index and
what you store in the index but 30k pdf with -Xmx128M is not much :)

is there any way to get the total hits from the index when i search for a
keyword? i mean i set TopDocCollector collector = new
TopDocCollector(10000); , so the results will not exceed more than 10k ,
what am asking is i need to display the total hits from the index , it might
be more than 10k , like google did ,  Results 1 - 10 of about 51,200 , can
you please tell me..


View this message in context:
Sent from the Lucene - Java Users mailing list archive at

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message