lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: Read large size index
Date Tue, 30 Jun 2009 12:37:50 GMT
On Tue, Jun 30, 2009 at 2:30 PM, m.harig<m.harig@gmail.com> wrote:
>
>
>
> Hi there,
>
> On Tue, Jun 30, 2009 at 12:41 PM, m.harig<m.harig@gmail.com> wrote:
>>
>> Thanks Simon ,
>>
>>          Its working now , thanks a lot , i've a doubt
>>
>>       i've got 30,000 pdf files indexed ,  but if i use the code which you
>> sent , returns only 200 results , because am setting   TopDocs topDocs =
>> searcher.search(query,200);  as i said if use Integer.MAX_VALUE , it
>> returns
>> java heap space error , even i can't use 300 ,
> The Integer.MAX_VALUE was my fault. Internally lucene allocates an
> array of the size n (searcher.search(query,n)) even if your query only
> returns 1 document. This causes the OOM. Only get as many results as
> you need!
>
> In turn is iterating and loading of all those documents necessary?
>
> no need to iterate all documents , i set searcher.search(query,10000) , am
> getting the results ,
>
> What is your usecase of lucene where you have to load 30k of
> documents? You have to be aware of that if you load 30k docs you need
> enough memory for them in you JVM. I have no idea how you index and
> what you store in the index but 30k pdf with -Xmx128M is not much :)
>
> is there any way to get the total hits from the index when i search for a
> keyword? i mean i set TopDocCollector collector = new
> TopDocCollector(10000); , so the results will not exceed more than 10k ,
> what am asking is i need to display the total hits from the index , it might
> be more than 10k , like google did ,  Results 1 - 10 of about 51,200 , can
> you please tell me..

TopDoc returned by several Searcher#search methods has a field
TopDocs#totalHits thats the total number of hits for a query!

simon
>
> simon
>
> --
> View this message in context: http://www.nabble.com/Read-large-size-index-tp24251993p24271025.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message