lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: Read large size index
Date Tue, 30 Jun 2009 11:59:35 GMT
Hi there,

On Tue, Jun 30, 2009 at 12:41 PM, m.harig<m.harig@gmail.com> wrote:
>
> Thanks Simon ,
>
>          Its working now , thanks a lot , i've a doubt
>
>       i've got 30,000 pdf files indexed ,  but if i use the code which you
> sent , returns only 200 results , because am setting   TopDocs topDocs =
> searcher.search(query,200);  as i said if use Integer.MAX_VALUE , it returns
> java heap space error , even i can't use 300 ,
The Integer.MAX_VALUE was my fault. Internally lucene allocates an
array of the size n (searcher.search(query,n)) even if your query only
returns 1 document. This causes the OOM. Only get as many results as
you need!
In turn is iterating and loading of all those documents necessary?
What is your usecase of lucene where you have to load 30k of
documents? You have to be aware of that if you load 30k docs you need
enough memory for them in you JVM. I have no idea how you index and
what you store in the index but 30k pdf with -Xmx128M is not much :)

simon
>
>    here is my code
>
>
>               IndexReader open = IndexReader.open(indexDir);
>
>                IndexSearcher searcher = new IndexSearcher(open);
>
>                final String fName = "contents";
>
>                QueryParser parser = new QueryParser(fName, new StopAnalyzer());
>                Query query = parser.parse(qryStr);
>
>                TopDocs topDocs = searcher.search(query,200);
>
>                FieldSelector selector = new FieldSelector() {
>                        public FieldSelectorResult accept(String fieldName)
{
>                                return fieldName == fName ? FieldSelectorResult.LOAD
>                                                : FieldSelectorResult.LAZY_LOAD;
>                        }
>
>
>                };
>
>                final int totalHits = topDocs.totalHits;
>                ScoreDoc[] scoreDocs = topDocs.scoreDocs;
>
>                for (int i = 0; i < totalHits; i++) {
>                        Document doc = searcher.doc(scoreDocs[i].doc, selector);
>
>                        System.out.println(doc.get("title"));
>                        System.out.println(doc.get("path"));
>
>                }
>
>
>
>                searcher.close();
>
>
> ---------------- can you please clear my doubt
> --
> View this message in context: http://www.nabble.com/Read-large-size-index-tp24251993p24269693.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message