lucene-solr-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: how to get all the docIds in the search result?
Date Thu, 23 Jul 2009 16:06:31 GMT
And if I may add another thing - if you are using Solr in this fashion, have a look at your
caches, esp. the document cache. If your "queries" of this type are repeated, you may benefit
from a large cache.  Or, if they are not, you may be able to disable some caches completely.

 Otis
--
Sematext is hiring: http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Erik Hatcher <erik@ehatchersolutions.com>
> To: solr-user@lucene.apache.org
> Sent: Thursday, July 23, 2009 11:15:45 AM
> Subject: Re: how to get all the docIds in the search result?
> 
> Rather than trying to get all document ids in one call to Solr, consider paging 
> through the results.  Set rows=1000 or probably larger, then check numFound 
> and continue making requests to Solr, incrementing the start parameter accordingly, 
> until done.
> 
>     Erik
> 
> On Jul 23, 2009, at 5:35 AM, shb wrote:
> 
> > I have tried the following code:
> > query.setRows(Integer.MAX_VALUE);
> > query.setFields("id");
> > 
> > When it returns 1,000,000 records, it takes about 22s.
> > This is very slow. Is there any other way?
> > 
> > 
> > 2009/7/23 Toby Cole 
> > 
> >> Have you tried limiting the fields that you're requesting to just the ID?
> >> Something along the lines of:
> >> 
> >> query.setRows(Integer.MAX_VALUE);
> >> query.setFields("id");
> >> 
> >> Might speed the query up a little.
> >> 
> >> 
> >> On 23 Jul 2009, at 09:11, shb wrote:
> >> 
> >>> Here id is indeed the uniqueKey of a document.
> >>> I want to get all the ids for some other usage.
> >>> 
> >>> 
> >>> 2009/7/23 Shalin Shekhar Mangar 
> >>> 
> >>> On Thu, Jul 23, 2009 at 1:09 PM, shb wrote:
> >>>> 
> >>>>> If I use query.setRows(Integer.MAX_VALUE),
> >>>>> the query will become very slow, because the searcher will go
> >>>>> to fetch the field value in the index for all the returned
> >>>>> documents.
> >>>>> 
> >>>>> So if I set query.setRows(10), is there any other way to
> >>>>> get all the ids? Thanks
> >>>>> 
> >>>>> 
> >>>> You should fetch as many rows as you need and not more. Why do you need
> >>>> all the ids? I'm assuming that by id you mean the uniqueKey of a document.
> >>>> 
> >>>> --
> >>>> Regards,
> >>>> Shalin Shekhar Mangar.
> >>>> 
> >>>> 
> >> --
> >> 
> >> Toby Cole
> >> Software Engineer, Semantico Limited
> >> 
> >> Registered in England and Wales no. 03841410, VAT no. GB-744614334.
> >> Registered office Lees House, 21-23 Dyke Road, Brighton BN1 3FE, UK.
> >> 
> >> Check out all our latest news and thinking on the Discovery blog
> >> http://blogs.semantico.com/discovery-blog/
> >> 
> >> 
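
The paging approach Erik describes in the quoted message above (fixed rows, check numFound, advance start until done) can be sketched roughly as follows. This is only a minimal illustration, not code from the thread: PageSource, Page, PagedIdFetcher and fakeIndex are hypothetical names, and the in-memory fake stands in for a real Solr request. With SolrJ, fetch() would presumably wrap a query that uses setStart, setRows and setFields("id") and reads numFound from the response.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical result of one request: the total hit count plus one page of ids.
final class Page {
    final long numFound;     // total number of matching documents
    final List<String> ids;  // ids in this page only

    Page(long numFound, List<String> ids) {
        this.numFound = numFound;
        this.ids = ids;
    }
}

// Hypothetical stand-in for a single request to Solr with start/rows set.
interface PageSource {
    Page fetch(int start, int rows);
}

class PagedIdFetcher {

    // Collects all ids by requesting `rows` documents at a time and
    // advancing `start` until numFound documents have been seen.
    static List<String> fetchAllIds(PageSource source, int rows) {
        List<String> all = new ArrayList<>();
        int start = 0;
        long numFound;
        do {
            Page page = source.fetch(start, rows);
            numFound = page.numFound;
            if (page.ids.isEmpty()) {
                break; // nothing more to read; avoids looping forever
            }
            all.addAll(page.ids);
            start += page.ids.size();
        } while (start < numFound);
        return all;
    }

    // In-memory fake with ids "doc-0" .. "doc-(totalDocs-1)", for illustration only.
    static PageSource fakeIndex(int totalDocs) {
        return (start, rows) -> {
            List<String> ids = new ArrayList<>();
            int end = Math.min(totalDocs, start + rows);
            for (int i = start; i < end; i++) {
                ids.add("doc-" + i);
            }
            return new Page(totalDocs, ids);
        };
    }

    public static void main(String[] args) {
        List<String> ids = fetchAllIds(fakeIndex(2500), 1000);
        System.out.println("fetched " + ids.size() + " ids");
    }
}
```

Each request then only has to materialize one page of ids, which is the point of the suggestion; whether the total time improves over a single rows=Integer.MAX_VALUE request depends on the index and would need to be measured.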

