lucene-java-user mailing list archives

From Patrick Diviacco <patrick.divia...@gmail.com>
Subject Re: how to get all documents in the results ?
Date Wed, 23 Mar 2011 07:44:49 GMT
The issue with


My confusion about MatchAllDocsQuery is that I cannot specify which terms in
which fields to search with it. I'm probably wrong.

I currently have a BooleanQuery that I use to build the query with several
fields and several terms.

Can I just pass a MatchAllDocsQuery to the BooleanQuery.add method in order to
add all the remaining docs?
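
To make the question concrete, here is how I picture it, as a toy model in
plain Java rather than real Lucene code. The class and method names, the doc
ids and the scores are all made up, and the assumption that the scores of
matching SHOULD clauses are simply added up is mine (I'm ignoring coord and
norm factors):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Lucene code. It assumes a document's score is the plain
// sum of the scores of the SHOULD clauses it matches.
class ShouldClauseModel {

    // Adds one SHOULD clause: every doc id it matches gets that clause's score
    // added to its running total; docs it does not match are left untouched.
    static void addShouldClause(Map<Integer, Float> totals, Map<Integer, Float> clauseScores) {
        for (Map.Entry<Integer, Float> e : clauseScores.entrySet()) {
            totals.merge(e.getKey(), e.getValue(), Float::sum);
        }
    }

    // Models a match-all clause: every doc id in [0, numDocs) matches with the
    // same constant score.
    static Map<Integer, Float> matchAll(int numDocs, float constantScore) {
        Map<Integer, Float> scores = new HashMap<>();
        for (int doc = 0; doc < numDocs; doc++) {
            scores.put(doc, constantScore);
        }
        return scores;
    }

    // Returns doc ids by descending total score, ties broken by ascending doc id.
    static List<Integer> rank(Map<Integer, Float> totals) {
        List<Integer> ids = new ArrayList<>(totals.keySet());
        Collections.sort(ids, (a, b) -> {
            int byScore = Float.compare(totals.get(b), totals.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b);
        });
        return ids;
    }

    public static void main(String[] args) {
        int numDocs = 5;

        // Stand-in for my own term clauses: only docs 1 and 3 are relevant.
        Map<Integer, Float> termClause = new HashMap<>();
        termClause.put(1, 0.5f);
        termClause.put(3, 2.0f);

        Map<Integer, Float> totals = new HashMap<>();
        addShouldClause(totals, termClause);
        addShouldClause(totals, matchAll(numDocs, 1.0f));

        System.out.println(rank(totals)); // prints [3, 1, 0, 2, 4]
    }
}
```

If that picture is roughly right, the docs matching my own clauses would come
first and all the remaining docs would follow with just the constant score.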

thanks




On 22 March 2011 12:42, Anshum <anshumg@gmail.com> wrote:

> MatchAllDocsQuery is not restricted to a single field; it matches every
> document regardless of field, i.e. it behaves like a *:* query.
>
> *1. TopDocs approach:*
> *****Snip****
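> // "is" is presumably an IndexSearcher and "ir" an IndexReader on the same index.
> Query query = new MatchAllDocsQuery();
> // Ask for as many hits as there are documents in the index.
> TopDocs td = is.search(query, ir.numDocs());
> ScoreDoc[] scoreDocs = td.scoreDocs;
> for (ScoreDoc scoreDoc : scoreDocs) {
>
>     ... Your code...
>
> }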
> ****/Snip***
>
> *2. Collector approach:*
>
> Query query = new MatchAllDocsQuery();
> int hitsPerPage = ir.numDocs();
> // Second argument: whether docs will be scored in doc-id order.
> TopScoreDocCollector collector =
>     TopScoreDocCollector.create(hitsPerPage, true);
> is.search(query, collector);
> ScoreDoc[] scoreDocs = collector.topDocs().scoreDocs;
> for (ScoreDoc scoreDoc : scoreDocs) {
>
>     .. Your code..
>
> }
>
> The advantage of a collector-based approach is that you can do a lot in the
> collect step, i.e. with your own Collector, anything you would do while
> iterating after the search can be done while collecting instead.
>
> Also, could you tell me exactly what you are trying to achieve? There may
> be a completely different option that you haven't come across, which someone
> could advise on if they knew the exact intent.
>
> Hope this helps.
>
> --
> Anshum Gupta
> http://ai-cafe.blogspot.com
>
>
> On Tue, Mar 22, 2011 at 4:59 PM, Patrick Diviacco <
> patrick.diviacco@gmail.com> wrote:
>
> > 1. "all" docs
> >
> > 2. because MatchAllDocsQuery only considers one field at a time. I'm
> > searching over multiple fields instead.
> >
> > 3. could you tell me more about this ? It might be a solution!
> >
> >
> >
> > On 22 March 2011 12:18, Anshum <anshumg@gmail.com> wrote:
> >
> > > so a few things
> > > 1. are you looking to get 'all' documents or only docs matching your
> > > query?
> > > 2. if it's about fetching all docs, why not use the MatchAllDocsQuery?
> > > 3. did you try using a collector instead of TopDocs?
> > >
> > > --
> > > Anshum Gupta
> > > http://ai-cafe.blogspot.com
> > >
> > >
> > > On Tue, Mar 22, 2011 at 4:46 PM, Patrick Diviacco <
> > > patrick.diviacco@gmail.com> wrote:
> > >
> > > > I don't think the link you suggested can help, but maybe I'm wrong.
> > > >
> > > > > Also, the parameter MAX_HITS is not useful: it just limits the
> > > > > results, it doesn't add the non-relevant docs.
> > > >
> > > >
> > > >
> > > > On 22 March 2011 12:10, Anshum <anshumg@gmail.com> wrote:
> > > >
> > > > > Hi Patrick,
> > > > > > You may have a look at this; perhaps it will help. Let me know
> > > > > > if you're still stuck.
> > > > > >
> > > > > > http://stackoverflow.com/questions/3300265/lucene-3-iterating-over-all-hits
> > > > >
> > > > >
> > > > > --
> > > > > Anshum Gupta
> > > > > http://ai-cafe.blogspot.com
> > > > >
> > > > >
> > > > > On Tue, Mar 22, 2011 at 4:10 PM, <karl.wright@nokia.com> wrote:
> > > > >
> > > > > > > Not sure what your use case actually is, but it sounds like you
> > > > > > > may be unclear about how Lucene works.
> > > > > > >
> > > > > > > Each query clause you have will produce an iterator that walks
> > > > > > > over the documents that match that clause.  All the documents
> > > > > > > from the entire root query get scored.  The scoring evaluation
> > > > > > > per document is also related to the form of your query
> > > > > > > expression hierarchy.
> > > > > > >
> > > > > > > So, MatchAllDocsQuery is exactly what you want if you want a
> > > > > > > document iterator that includes all documents in the index.
> > > > > > > You can change how this is scored by extending
> > > > > > > MatchAllDocsQuery and writing a custom scorer.
> > > > > >
> > > > > > Karl
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: ext Patrick Diviacco [mailto:patrick.diviacco@gmail.com]
> > > > > > Sent: Tuesday, March 22, 2011 4:23 AM
> > > > > > To: java-user@lucene.apache.org
> > > > > > Subject: how to get all documents in the results ?
> > > > > >
> > > > > > > I'm using the following code because I want to see the entire
> > > > > > > collection in my query results:
> > > > > >
> > > > > > //adding wildcards-term to see all results
> > > > > > rest = new TermQuery(new Term("*","*"));
> > > > > > booleanQuery.add(rest, BooleanClause.Occur.SHOULD);
> > > > > >
> > > > > > > But it doesn't work: I only see the relevant docs and not all
> > > > > > > the other ones.
> > > > > > > How can I get all documents ordered by relevance instead?
> > > > > > >
> > > > > > > P.S. MatchAllDocsQuery is not a solution because I need to
> > > > > > > specify my own custom query.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
