lucene-java-user mailing list archives

From Anshum <ansh...@gmail.com>
Subject Re: how to get all documents in the results ?
Date Wed, 23 Mar 2011 08:01:51 GMT
Hi Patrick,
You *don't* need to add a MatchAllDocsQuery to anything. If you just want
all docs, pass it on its own to the searcher.search method and you'd get all
results.
MatchAllDocsQuery is a Query just like BooleanQuery, except that it matches
all docs in the index, so you don't need to specify any fields or terms.
The below would work and get you all the docs in the index as the result
(provided the limit you pass for the number of hits is at least numDocs):

Query query = new MatchAllDocsQuery();
searcher.search(query.....);
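
If the aim is instead the one from the original question further down (keep
the custom BooleanQuery for relevance, but still get every document back),
one option that seems to fit is to add the match-all query as one more
optional clause, i.e.
booleanQuery.add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD).
The sketch below is NOT Lucene code. It is a plain-Java toy model with
made-up names (ShouldClauseDemo, search, run) and a made-up score (the number
of matching clauses), and it only illustrates why a match-all clause brings
in the remaining docs while a clause on the literal term "*" adds nothing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of a boolean query made only of SHOULD clauses. Not Lucene code. */
final class ShouldClauseDemo {

    // Hypothetical three-document "index".
    static final List<String> DOCS = Arrays.asList(
            "lucene in action", "solr cookbook", "lucene scoring notes");

    /**
     * Toy score: the number of optional clauses a document matches.
     * A document matching no clause is left out of the result, which is
     * why non-relevant documents are missing without a match-all clause.
     */
    static Map<String, Integer> search(List<String> docs, List<Predicate<String>> clauses) {
        Map<String, Integer> scores = new LinkedHashMap<>();
        for (String doc : docs) {
            int score = 0;
            for (Predicate<String> clause : clauses) {
                if (clause.test(doc)) {
                    score++;
                }
            }
            if (score > 0) {
                scores.put(doc, score);
            }
        }
        return scores;
    }

    /** Runs the toy query and returns the matching documents, best score first. */
    static List<String> run(boolean withMatchAll) {
        List<Predicate<String>> clauses = new ArrayList<>();
        clauses.add(doc -> doc.contains("lucene"));   // stands in for one term clause
        clauses.add(doc -> doc.contains("scoring"));  // stands in for another term clause
        clauses.add(doc -> doc.contains("*"));        // literal "*" term: matches nothing here
        if (withMatchAll) {
            clauses.add(doc -> true);                 // stands in for MatchAllDocsQuery
        }
        Map<String, Integer> scores = search(DOCS, clauses);
        List<String> ranked = new ArrayList<>(scores.keySet());
        // Stable sort by descending score: ties keep index order.
        ranked.sort((a, b) -> Integer.compare(scores.get(b), scores.get(a)));
        return ranked;
    }

    public static void main(String[] args) {
        System.out.println("without match-all: " + run(false));
        System.out.println("with match-all:    " + run(true));
    }
}
```

With the match-all clause the toy run returns all three documents, the two
"lucene" ones first. Without it, "solr cookbook" is dropped. Real Lucene
scores come from its Similarity, so the actual numbers will differ. Only the
ordering idea carries over.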

Hope this clarifies your doubt.

--
Anshum Gupta
http://ai-cafe.blogspot.com


On Wed, Mar 23, 2011 at 1:14 PM, Patrick Diviacco <
patrick.diviacco@gmail.com> wrote:

> The issue with
>
>
> My confusion about MatchAllDocsQuery is that I cannot specify which terms in
> which fields to search with it. I'm probably wrong.
>
> I currently have a BooleanQuery, that I use to build the query with several
> fields and several terms.
>
> Can I just pass MatchAllDocsQuery to BooleanQuery.add method in order to add
> all remaining docs ?
>
> thanks
>
>
>
>
> On 22 March 2011 12:42, Anshum <anshumg@gmail.com> wrote:
>
> > MatchAllDocsQuery is not restricted to a single field; it matches every
> > document regardless of its fields, i.e. it behaves like a *:* query.
> >
> > *1.*
> > ****Snip****
> > Query query = new MatchAllDocsQuery();
> > TopDocs td = is.search(query, ir.numDocs());
> > ScoreDoc[] scoreDocs = td.scoreDocs;
> > for (ScoreDoc scoreDoc : scoreDocs) {
> >
> > ... Your code...
> >
> > }
> > ****/Snip****
> >
> > *2. Collector approach:*
> >
> > Query query = new MatchAllDocsQuery();
> > int hitsPerPage = ir.numDocs();
> > TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
> > is.search(query, collector);
> > ScoreDoc[] scoreDocs = collector.topDocs().scoreDocs;
> > for(ScoreDoc scoreDoc:scoreDocs){
> >
> > .. Your code..
> >
> > }
> >
> > The advantage of a collector-based approach is that you can do a lot in
> > the collect step, i.e. everything that you'd do in the iteration after
> > the search could also be done while collecting.
> >
> > Also, could you tell me exactly what you are trying to achieve? There
> > may be a completely different option that you haven't come across, which
> > someone could advise if they knew the exact intent.
> >
> > Hope this helps.
> >
> > --
> > Anshum Gupta
> > http://ai-cafe.blogspot.com
> >
> >
> > On Tue, Mar 22, 2011 at 4:59 PM, Patrick Diviacco <
> > patrick.diviacco@gmail.com> wrote:
> >
> > > 1. "all" docs
> > >
> > > 2. because MatchAllDocsQuery only considers one field at a time. I'm
> > > searching over multiple fields instead.
> > >
> > > 3. could you tell me more about this ? It might be a solution!
> > >
> > >
> > >
> > > On 22 March 2011 12:18, Anshum <anshumg@gmail.com> wrote:
> > >
> > > > so a few things
> > > > 1. are you looking to get 'all' documents or only docs matching your query?
> > > > 2. if it's about fetching all docs, why not use the MatchAllDocsQuery?
> > > > 3. did you try using a collector instead of TopDocs?
> > > >
> > > > --
> > > > Anshum Gupta
> > > > http://ai-cafe.blogspot.com
> > > >
> > > >
> > > > On Tue, Mar 22, 2011 at 4:46 PM, Patrick Diviacco <
> > > > patrick.diviacco@gmail.com> wrote:
> > > >
> > > > > I don't think the link you suggested can help, but maybe I'm wrong.
> > > > >
> > > > > Also, the parameter MAX_HITS is not useful, it just limits the results,
> > > > > it doesn't add the non-relevant docs.
> > > > >
> > > > >
> > > > >
> > > > > On 22 March 2011 12:10, Anshum <anshumg@gmail.com> wrote:
> > > > >
> > > > > > Hi Patrick,
> > > > > > You may have a look at this, perhaps it will help you. Let me know
> > > > > > if you're still stuck.
> > > > > >
> > > > > > http://stackoverflow.com/questions/3300265/lucene-3-iterating-over-all-hits
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Anshum Gupta
> > > > > > http://ai-cafe.blogspot.com
> > > > > >
> > > > > >
> > > > > > On Tue, Mar 22, 2011 at 4:10 PM, <karl.wright@nokia.com> wrote:
> > > > > >
> > > > > > > Not sure what your use case actually is, but it sounds like you may be
> > > > > > > unclear how Lucene works.
> > > > > > >
> > > > > > > Each query clause you have will produce an iterator that walks over the
> > > > > > > documents that match that clause.  All the documents from the entire, root
> > > > > > > query get scored.  The scoring evaluation per document is also related to
> > > > > > > the form of your query expression hierarchy.
> > > > > > >
> > > > > > > So, MatchAllDocsQuery is exactly what you want if you want a document
> > > > > > > iterator that includes all documents in the index.  You can change how this
> > > > > > > is scored by extending MatchAllDocsQuery and writing a custom scorer.
> > > > > > > Karl
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: ext Patrick Diviacco [mailto:patrick.diviacco@gmail.com]
> > > > > > > Sent: Tuesday, March 22, 2011 4:23 AM
> > > > > > > To: java-user@lucene.apache.org
> > > > > > > Subject: how to get all documents in the results ?
> > > > > > >
> > > > > > > I'm using the following code because I want to see the entire collection in
> > > > > > > my query results:
> > > > > > >
> > > > > > > //adding wildcards-term to see all results
> > > > > > > rest = new TermQuery(new Term("*","*"));
> > > > > > > booleanQuery.add(rest, BooleanClause.Occur.SHOULD);
> > > > > > >
> > > > > > > But it doesn't work, I only see the relevant docs and not all the other
> > > > > > > ones.
> > > > > > > How can I get all documents ordered by relevance instead ?
> > > > > > >
> > > > > > > ps. MatchAllDocsQuery is not a solution because I need to specify my own
> > > > > > > custom query.
> > > > > > >
> > > > > > >
