Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Reply-To: java-user@lucene.apache.org
References: <0C2ADA45C80B224FAFA38F5DEE16A16E06048B@008-AM1MPN1-037.mgdnok.nokia.com>
Date: Wed, 23 Mar 2011 08:44:49 +0100
Subject: Re: how to get all documents in the results ?
From: Patrick Diviacco
To: java-user@lucene.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=00151748e0a8159ba8049f2189c2

--00151748e0a8159ba8049f2189c2
Content-Type: text/plain; charset=ISO-8859-1

My confusion about MatchAllDocsQuery is that I cannot specify which terms in
which fields to search with it. I'm probably wrong.

I currently have a BooleanQuery that I use to build the query with several
fields and several terms. Can I just pass a MatchAllDocsQuery to the
BooleanQuery.add method in order to add all the remaining docs?

thanks

On 22 March 2011 12:42, Anshum wrote:

> MatchAllDocs does not consider only a single field but all fields i.e. it
> takes a *:* query.
>
> *1. *
> *****Snip****
> Query query = new MatchAllDocsQuery();
> TopDocs td = is.search(query, ir.numDocs());
> ScoreDoc[] scoreDocs = td.scoreDocs;
> for(ScoreDoc scoreDoc:scoreDocs){
>
> ... Your code...
>
> }
> ****/Snip***
>
> *2. Collector approach:*
>
> Query query = new MatchAllDocsQuery();
> int hitsPerPage = ir.numDocs();
> TopScoreDocCollector collector = TopScoreDocCollector.create(hitsPerPage, true);
> is.search(query, collector);
> ScoreDoc[] scoreDocs = collector.topDocs().scoreDocs;
> for(ScoreDoc scoreDoc:scoreDocs){
>
> .. Your code..
>
> }
>
> The better part about using a collector based approach is that you could do
> a lot in the collect step i.e.
> everything that you'd do in the iteration
> post search could also be done while collecting here.
>
> Also, could you also tell me the exact case as to what it is that you are
> trying to achieve. You may have a completely different option that you
> haven't read which someone could advise if they know the exact intent.
>
> Hope this helps.
>
> --
> Anshum Gupta
> http://ai-cafe.blogspot.com
>
>
> On Tue, Mar 22, 2011 at 4:59 PM, Patrick Diviacco <
> patrick.diviacco@gmail.com> wrote:
>
> > 1. "all" docs
> >
> > 2. because matchalldocs only considers one field at once. I'm searching
> > over multiple fields instead.
> >
> > 3. could you tell me more about this ? It might be a solution!
> >
> >
> > On 22 March 2011 12:18, Anshum wrote:
> >
> > > so a few things
> > > 1. are you looking to get 'all' documents or only docs matching your
> > > query?
> > > 2. if it's about fetching all docs, why not use the matchalldocs query?
> > > 3. did you try using a collector instead of topdocs?
> > >
> > > --
> > > Anshum Gupta
> > > http://ai-cafe.blogspot.com
> > >
> > >
> > > On Tue, Mar 22, 2011 at 4:46 PM, Patrick Diviacco <
> > > patrick.diviacco@gmail.com> wrote:
> > >
> > > > I don't think the link you suggested can help, but maybe I'm wrong.
> > > >
> > > > Also, the parameter MAX_HITS is not useful, it just limits the
> > > > results, it doesn't add the not relevant docs.
> > > >
> > > >
> > > > On 22 March 2011 12:10, Anshum wrote:
> > > >
> > > > > Hi Patrick,
> > > > > You may have a look at this, perhaps this will help you with it.
> > > > > Let me know
> > > > > if you're still stuck up.
> > > > >
> > > > > http://stackoverflow.com/questions/3300265/lucene-3-iterating-over-all-hits
> > > > >
> > > > >
> > > > > --
> > > > > Anshum Gupta
> > > > > http://ai-cafe.blogspot.com
> > > > >
> > > > >
> > > > > On Tue, Mar 22, 2011 at 4:10 PM, wrote:
> > > > >
> > > > > > Not sure what your use case actually is, but it sounds like you may
> > > > > > be unclear how Lucene works.
> > > > > >
> > > > > > Each query clause you have will produce an iterator that walks over
> > > > > > the documents that match that clause. All the documents from the
> > > > > > entire, root query get scored. The scoring evaluation per document
> > > > > > is also related to the form of your query expression hierarchy.
> > > > > >
> > > > > > So, MatchAllDocsQuery is exactly what you want if you want a
> > > > > > document iterator that includes all documents in the index. You can
> > > > > > change how this is scored by extending MatchAllDocsQuery and writing
> > > > > > a custom scorer.
> > > > > >
> > > > > > Karl
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: ext Patrick Diviacco [mailto:patrick.diviacco@gmail.com]
> > > > > > Sent: Tuesday, March 22, 2011 4:23 AM
> > > > > > To: java-user@lucene.apache.org
> > > > > > Subject: how to get all documents in the results ?
> > > > > >
> > > > > > I'm using the following code because I want to see the entire
> > > > > > collection in my query results:
> > > > > >
> > > > > > //adding wildcards-term to see all results
> > > > > > rest = new TermQuery(new Term("*","*"));
> > > > > > booleanQuery.add(rest, BooleanClause.Occur.SHOULD);
> > > > > >
> > > > > > But it doesn't work, I only see the relevant docs and not all the
> > > > > > other ones.
> > > > > > How can I get all documents ordered by relevance instead ?
> > > > > >
> > > > > > ps.
> > > > > > MatchAllDocsQuery is not a solution because I need to specify my own
> > > > > > custom query.
> > > > > >
> > > > > > ---------------------------------------------------------------------
> > > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > > > >

--00151748e0a8159ba8049f2189c2--
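
The idea discussed in this thread (add a MatchAllDocsQuery as a SHOULD clause next to the existing term clauses, so that every document is returned and the relevant ones still rank first) can be illustrated with a toy model. The sketch below is plain Java and does not call Lucene at all. The class `MatchAllSketch`, its methods, the substring matching, and the scoring rule (a constant baseline for the match-all clause plus 1.0 per matching term) are all assumptions made for illustration; real Lucene scoring is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is NOT Lucene code. All names here are hypothetical.
final class MatchAllSketch {

    /** Scores every document: a constant baseline plus 1.0 per query term it contains. */
    static Map<String, Double> score(Map<String, String> docs, List<String> terms, double baseline) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            double s = baseline; // stands in for the match-all clause: every doc gets it
            for (String term : terms) {
                if (doc.getValue().contains(term)) {
                    s += 1.0; // stands in for an optional (SHOULD) term clause that matched
                }
            }
            scores.put(doc.getKey(), s);
        }
        return scores;
    }

    /** Returns all document ids, highest score first (ties keep insertion order). */
    static List<String> rank(Map<String, Double> scores) {
        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> scores.get(id)).reversed());
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("d1", "lucene query parser"); // matches both terms
        docs.put("d2", "cooking recipes");     // matches no term, still returned
        docs.put("d3", "lucene index");        // matches one term
        // prints [d1, d3, d2]
        System.out.println(rank(score(docs, Arrays.asList("lucene", "query"), 0.1)));
    }
}
```

In this model the non-matching document `d2` appears at the end of the ranking instead of being dropped, which is the behaviour the original question asks for. Whether a real BooleanQuery combined with MatchAllDocsQuery orders documents the same way depends on Lucene's actual scoring and should be checked against the index in question.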