Message-ID: <1528373011.1250864234779.JavaMail.jira@brutus>
Date: Fri, 21 Aug 2009 07:17:14 -0700 (PDT)
From: "Tim Smith (JIRA)"
To: java-dev@lucene.apache.org
Reply-To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-1821) Weight.scorer() not passed doc offset for "sub reader"
In-Reply-To: <1931573201.1250629228444.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745988#action_12745988 ]

Tim Smith
commented on LUCENE-1821:
-----------------------------------

here's what you can do:

{code}
/** @deprecated use {@link #getDocIdSet(IndexSearcher, IndexReader)} */
public DocIdSet getDocIdSet(final IndexReader reader) throws IOException {
  return getDocIdSet(new IndexSearcher(reader), reader);
}

public DocIdSet getDocIdSet(final IndexSearcher searcher, final IndexReader reader) throws IOException {
  final Weight weight = query.weight(searcher);
  return new DocIdSet() {
    public DocIdSetIterator iterator() throws IOException {
      // assumes a scorer() variant that is also handed the searcher
      return weight.scorer(searcher, reader, true, false);
    }
  };
}
{code}

and yeah, I'm all for tons of warnings in the javadoc explicitly defining the contracts

> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
>                 Key: LUCENE-1821
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1821
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>             Fix For: 2.9
>
>         Attachments: LUCENE-1821.patch
>
>
> Now that searching is done on a per-segment basis, there is no way for a Scorer to know the "actual" doc id for the documents it matches (only the relative doc offset into the segment).
> If you use caches in your scorer that are based on the "entire" index (all segments), there is now no way to index into them properly from inside a Scorer, because the Scorer is not passed the offset needed to calculate the "real" docid.
> Suggest having the Weight.scorer() method also take an integer for the doc offset.
> The abstract Weight class should have a constructor that takes this offset, as well as a method to get the offset.
> All Weights that have "sub" weights must pass this offset down to the created "sub" weights.
> Details on workaround:
> In order to work around this, you must do the following:
> * Subclass IndexSearcher
> * Add an "int getIndexReaderBase(IndexReader)" method to your subclass
> * during Weight creation, the Weight must hold onto a reference to the passed in Searcher (cast to
your subclass)
> * during Scorer creation, the Scorer must be passed the result of YourSearcher.getIndexReaderBase(reader)
> * Scorer can now rebase any collected docids using this offset
> Example implementation of getIndexReaderBase():
> {code}
> // NOTE: a more efficient implementation can be done if you cache the result of gatherSubReaders in your constructor
> public int getIndexReaderBase(IndexReader reader) {
>   if (reader == getReader()) {
>     return 0;
>   } else {
>     List readers = new ArrayList();
>     gatherSubReaders(readers);
>     Iterator iter = readers.iterator();
>     int maxDoc = 0;
>     while (iter.hasNext()) {
>       IndexReader r = (IndexReader)iter.next();
>       if (r == reader) {
>         return maxDoc;
>       }
>       maxDoc += r.maxDoc();
>     }
>   }
>   return -1; // reader not in searcher
> }
> {code}
> Notes:
> * This workaround makes it so you cannot serialize your custom Weight implementation

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
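
The NOTE in the quoted workaround mentions caching the result of gatherSubReaders in the constructor. A minimal sketch of that idea follows, written without Lucene on the classpath so it runs on its own. SegmentLike and SubReaderBases are hypothetical names invented for illustration, not Lucene classes; in a real IndexSearcher subclass the list would presumably come from gatherSubReaders. The sketch also assumes Java 8 or later and does not handle the "reader is the top-level reader" case, which the original answers with 0.

```java
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a segment reader; only maxDoc() matters here. */
interface SegmentLike {
    int maxDoc();
}

/** Computes the doc id base of every sub reader once, instead of on every lookup. */
final class SubReaderBases {
    // Identity-based map, mirroring the == comparison in the original loop.
    private final Map<SegmentLike, Integer> bases = new IdentityHashMap<>();

    SubReaderBases(List<? extends SegmentLike> subReaders) {
        int base = 0;
        for (SegmentLike r : subReaders) {
            bases.putIfAbsent(r, base); // keep the first position if a reader repeats
            base += r.maxDoc();         // running sum of the preceding segments' sizes
        }
    }

    /** Returns the base of the given sub reader, or -1 if it is not in the list. */
    int baseOf(SegmentLike reader) {
        Integer base = bases.get(reader);
        return base == null ? -1 : base;
    }

    /** Converts a segment-relative doc id into an index-wide doc id. */
    int rebase(SegmentLike reader, int segmentDocId) {
        int base = baseOf(reader);
        if (base < 0) {
            throw new IllegalArgumentException("reader not in searcher");
        }
        return base + segmentDocId;
    }
}
```

With this shape, each lookup is a single map access rather than a walk over all sub readers. The bases are only valid for the reader list they were built from, so they would need to be rebuilt whenever the searcher is reopened on a changed index.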