Mailing-List: contact java-dev-help@lucene.apache.org; run by ezmlm
Reply-To: java-dev@lucene.apache.org
Message-ID: <1616967874.1256772659371.JavaMail.jira@brutus>
Date: Wed, 28 Oct 2009 23:30:59 +0000 (UTC)
From: "Mark Miller (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-2013) QueryScorer and SpanRegexQuery are incompatible.
In-Reply-To: <1412089481.1256764139484.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

    [ https://issues.apache.org/jira/browse/LUCENE-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771196#action_12771196 ]

Mark Miller commented on LUCENE-2013:
-------------------------------------

Thanks for the report Benjamin - Not sure I like adding the methods to SpanQuerys though - how about putting a check for regex query before the check for spanquery, and rewriting if we see it? It means adding the contrib with regex as a dependency of the highlighter, but it lets us avoid modifying any core classes.

> QueryScorer and SpanRegexQuery are incompatible.
> ------------------------------------------------
>
>                 Key: LUCENE-2013
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2013
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/highlighter
>    Affects Versions: 2.9
>         Environment: Lucene-Java 2.9
>            Reporter: Benjamin Keil
>         Attachments: lucene-2013-2009-10-28-2135.patch, lucene-2013-2009-10-28.patch
>
>
> Since the resolution of #LUCENE-1685, users are not supposed to rewrite their queries before submitting them to QueryScorer:
> bq.------------------------------------------------------------------------
> bq.r800796 | markrmiller | 2009-08-04 06:56:11 -0700 (Tue, 04 Aug 2009) | 1 line
> bq.
> bq.LUCENE-1685: The position aware SpanScorer has become the default scorer for Highlighting. The SpanScorer implementation has replaced QueryScorer and the old term highlighting QueryScorer has been renamed to QueryTermScorer. Multi-term queries are also now expanded by default. If you were previously rewritting the query for multi-term query highlighting, you should no longer do that (unless you switch to using QueryTermScorer).
> The SpanScorer API (now QueryScorer) has also been improved to more closely match the API of the previous QueryScorer implementation.
> bq.------------------------------------------------------------------------
>
> This is a great convenience for the most part, but it's causing me difficulties with SpanRegexQuerys, as the WeightedSpanTermExtractor uses Query.extractTerms() to collect the fields used in the query, but SpanRegexQuery does not implement this method, so highlighting any query with a SpanRegexQuery throws an UnsupportedOperationException. If this issue is circumvented, there is still the issue of SpanRegexQuery throwing an exception when someone calls its getSpans() method.
>
> I can provide the patch that I am currently using, but I'm not sure that my solution is optimal. It adds two methods to SpanQuery: extractFields(Set fields), which is equivalent to fields.add(getField()) except when MaskedFieldQuerys get involved, and mustBeRewrittenToGetSpans(), which returns true for SpanQuery, false for SpanTermQuery, and is overridden in each composite SpanQuery to return a value depending on its components. In this way SpanRegexQuery (and any other custom SpanQuerys) do not need to be adjusted.
>
> Currently the collection of fields and non-weighted terms is done in a single step. In the proposed patch the WeightedSpanTerm extraction from a SpanQuery proceeds in two steps. First, if the QueryScorer's field is null, then the fields are collected from the SpanQuery using the extractFields() method. Second, the terms are collected using extractTerms(), rewriting the query for each field if mustBeRewrittenToGetSpans() returns true.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
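
For readers following the issue description quoted in this message, the two proposed SpanQuery methods can be sketched with plain stand-in classes. This is only an illustration of the behavior the reporter describes, not the attached patch and not Lucene's API: every class name below is hypothetical, the MaskedFieldQuery special case is left out, and the rule that a composite needs rewriting whenever any of its clauses does is an assumption (the report only says the result depends on the components).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for SpanQuery; it does not touch any real Lucene class.
abstract class SketchSpanQuery {
    abstract String getField();

    // Default described in the report: record this query's own field.
    void extractFields(Set<String> fields) {
        fields.add(getField());
    }

    // Default described in the report: an unknown SpanQuery reports true.
    boolean mustBeRewrittenToGetSpans() {
        return true;
    }
}

// Stand-in for SpanTermQuery: the report says it returns false.
final class SketchSpanTermQuery extends SketchSpanQuery {
    private final String field;

    SketchSpanTermQuery(String field) {
        this.field = field;
    }

    @Override
    String getField() {
        return field;
    }

    @Override
    boolean mustBeRewrittenToGetSpans() {
        return false;
    }
}

// Stand-in for SpanRegexQuery: it inherits both defaults unchanged,
// which is the point of the proposal (custom queries need no edits).
final class SketchSpanRegexQuery extends SketchSpanQuery {
    private final String field;
    private final String pattern;

    SketchSpanRegexQuery(String field, String pattern) {
        this.field = field;
        this.pattern = pattern;
    }

    @Override
    String getField() {
        return field;
    }

    String getPattern() {
        return pattern;
    }
}

// Stand-in for a composite such as SpanNearQuery or SpanOrQuery.
final class SketchSpanCompositeQuery extends SketchSpanQuery {
    private final List<SketchSpanQuery> clauses;

    SketchSpanCompositeQuery(List<SketchSpanQuery> clauses) {
        this.clauses = clauses;
    }

    @Override
    String getField() {
        return clauses.get(0).getField();
    }

    @Override
    void extractFields(Set<String> fields) {
        for (SketchSpanQuery clause : clauses) {
            clause.extractFields(fields);
        }
    }

    // Assumption: rewrite is needed if any clause needs it.
    @Override
    boolean mustBeRewrittenToGetSpans() {
        for (SketchSpanQuery clause : clauses) {
            if (clause.mustBeRewrittenToGetSpans()) {
                return true;
            }
        }
        return false;
    }
}

class SpanRewriteSketch {
    public static void main(String[] args) {
        SketchSpanQuery term = new SketchSpanTermQuery("body");
        SketchSpanQuery regex = new SketchSpanRegexQuery("body", "luc.*");
        SketchSpanQuery near = new SketchSpanCompositeQuery(Arrays.asList(term, regex));

        // Step one of the proposal: collect fields when no field is configured.
        Set<String> fields = new LinkedHashSet<>();
        near.extractFields(fields);
        System.out.println("fields: " + fields);

        // Step two: decide whether the query must be rewritten before getSpans().
        System.out.println("needs rewrite: " + near.mustBeRewrittenToGetSpans());
    }
}
```

Under these assumptions a composite that contains a regex clause reports that it needs rewriting, while a composite of plain term clauses does not, so the highlighter would only pay for a rewrite when one is actually required.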