Date: Thu, 10 Sep 2015 15:25:46 +0000 (UTC)
From: "Alan Woodward (JIRA)"
To: dev@lucene.apache.org
Subject: [jira] [Commented] (LUCENE-6785) Consider merging Query.rewrite() into Query.createWeight()


    [ https://issues.apache.org/jira/browse/LUCENE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14738913#comment-14738913 ]

Alan Woodward commented on LUCENE-6785:
---------------------------------------

I'm travelling at the moment, will put up a larger patch changing all the modules + solr when I get back (including Terry's fix, thank you!). I still have some tests failing around highlighting multiterm queries.

The bits keeping the QueryCache happy are a bit hacky, but I think it's worth the pain of that to make the API nicer. Maybe in another issue we could look at using the Weights themselves as cache keys, rather than their parent queries?

bq. dropping weights could be problematic since they can be expensive to create due to statistics collection

One thought I had was that term statistics could be collected and cached by an object that's passed to createWeight(). That way we only collect stats for each term once per top-level query. This would also be a nicer solution than the searcher term cache I proposed in LUCENE-6561.

> Consider merging Query.rewrite() into Query.createWeight()
> ----------------------------------------------------------
>
>                 Key: LUCENE-6785
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6785
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>         Attachments: LUCENE-6785.patch
>
>
> Prompted by the discussion on LUCENE-6590.
> Query.rewrite() is a bit of an oddity. You call it to create a query for a specific IndexSearcher, and to ensure that you get a query implementation that has a working createWeight() method. However, Weight itself already encapsulates the notion of a per-searcher query.
> You also need to repeatedly call rewrite() until the query has stopped rewriting itself, which is a bit trappy - there are a few places (in highlighting code for example) that just call rewrite() once, rather than looping round as IndexSearcher.rewrite() does. Most queries don't need to be called multiple times, however, so this seems a bit redundant. And the ones that do currently return un-rewritten queries can be changed simply enough to rewrite them.
> Finally, in pretty much every case I can find in the codebase, rewrite() is called purely as a prelude to createWeight(). This means, in the case of for example large BooleanQueries, we end up cloning the whole query structure, only to throw it away immediately.
> I'd like to try removing rewrite() entirely, and merging the logic into createWeight(), simplifying the API and removing the trap where code only calls rewrite once. What do people think?
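
The "call rewrite() once" trap described in the issue can be illustrated with a small sketch. Everything below is a hypothetical stand-in (ToyQuery, Term, Wrapper and rewriteFully are made-up names, not Lucene classes, and rewrite() here takes no IndexReader); only the shape of the loop is meant to resemble the "looping round" that the issue attributes to IndexSearcher.rewrite().

```java
final class RewriteLoopSketch {

    // Toy stand-in for a query; NOT Lucene's Query class.
    interface ToyQuery {
        // Returns a simpler equivalent query, or `this` when nothing is left to rewrite.
        ToyQuery rewrite();
    }

    // A primitive query that is already in its final form.
    static final class Term implements ToyQuery {
        final String term;
        Term(String term) { this.term = term; }
        @Override public ToyQuery rewrite() { return this; }
        @Override public String toString() { return "term(" + term + ")"; }
    }

    // A wrapper that peels off one layer per call and returns its inner
    // query un-rewritten, which is the behaviour the issue calls trappy.
    static final class Wrapper implements ToyQuery {
        final ToyQuery inner;
        Wrapper(ToyQuery inner) { this.inner = inner; }
        @Override public ToyQuery rewrite() { return inner; }
        @Override public String toString() { return "wrap(" + inner + ")"; }
    }

    // Keep rewriting until rewrite() returns the same instance (a fixed point).
    static ToyQuery rewriteFully(ToyQuery original) {
        ToyQuery query = original;
        for (ToyQuery rewritten = query.rewrite();
             rewritten != query;
             rewritten = query.rewrite()) {
            query = rewritten;
        }
        return query;
    }

    public static void main(String[] args) {
        ToyQuery q = new Wrapper(new Wrapper(new Term("lucene")));
        // A single call leaves one wrapper layer in place.
        System.out.println("once:  " + q.rewrite());
        // The loop reaches the primitive query.
        System.out.println("fully: " + rewriteFully(q));
    }
}
```

In this toy model a caller that invokes rewrite() only once is left holding a query that still needs rewriting, which is the situation the issue describes in the highlighting code.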