From: Roman Chyla
Date: Fri, 5 Dec 2014 12:47:42 -0500
Subject: Re: Anti-Pattern in lucent-join jar?
To: "solr-user@lucene.apache.org"

Hi Mikhail,

I think you are right, it won't be a problem for SOLR, but it is likely an
antipattern inside a lucene component, because custom components may create
join queries, hold on to them, and then execute them much later against a
different searcher. One approach would be to postpone term collection until
the query actually runs. I looked far and wide for an appropriate place, but
only found createWeight() - but at least it does give developers NO
opportunity to shoot their feet! ;-)

Since it may serve as an inspiration to someone, here is a link:
https://github.com/romanchyla/montysolr/blob/master-next/contrib/adsabs/src/java/org/apache/lucene/search/SecondOrderQuery.java#L101

roman

On Fri, Dec 5, 2014 at 4:52 AM, Mikhail Khludnev wrote:

> Thanks Roman! Let's expand it for the sake of completeness.
> Such an issue is not possible in Solr, because caches are associated with the
> searcher. While you follow this design (see Solr userCache), and don't
> update what's cached once, there is no chance to shoot the foot.
> There were a few caches inside of Lucene (old FieldCache,
> CachingWrapperFilter, ExternalFileField, etc), but they are properly mapped
> onto segment keys, which excludes such leakage across different
> searchers.
>
> On Fri, Dec 5, 2014 at 6:43 AM, Roman Chyla wrote:
>
> > +1, additionally (as it follows from your observation) the query can get
> > out of sync with the index, if e.g. it was saved for later use and run
> > against a newly opened searcher
> >
> > Roman
> > On 4 Dec 2014 10:51, "Darin Amos" wrote:
> >
> > > Hello All,
> > >
> > > I have been doing a lot of research in building some custom queries and I
> > > have been looking at the Lucene Join library as a reference. I noticed
> > > something that I believe could actually have a negative side effect.
> > >
> > > Specifically I was looking at the JoinUtil.createJoinQuery(…) method and
> > > within that method you see the following code:
> > >
> > >     TermsWithScoreCollector termsWithScoreCollector =
> > >         TermsWithScoreCollector.create(fromField,
> > >             multipleValuesPerDocument, scoreMode);
> > >     fromSearcher.search(fromQuery, termsWithScoreCollector);
> > >
> > > As you can see, when the JoinQuery is being built, the code is executing
> > > the query that it wraps with its own collector to collect all the scores.
> > > If I were to write a query parser using this library (which someone has
> > > done here), doesn’t this reduce the benefit of the SOLR query cache? The
> > > wrapped query is being executed when the Join Query is being constructed,
> > > not when it is executed.
> > >
> > > Thanks
> > >
> > > Darin
> > >
> >
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
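
To make the timing issue in this thread concrete, here is a minimal sketch in plain Java of the difference between collecting terms when a join query is built and collecting them when it runs. It deliberately does not use Lucene: Searcher, EagerJoin, and DeferredJoin are hypothetical stand-ins invented for illustration, not the JoinUtil or SecondOrderQuery code, and the deferred variant only mimics the idea of moving the work into something like createWeight().

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class JoinTimingSketch {

    /** Hypothetical stand-in for an index searcher: a fixed snapshot of join keys. */
    static final class Searcher {
        private final Set<String> keys;

        Searcher(String... keys) {
            this.keys = new TreeSet<>(Arrays.asList(keys));
        }

        /** Plays the role of running the "from" query and collecting its terms. */
        Set<String> collectTerms() {
            return new TreeSet<>(keys);
        }
    }

    /** Eager variant: terms are collected once, when the query object is built. */
    static final class EagerJoin {
        private final Set<String> terms;

        EagerJoin(Searcher fromSearcher) {
            this.terms = fromSearcher.collectTerms();
        }

        /** Ignores the searcher it is run against; it only knows the old snapshot. */
        Set<String> run(Searcher current) {
            return terms;
        }
    }

    /** Deferred variant: terms are collected each time the query runs. */
    static final class DeferredJoin {
        Set<String> run(Searcher current) {
            return current.collectTerms();
        }
    }

    public static void main(String[] args) {
        Searcher old = new Searcher("a", "b");
        EagerJoin eager = new EagerJoin(old);       // collection happens here
        DeferredJoin deferred = new DeferredJoin(); // nothing collected yet

        // The index changes and a new searcher is opened.
        Searcher reopened = new Searcher("a", "b", "c");

        System.out.println("eager:    " + eager.run(reopened));
        System.out.println("deferred: " + deferred.run(reopened));
    }
}
```

Run as written, the eager query prints [a, b] and the deferred one prints [a, b, c]: the eager query stays tied to the searcher it was built against, which is the out-of-sync case described above.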