From: Roman Chyla
Date: Fri, 5 Dec 2014 14:16:46 -0500
Subject: Re: Anti-Pattern in lucent-join jar?
To: "solr-user@lucene.apache.org"

Not sure I understand. It is the searcher that executes the query, so how
would you 'convince' it to pass the query along? First the Weight is created,
then the Weight instance creates the Scorer - you would have to change the API
to do the passing (or maybe not...?)

In my case, the relationships were across index segments, so I had to collect
them first - but in other situations, when you look only at the data inside
one index segment, it _might_ be better to wait.

On Fri, Dec 5, 2014 at 1:25 PM, Darin Amos wrote:

> Couldn’t you just keep passing the wrapped query and searcher down to
> Weight.scorer()?
>
> This would allow you to wait until the query is executed to do term
> collection. If you want to protect against creating and executing the query
> with different searchers, you would have to make the query factory (or
> constructor) only visible to the query parser or parser plugin?
>
> I might not have followed you, this discussion challenges my understanding
> of Lucene and SOLR.
>
> Darin
>
> > On Dec 5, 2014, at 12:47 PM, Roman Chyla wrote:
> >
> > Hi Mikhail, I think you are right, it won't be a problem for SOLR, but it
> > is likely an antipattern inside a Lucene component, because custom
> > components may create join queries, hold on to them, and then execute them
> > much later against a different searcher. One approach would be to postpone
> > term collection until the query actually runs. I looked far and wide for an
> > appropriate place, but only found createWeight() - but at least it gives
> > developers NO opportunity to shoot themselves in the foot! ;-)
> >
> > Since it may serve as an inspiration to someone, here is a link:
> >
> > https://github.com/romanchyla/montysolr/blob/master-next/contrib/adsabs/src/java/org/apache/lucene/search/SecondOrderQuery.java#L101
> >
> > roman
> >
> > On Fri, Dec 5, 2014 at 4:52 AM, Mikhail Khludnev <mkhludnev@griddynamics.com> wrote:
> >
> >> Thanks Roman! Let's expand it for the sake of completeness.
> >> Such an issue is not possible in Solr, because caches are associated with
> >> the searcher. As long as you follow this design (see Solr userCache) and
> >> don't update what has been cached, there is no chance to shoot yourself in
> >> the foot. There were a few caches inside Lucene (old FieldCache,
> >> CachingWrapperFilter, ExternalFileField, etc.), but they are properly
> >> mapped onto segment keys, which excludes such leakage across different
> >> searchers.
> >>
> >> On Fri, Dec 5, 2014 at 6:43 AM, Roman Chyla wrote:
> >>
> >>> +1, additionally (as it follows from your observation) the query can get
> >>> out of sync with the index, e.g. if it was saved for later use and run
> >>> against a newly opened searcher
> >>>
> >>> Roman
> >>> On 4 Dec 2014 10:51, "Darin Amos" wrote:
> >>>
> >>>> Hello All,
> >>>>
> >>>> I have been doing a lot of research in building some custom queries and I
> >>>> have been looking at the Lucene Join library as a reference.
> >>>> I noticed something that I believe could actually have a negative side
> >>>> effect.
> >>>>
> >>>> Specifically I was looking at the JoinUtil.createJoinQuery(…) method and
> >>>> within that method you see the following code:
> >>>>
> >>>>     TermsWithScoreCollector termsWithScoreCollector =
> >>>>         TermsWithScoreCollector.create(fromField,
> >>>>             multipleValuesPerDocument, scoreMode);
> >>>>     fromSearcher.search(fromQuery, termsWithScoreCollector);
> >>>>
> >>>> As you can see, when the JoinQuery is being built, the code is executing
> >>>> the query that it wraps with its own collector to collect all the scores.
> >>>> If I were to write a query parser using this library (which someone has
> >>>> done here), doesn’t this reduce the benefit of the SOLR query cache? The
> >>>> wrapped query is being executed when the Join Query is being constructed,
> >>>> not when it is executed.
> >>>>
> >>>> Thanks
> >>>>
> >>>> Darin
> >>>>
> >>
> >> --
> >> Sincerely yours
> >> Mikhail Khludnev
> >> Principal Engineer,
> >> Grid Dynamics
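
For readers of the archive, here is a minimal sketch of the difference the thread is about: collecting join terms when the query object is built versus when it is run. It is a toy model and deliberately does not use the Lucene API; Searcher, EagerJoinQuery, DeferredJoinQuery and collectTerms() are hypothetical names made up for illustration. The deferred variant stands in for doing the collection at a point such as createWeight(), and its per-searcher map loosely mirrors the point that caches should be associated with the searcher.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.WeakHashMap;

// Toy model only: none of these types are Lucene or Solr classes.
final class DeferredJoinSketch {

    /** Hypothetical stand-in for a searcher: a fixed snapshot of join terms. */
    static final class Searcher {
        private final Set<String> joinTerms;

        Searcher(String... terms) {
            this.joinTerms = new TreeSet<>(Arrays.asList(terms));
        }

        /** Stands in for running the wrapped "from" query with a terms collector. */
        Set<String> collectTerms() {
            return joinTerms;
        }
    }

    /** Eager variant: terms are collected once, when the query is constructed. */
    static final class EagerJoinQuery {
        private final Set<String> terms;

        EagerJoinQuery(Searcher fromSearcher) {
            this.terms = fromSearcher.collectTerms();
        }

        /** Ignores the searcher it is run against, so the result can be stale. */
        Set<String> run(Searcher searcher) {
            return terms;
        }
    }

    /** Deferred variant: terms are collected at run time, cached per searcher. */
    static final class DeferredJoinQuery {
        // Weak keys so that a closed, unreferenced searcher does not pin its entry.
        private final Map<Searcher, Set<String>> termsBySearcher = new WeakHashMap<>();

        Set<String> run(Searcher searcher) {
            return termsBySearcher.computeIfAbsent(searcher, Searcher::collectTerms);
        }
    }

    public static void main(String[] args) {
        Searcher old = new Searcher("a", "b");
        Searcher reopened = new Searcher("a", "b", "c"); // index changed, new searcher

        EagerJoinQuery eager = new EagerJoinQuery(old);   // built early, held for later
        DeferredJoinQuery deferred = new DeferredJoinQuery();

        System.out.println(eager.run(reopened));    // prints [a, b]
        System.out.println(deferred.run(reopened)); // prints [a, b, c]
    }
}
```

The eager query keeps answering from the searcher it was built with, which is the out-of-sync case described above; the deferred one pays the collection cost at execution time, once per searcher.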