From: Nawab Zada Asad Iqbal
Date: Fri, 18 Aug 2017 11:07:04 -0700
Reply-To: solr-user@lucene.apache.org
Subject: Re: Request Highlighting only for the final set of rows
To: solr-user@lucene.apache.org

Actually, part of me is thinking that there are valid use cases for
having fl and hl.fl with different values, e.g., receive name etc. in
“clean” form via fl, and receive both name and address in HTML-formatted
form (by specifying them in hl.fl).

On Fri, Aug 18, 2017 at 10:57 AM, Nawab Zada Asad Iqbal wrote:

> Actually, I realize that it is an incorrect use on my part to pass only
> id+score in fl and specify more fields in hl.fl. This was somehow
> supported in older versions, but the new behavior is actually a
> performance improvement for the scenario when the user is asking for
> only ids.
>
> Nawab
>
> On Fri, Aug 18, 2017 at 8:33 AM, Nawab Zada Asad Iqbal wrote:
>
>> Thanks, Erick, for pointing to the better option. I will explore that.
>> After your email, I found that if I specify 'fl=*' in the query then
>> it does the right thing (a two-pass process). However, my queries had
>> 'fl=id+score' (or sometimes fl=id&fl=score); in both of these cases I
>> found that the shards are asked to highlight all the results on the
>> first request (and there is no second request).
>>
>> The fl=* query is (in my sample case) finishing in 100 msec, while the
>> same query with fl=id+score finishes in 1200 msec.
>>
>> Here are the two queries:
>>
>> http://solrdev.test.net:8984/solr/filesearch/select?&hl=on&fl=*&start=200&rows=200&q=nawab&shards=solrdev.test.net:8984/solr/filesearch,solrdev.test.net:8985/solr/filesearch,solrdev.test.net:8986/solr/filesearch&wt=json
>>
>> http://solrdev.test.net:8984/solr/filesearch/select?&hl=on&fl=id&fl=score&start=200&rows=200&q=nawab&shards=solrdev.test.net:8984/solr/filesearch,solrdev.test.net:8985/solr/filesearch,solrdev.test.net:8986/solr/filesearch&wt=json
>>
>> Thanks
>> Nawab
>>
>> On Fri, Aug 18, 2017 at 7:23 AM, Erick Erickson wrote:
>>
>>> I don't think you're reading it correctly. First of all, if you're
>>> going to be doing deep paging you should be using cursorMark, see:
>>> https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results.
>>>
>>> Second, it's a two-pass process if you don't use cursorMark. The
>>> first pass gets the candidate docs from each shard. But all it
>>> returns is the ID and sort criteria. Then the aggregator node gets
>>> the _true_ top N after sorting all the lists from each shard and
>>> issues a second request for _only_ those docs that have made the top
>>> N from each sub shard, and those should be the only ones highlighted.
>>>
>>> Do you have any evidence to the contrary that they're all being
>>> highlighted? Or are you misinterpreting the log message for the
>>> first pass?
>>>
>>> Best,
>>> Erick
>>>
>>> On Thu, Aug 17, 2017 at 5:43 PM, Nawab Zada Asad Iqbal wrote:
>>> > Hi,
>>> >
>>> > In a multi-node Solr installation (without SolrCloud), during a
>>> > paging scenario (e.g., start=1000, rows=200), the primary node
>>> > asks for 1200 rows from each shard. If highlighting is ON, then
>>> > the primary node asks each shard to highlight all 1200 results,
>>> > which doesn't scale well. Is there a way to break the shard query
>>> > into two steps, e.g., ask for the 1200 rows and, after sorting the
>>> > 1200 responses from each shard and finding the final rows to
>>> > return (1001 to 1200), issue another query to the shards asking
>>> > for a highlighted response for only the relevant docs?
>>> >
>>> > Thanks
>>> > Nawab
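
For readers following the thread, the two-pass flow Erick describes can be sketched roughly as below. This is only an illustration under stated assumptions, not Solr's actual code: the in-memory SHARDS data and the functions first_pass, second_pass and search are hypothetical names, and the "highlighting" is a plain string replace. The point it tries to show is that the first pass moves only ids and scores, and only the final page of docs is highlighted.

```python
# Hypothetical in-memory "shards": each maps doc id -> (score, text).
# None of these names come from Solr; they exist only for this sketch.
SHARDS = [
    {"a1": (0.9, "nawab one"), "a2": (0.4, "nawab two")},
    {"b1": (0.8, "nawab three"), "b2": (0.7, "nawab four")},
    {"c1": (0.6, "nawab five"), "c2": (0.1, "nawab six")},
]


def first_pass(shard, limit):
    """Return (score, id) pairs for the shard's top `limit` docs, no highlighting."""
    ranked = sorted(
        ((score, doc_id) for doc_id, (score, _text) in shard.items()),
        reverse=True,
    )
    return ranked[:limit]


def second_pass(shard, doc_ids, term):
    """Highlight only the requested docs that live on this shard."""
    return {
        doc_id: shard[doc_id][1].replace(term, f"<em>{term}</em>")
        for doc_id in doc_ids
        if doc_id in shard
    }


def search(term, start, rows):
    # Pass 1: every shard must supply start + rows candidates, because the
    # aggregator cannot know in advance which shard holds the final page.
    limit = start + rows
    candidates = []
    for shard in SHARDS:
        candidates.extend(first_pass(shard, limit))

    # Merge by score (descending) and keep only the requested window.
    merged = sorted(candidates, reverse=True)
    page_ids = [doc_id for _score, doc_id in merged[start:start + rows]]

    # Pass 2: ask the shards to highlight just the docs on the final page.
    highlights = {}
    for shard in SHARDS:
        highlights.update(second_pass(shard, page_ids, term))
    return [(doc_id, highlights[doc_id]) for doc_id in page_ids]


if __name__ == "__main__":
    for doc_id, snippet in search("nawab", start=2, rows=2):
        print(doc_id, snippet)
```

With start=2 and rows=2, each toy shard returns its candidates in pass 1, but only two docs in total are highlighted in pass 2, which is the scaling behavior the original question was after.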