Date: Fri, 7 Jun 2019 11:59:00 +0000 (UTC)
From: Jan Høydahl (JIRA)
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Subject: [jira] [Commented] (SOLR-13367) Highlighting fails for Range queries on Multi-valued String fields

    [ https://issues.apache.org/jira/browse/SOLR-13367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858553#comment-16858553 ]

Jan Høydahl commented on SOLR-13367:
------------------------------------

Hi,

I wonder if the reason is a bug in the Admin UI query section, which sets the parameter "hightlightMultiTerm" instead of the correct "highlightMultiTerm". Can you try your query again manually from the browser address bar with the correct param?

See [https://github.com/apache/lucene-solr/pull/704] for a fix.

> Highlighting fails for Range queries on Multi-valued String fields
> ------------------------------------------------------------------
>
>                 Key: SOLR-13367
>                 URL: https://issues.apache.org/jira/browse/SOLR-13367
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public)
>          Components: highlighter
>    Affects Versions: 7.5, 7.7.1
>         Environment: RedHat Linux v7
> Java 1.8.0_201
>            Reporter: Karl Wolf
>            Priority: Major
>             Fix For: 5.1
>
>
> Range queries against multi-valued string fields produce useless highlighting, even though "hl.highlightMultiTerm":"true"
> I have uncovered what I believe is a bug. At the very least it is a difference in behavior between Solr v5.1.0 and v7.5.0 (and v7.7.1).
> I have a multi-valued string Field defined in my schema as:
>
>
> I am using a query containing a Range clause and I am using highlighting to get the list of values that actually matched the range query.
> All examples below were using the appropriate Solr Admin Server SolrCore Query page.
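To take the Admin UI out of the picture, the request can be assembled by hand and pasted into the browser address bar. The following is only a minimal Python sketch: the host, port and core name (`localhost:8983`, `mycore`) and the helper name `build_highlight_url` are placeholders, not values from the report. The `hl.*` spellings are copied from the parameter echo of the working Solr v5.1.0 response quoted in this issue.

```python
from urllib.parse import urlencode

# Placeholder endpoint: host, port and core name are assumptions, not from the report.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"


def build_highlight_url(base_url: str = SOLR_SELECT_URL) -> str:
    """Build a select URL for the range query with correctly spelled hl.* parameters."""
    params = {
        "q": "MyStringField:[A TO B}",
        "fl": "MyStringField,MyUniqueID",
        "wt": "json",
        "hl": "true",
        "hl.fl": "MyStringField",
        "hl.preserveMulti": "true",
        "hl.requireFieldMatch": "true",
        # Spelled as in the v5.1.0 parameter echo, not as the Admin UI sent them.
        "hl.usePhraseHighlighter": "true",
        "hl.highlightMultiTerm": "true",
    }
    # urlencode percent-encodes the brackets, brace and colon in the query.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_highlight_url())
```

If highlighting is still empty with this URL, the misspelled Admin UI parameter is probably not the whole story.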
> ***************************************************************************
> First, a correctly working example of a range query using Solr v5.1.0 which produces useful results:
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 366,
>     "params": {
>       "q": "MyStringField:[A TO B}",
>       "hl": "true",
>       "indent": "true",
>       "hl.preserveMulti": "true",
>       "fl": "MyStringField,MyUniqueID",
>       "hl.requireFieldMatch": "true",
>       "hl.usePhraseHighlighter": "true",
>       "hl.fl": "MyStringField",
>       "wt": "json",
>       "hl.highlightMultiTerm": "true",
>       "_": "1553275722025"
>     }
>   },
>   "response": {
>     "numFound": 999,
>     "start": 0,
>     "docs": [
>       {
>         "MyStringField": [
>           "Stanley, Wendell M.",
>           "Avery, Roy"
>         ],
>         "MyUniqueID": "UniqueID1"
>       },
>       {
>         "MyStringField": [
>           "Avery, Roy"
>         ],
>         "MyUniqueID": "UniqueID2"
>       },
> *** lots more docs correctly found
>     ]
>   },
> *** we get to the highlighting portion of the response
> *** this indicates which values of each MyStringField
> *** actually matched the query
>   "highlighting": {
>     "UniqueID1": {
>       "MyStringField": [
>         "Avery, Roy"
>       ]
>     },
>     "UniqueID2": {
>       "MyStringField": [
>         "Avery, Roy"
>       ]
>     },
>     "UniqueID3": {
>       "MyStringField": [
>         "American Institute of Biological Sciences",
>         "Albritton, Errett C."
>       ]
>     },
>     ... etc.
> *** lots more useful highlight values. Note the two matching values
> *** for document UniqueID3.
> }
> ***************************************************************************
> * THE PROBLEM
> * Now using newer versions of Solr
> ***************************************************************************
> Using the exact same parameters with Solr v7.5.0 or v7.7.1, the top portion of the
> response is basically the same including the number of documents found
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":245,
>     "params":{
>       "q":"MyStringField:[A TO B}",
>       "hl":"on",
>       "hl.preserveMulti":"true",
>       "fl":"MyUniqueID, MyStringField",
>       "hl.requireFieldMatch":"true",
>       "hl.fl":"MyStringField",
>       "hightlightMultiTerm":"true",
>       "wt":"json",
>       "_":"1553105129887",
>       "usePhraseHighLighter":"true"}},
>   "response":{"numFound":999,"start":0,"docs":[
> *** The problem is with the highlighting portion of the results, which is effectively empty.
> *** There is no way to know which values in each document actually matched the query:
>   "highlighting":{
>     "UniqueID1":{},
>     "UniqueID2":{},
>     "UniqueID3":{},
>     ... etc.
> *** NOTE: The source data is the same for all of the tested Solr versions and the Solr indexes
> *** were properly rebuilt for each Solr version.
> ***************************************************************************
> Changing the request to use the "unified" highlighter: "hl.method=unified", the highlighting looks like:
>   "highlighting":{
>     "UniqueID1":{
>       "MyStringField":[]},
>     "UniqueID2":{
>       "MyStringField":[]},
>     "UniqueID3":{
>       "MyStringField":[]},
>     ... etc.
> *** The highlighting now properly lists the matching field but still no useful values are listed.
> ***************************************************************************
> NOTE: if I change the query from using a Range clause to using a Wildcard query: q="MyStringField:A*"
> the highlighting is correct in both Solr v7.5.0 and v7.7.1: These are GOOD results!
>   "highlighting":{
>     "UniqueID1": {
>       "MyStringField": ["Avery, Roy"]},
>     "UniqueID2": {
>       "MyStringField": ["Avery, Roy"]},
>     "UniqueID3": {
>       "MyStringField": [
>         "American Institute of Biological Sciences",
>         "Albritton, Errett C."
>       ]
>     },
>     ... etc.
> *** This makes me think there is some problem with the way a Range query
> *** feeds the search results to the Solr Highlighter code.
> ***************************************************************************
> All attempts to vary the hl specs or any other query parameters do not solve the problem.
> The wildcard query is my current workaround but there still is a problem with
> range queries:
> In summary, there is some incompatibility among:
>     1) A multi-valued string field AND
>     2) A range query against that field AND
>     3) The result Highlighting. It is effectively empty.
> I don't know when this issue was first introduced. I have recently been updating from 5.1.0
> to 7.5.0 in one big leap. I have attempted to read through the change logs for the intervening
> versions but I gave up to save my sanity.
> You should be able to reproduce this issue using any multi-valued, indexed and stored string field.
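When reproducing, two quick checks on the JSON response may help separate a misspelled request parameter from a genuine highlighter problem. The Python sketch below is illustrative only: the helper names `suspicious_params` and `docs_without_highlights` are made up for this example, and the set of known names is simply what the working v5.1.0 parameter echo in the issue shows, not a complete list of Solr highlighting options.

```python
# Highlight-related parameter names as they appear in the v5.1.0 echo in the issue.
KNOWN_HL_PARAMS = {
    "hl",
    "hl.fl",
    "hl.preserveMulti",
    "hl.requireFieldMatch",
    "hl.usePhraseHighlighter",
    "hl.highlightMultiTerm",
}


def suspicious_params(params):
    """Return echoed names that look highlight-related but are not in KNOWN_HL_PARAMS."""
    return sorted(
        name for name in params
        if "light" in name.lower() and name not in KNOWN_HL_PARAMS
    )


def docs_without_highlights(highlighting):
    """Return the IDs whose highlighting entry has no snippets in any field."""
    return sorted(
        doc_id for doc_id, fields in highlighting.items()
        if not any(fields.values())
    )


# Parameter echo copied from the v7.x response in the issue.
echoed = {
    "q": "MyStringField:[A TO B}",
    "hl": "on",
    "hl.preserveMulti": "true",
    "fl": "MyUniqueID, MyStringField",
    "hl.requireFieldMatch": "true",
    "hl.fl": "MyStringField",
    "hightlightMultiTerm": "true",
    "wt": "json",
    "_": "1553105129887",
    "usePhraseHighLighter": "true",
}

print(suspicious_params(echoed))
```

For the echo above this flags `hightlightMultiTerm` and `usePhraseHighLighter`, neither of which carries the `hl.` prefix seen in the v5.1.0 request. A clean parameter echo combined with a non-empty result from `docs_without_highlights` would point back at the range query itself.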