From: "Teague James"
Subject: RE: Solr 6.0.0 Returns Blank Highlights for alpha-numeric combos
Date: Wed, 1 Feb 2017 15:23:21 -0500

Hi Erick! Thanks for the reply. The goal is to get two character terms like 1a, 1b, 2a, 2b, 3a, etc. to get highlighted in the documents. Additional testing shows that any alpha-numeric combo returns a blank highlight, regardless of length. Thus, "pr0blem" will not highlight because of the zero in the middle of the term.

I came across a ServerFault article where it was suggested that the fieldType must be tokenized in order for highlighting to work correctly. Setting the field type to text_general was suggested as a solution. In my case the data is stored as a string fieldType, which is then copied using copyField to a field that has a fieldType of text_general, but I'm still not getting a good highlight on terms like "1a". Highlighting works for any other non-alpha-numeric term though.

Other articles pointed to termVectors and termOffsets, but none of these seemed to help. Here's my config:

In the solrconfig file highlighting is set to use the text field:

text

Thoughts? Appreciate the help! Thanks!
-Teague

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: Wednesday, February 1, 2017 2:49 PM
To: solr-user
Subject: Re: Solr 6.0.0 Returns Blank Highlights for alpha-numeric combos

How far into the text field are these tokens? The highlighter defaults to the first 10K characters under control of hl.maxAnalyzedChars. It's vaguely possible that the values happen to be farther along in the text than that. Not likely, mind you, but possible.

Best,
Erick

On Wed, Feb 1, 2017 at 8:24 AM, Teague James wrote:
> Hello everyone! I'm still stuck on this issue and could really use
> some help. I have a Solr 6.0.0 instance that is storing documents
> peppered with text like "1a", "2e", "4c", etc. If I search the
> documents for a word, "ms", "in", "the", etc., I get the correct
> number of hits and the results are highlighted correctly in the
> highlighting section. But when I search for "1a" or "2e" I get hits,
> but the highlights are blank. Further testing revealed that the
> highlighter fails to highlight any two-character alpha-numeric
> combination, such as n0, b1, 1z, etc.:
> ...
>
> Where "8667" is the document ID of the record that had the hit, but no
> highlight. Other searches, "ms" for example, return:
> ...
>
> MS
>
> Why does highlighting fail for "1a" type searches? Any help is appreciated!
> Thanks!
>
> -Teague James
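
For anyone following this thread, here is a rough sketch of how the two suggestions above could be checked from a script. It only builds request URLs and sends nothing. The host, the core name `mycore`, the helper function names, and the value 1000000 are all made up for illustration. The parameters `hl`, `hl.fl`, `hl.maxAnalyzedChars` and the `/analysis/field` parameters are standard Solr ones, though whether the analysis handler is enabled on a given install is an assumption. The Solr reference guide lists the `hl.maxAnalyzedChars` default as 51200 characters, so it is worth checking the documentation for your version.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace host and core name with your own.
BASE_URL = "http://localhost:8983/solr/mycore"


def build_highlight_url(term, field="text", max_analyzed_chars=None):
    """Build a /select URL that asks Solr to highlight `term` in `field`."""
    params = {
        "q": f"{field}:{term}",
        "hl": "true",
        "hl.fl": field,
        "wt": "json",
    }
    # Only override the analyzed-character limit when asked to.
    if max_analyzed_chars is not None:
        params["hl.maxAnalyzedChars"] = str(max_analyzed_chars)
    return f"{BASE_URL}/select?{urlencode(params)}"


def build_analysis_url(term, field="text"):
    """Build a /analysis/field URL showing how `field` tokenizes `term`."""
    params = {
        "analysis.fieldname": field,
        "analysis.fieldvalue": term,
        "analysis.query": term,
        "wt": "json",
    }
    return f"{BASE_URL}/analysis/field?{urlencode(params)}"


highlight_url = build_highlight_url("1a", max_analyzed_chars=1000000)
analysis_url = build_analysis_url("1a")
print(highlight_url)
print(analysis_url)
```

If the highlight comes back populated with the larger limit, the term was simply beyond the analyzed window. If it is still blank, the analysis output should show whether "1a" survives as a single token or gets split at index time, which is what the tokenization advice is about.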