lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Highlighting, all matches show empty {}
Date Wed, 12 Aug 2015 15:27:19 GMT
Well, the example you just showed shouldn't show any highlighting. Your query is
q=concord
so it's trying to highlight "concord", which isn't in any of the fields shown
for your documents. hl.q can be used to highlight something other than your
q parameter.
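
To make the hl.q point concrete, here is a minimal sketch that builds a select URL where the highlighted text differs from the search text. The host, port, and core name come from the request quoted below; the helper name build_highlight_url and the hl.q value "eddy" are made up for illustration.

```python
from urllib.parse import urlencode

# Host, port, and core name are taken from the request quoted in this thread.
SOLR_SELECT = "http://localhost:8983/solr/mbepp/select"

def build_highlight_url(q, hl_q=None, hl_fl="*", base=SOLR_SELECT):
    """Build a select URL; hl.q, when given, replaces q as the text to highlight."""
    params = {"q": q, "wt": "json", "hl": "true", "hl.fl": hl_fl}
    if hl_q is not None:
        params["hl.q"] = hl_q
    return base + "?" + urlencode(params)

# Search for "concord" but ask the highlighter to mark up "eddy" (hypothetical term).
url = build_highlight_url("concord", hl_q="eddy")
print(url)
```

Leaving hl_q unset produces a request where the highlighter falls back to the q parameter, which is the situation described above.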

I did notice in some of your other examples that you seemed to be searching for
terms that were in the fields so I suspect this isn't really your root
problem though.

Do note that fields _must_ be stored for highlighting to work. Is it
possible that your
matches are on fields that aren't stored?
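
One way to check the stored question is to look at the field list for the core, for example from the Schema API (GET /solr/mbepp/schema/fields), assuming it is reachable on your install. The sketch below only inspects a list shaped like that response's "fields" entry. The sample data is invented and is not the poster's schema, and stored_status is a hypothetical helper name.

```python
def stored_status(schema_fields):
    """Map field name -> True/False, or None when the listing does not say."""
    return {f["name"]: f.get("stored") for f in schema_fields}

# Made-up sample shaped like a Schema API "fields" list; not the real schema.
sample_fields = [
    {"name": "title", "type": "text_general", "indexed": True, "stored": True},
    {"name": "text", "type": "text_general", "indexed": True, "stored": False},
    {"name": "author", "type": "string"},  # no explicit "stored" attribute
]

status = stored_status(sample_fields)

# Fields explicitly marked stored=false cannot produce highlight snippets.
not_highlightable = [name for name, stored in status.items() if stored is False]
print(not_highlightable)
```

A value of None here means the field definition does not set stored explicitly, so the effective value comes from the field type's default and needs to be checked there.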

Let's build it up slowly though: try searching on one term in one
field that you _know_
is stored and see if you get anything back. While the query with
hl.fl=* and fl=field1,field2
should be fine, let's start as simply as possible and work up, maybe?
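
A rough sketch of that "one term, one field" test follows, assuming the core URL from the quoted request. The function names, the choice of the title field, and the sample response are illustrative only and are not output from the poster's index.

```python
from urllib.parse import urlencode

def single_field_query(field, term,
                       base="http://localhost:8983/solr/mbepp/select"):
    """Search one term in one field and highlight only that field."""
    params = {
        "q": f"{field}:{term}",
        "fl": field,
        "hl": "true",
        "hl.fl": field,
        "wt": "json",
    }
    return base + "?" + urlencode(params)

def docs_with_snippets(response):
    """Return the ids whose highlighting entry is not an empty {}."""
    highlighting = response.get("highlighting", {})
    return [doc_id for doc_id, frags in highlighting.items() if frags]

url = single_field_query("title", "mary")

# Invented response fragment: one empty entry, one with a snippet.
sample_response = {
    "highlighting": {
        "doc-1": {},
        "doc-2": {"title": ["<em>Mary</em> Baker Eddy to Ira O. Knapp,"]},
    }
}
print(docs_with_snippets(sample_response))
```

If the list comes back empty even for a field known to be stored, the problem is somewhere other than the stored setting, and the next step is to widen hl.fl one field at a time.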

Best,
Erick

On Wed, Aug 12, 2015 at 7:59 AM, Scott Derrick <scott@tnstaafl.net> wrote:
> I think the highlighter is actually running, but I'm not getting the
> results??
>
> with this request
>
> http://localhost:8983/solr/mbepp/select?q=concord&fl=accession%2C+title%2C+author%2C+date&wt=json&indent=true&hl=true&hl.fl=*
>
>
> I get this response
>
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":3,
>     "params":{
>       "q":"concord",
>       "hl":"true",
>       "indent":"true",
>       "fl":"accession, title, author, date",
>       "hl.fl":"*",
>       "wt":"json"}},
>   "response":{"numFound":3,"start":0,"docs":[
>       {
>         "date":"1890-02-26",
>         "author":"Mary Baker Eddy",
>         "accession":"L13943",
>         "title":["Mary Baker Eddy to Joseph E. Adams,"]},
>       {
>         "date":"1896-01-13",
>         "author":"Mary Baker Eddy",
>         "accession":"L03453",
>         "title":["Mary Baker Eddy to Ira O. Knapp,"]},
>       {
>         "date":"1902-06-15",
>         "author":"Mary Baker Eddy",
>         "accession":"A10145",
>         "title":["Message of the Pastor Emeritus to The First Church of
> Christ, Scientist, Boston, Mass., June 15, 1902"]}]
>   },
>   "highlighting":{
>
> "/home/scott/workspace/mbel-work/tei2html/build/web/L13943/L13943.html":{},
>
> "/home/scott/workspace/mbel-work/tei2html/build/web/L03453/L03453.html":{},
>
> "/home/scott/workspace/mbel-work/tei2html/build/web/A10145/A10145.html":{}}}
>
> Before running the request, I set "Watch Changes" on the admin Plugins/Stats
> page. Highlighting showed 2 changes, the GapFragmenter and HtmlFormatter
>
> here are the reported changes
>
> org.apache.solr.highlight.GapFragmenter
>     class: org.apache.solr.highlight.GapFragmenter
>     version: 5.2.1
>     description: GapFragmenter
>     stats: requests: Was: 117, Now: 156, Delta: 39
>
> org.apache.solr.highlight.HtmlFormatter
>     class: org.apache.solr.highlight.HtmlFormatter
>     version:5.2.1
>     description:HtmlFormatter
>     stats: requests: Was: 117, Now: 156, Delta: 39
>
> Looks to me like there were 39 fragments or something processed, yet you can
> see above the highlights are empty {}???
>
> though all the other libraries in the highlighter showed no changes.
>
> which are these...
>
>     org.apache.solr.highlight.BreakIteratorBoundaryScanner
>     org.apache.solr.highlight.HtmlEncoder
>     org.apache.solr.highlight.RegexFragmenter
>     org.apache.solr.highlight.ScoreOrderFragmentsBuilder
>     org.apache.solr.highlight.SimpleBoundaryScanner
>     org.apache.solr.highlight.SimpleFragListBuilder
>     org.apache.solr.highlight.SingleFragListBuilder
>     org.apache.solr.highlight.WeightedFragListBuilder
>
>
> Scott
>
> -------- Original Message --------
> Subject: Highlighting, all matches show empty {}
> From: Scott Derrick <scott@tnstaafl.net>
> To: solr-user@lucene.apache.org
> Date: 08/12/2015 08:20 AM
>
>> Tried submitting a field for hl.fl, still empty {}
>>
>> here are the query terms
>>
>> "responseHeader": {
>>      "status": 0,
>>      "QTime": 8,
>>      "params": {
>>        "q": "mary or calvin",
>>        "hl": "true",
>>        "hl.simple.post": "</em>",
>>        "indent": "true",
>>        "fl": "accession, title, author, date",
>>        "hl.fl": "*",
>>        "wt": "json",
>>        "hl.simple.pre": "<em>",
>>        "_": "1439388969240"
>>      }
>>
>> here is one of the responses, there were 135
>>
>> {
>>          "date": "1886-07-06",
>>          "author": "Mary Baker Eddy",
>>          "accession": "L02634",
>>          "title": [
>>            "Mary Baker Eddy to Josephine C. Woodbury, July 6, 1886"
>>          ]
>> },
>>
>> here is the highlight section listing the first 10 matches, still empty {}
>>
>> "highlighting": {
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./L02634/L02634.html":
>> {},
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./A10720/A10720.html":
>> {},
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./L07894/L07894.html":
>> {},
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./L09828/L09828.html":
>> {},
>>
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./A10636D/A10636D.html":
>> {},
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./L13943/L13943.html":
>> {},
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./A10594/A10594.html":
>> {},
>>
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./A10385B/A10385B.html":
>> {},
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./A10879/A10879.html":
>> {},
>>
>> "/home/scott/workspace/mbel-work/tei2html/build/web/./L00003/L00003.html":
>> {}
>>    }
>>
>>
>> -------- Original Message --------
>> Subject: Re: Highlighting
>> From: Scott Derrick <scott@tnstaafl.net>
>> To: solr-user@lucene.apache.org
>> Date: 08/12/2015 06:39 AM
>>
>>> I was pretty sure I tried that, though I thought if you don't specify it
>>> just uses the search terms?
>>>
>>> If I just search for "calvin" and don't specify a field, what do I
>>> assign hl.fl?
>>>
>>> Scott
>>>
>>> On 8/11/2015 7:27 PM, Erik Hatcher wrote:
>>>>
>>>> Scott - doesn’t look like you’ve specified hl.fl, which sets the field(s)
>>>> to highlight.
>>>>
>>>> p.s. Erick Erickson surely likes your e-mail domain :)
>>>>
>>>>
>>>> —
>>>> Erik Hatcher, Senior Solutions Architect
>>>> http://www.lucidworks.com <http://www.lucidworks.com/>
>>>>
>>>>
>>>>
>>>>
>>>>> On Aug 11, 2015, at 9:02 PM, Scott Derrick <scott@tnstaafl.net> wrote:
>>>>>
>>>>> I guess I really don't get Highlighting in Solr.
>>>>>
>>>>> We are transitioning from Google Custom Search which generally sucks,
>>>>> but does return nicely formatted highlighted fragments.
>>>>>
>>>>> I turn highlighting on with hl=true in the query and I get a highlighting
>>>>> section returned at the bottom of the page, with each entry identified by
>>>>> the document file name followed by an empty {}.  It doesn't matter what I
>>>>> search for, plain text or a field, I get a list of documents each followed
>>>>> by an empty brace pair.
>>>>>
>>>>> "highlighting": {
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./A10385B/A10385B.html":
>>>>>
>>>>> {},
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./A10089/A10089.html":
>>>>>
>>>>> {},
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./L00003/L00003.html":
>>>>>
>>>>> {},
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./A10646/A10646.html":
>>>>>
>>>>> {},
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./V03482/V03482.html":
>>>>>
>>>>> {},
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./A10594/A10594.html":
>>>>>
>>>>> {},
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./645A.66.043/645A.66.043.html":
>>>>>
>>>>> {},
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./352.48.001/352.48.001.html":
>>>>>
>>>>> {},
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./144.23.001/144.23.001.html":
>>>>>
>>>>> {},
>>>>>
>>>>> "/home/scott/workspace/mbel-work/tei2html/build/web/./L18512/L18512.html":
>>>>>
>>>>> {}
>>>>>   }
>>>>>
>>>>> I haven't made any changes to the default settings
>>>>>
>>>>>    <highlighting>
>>>>>       <!-- Configure the standard fragmenter -->
>>>>>       <!-- This could most likely be commented out in the "default"
>>>>> case -->
>>>>>       <fragmenter name="gap"
>>>>>                   default="true"
>>>>>                   class="solr.highlight.GapFragmenter">
>>>>>         <lst name="defaults">
>>>>>           <int name="hl.fragsize">100</int>
>>>>>         </lst>
>>>>>       </fragmenter>
>>>>>
>>>>>       <!-- A regular-expression-based fragmenter
>>>>>            (for sentence extraction)
>>>>>         -->
>>>>>       <fragmenter name="regex"
>>>>>                   class="solr.highlight.RegexFragmenter">
>>>>>         <lst name="defaults">
>>>>>           <!-- slightly smaller fragsizes work better because of slop
>>>>> -->
>>>>>           <int name="hl.fragsize">70</int>
>>>>>           <!-- allow 50% slop on fragment sizes -->
>>>>>           <float name="hl.regex.slop">0.5</float>
>>>>>           <!-- a basic sentence pattern -->
>>>>>           <str name="hl.regex.pattern">[-\w
>>>>> ,/\n\&quot;&apos;]{20,200}</str>
>>>>>         </lst>
>>>>>       </fragmenter>
>>>>>
>>>>>       <!-- Configure the standard formatter -->
>>>>>       <formatter name="html"
>>>>>                  default="true"
>>>>>                  class="solr.highlight.HtmlFormatter">
>>>>>         <lst name="defaults">
>>>>>           <str name="hl.simple.pre"><![CDATA[<em>]]></str>
>>>>>           <str name="hl.simple.post"><![CDATA[</em>]]></str>
>>>>>         </lst>
>>>>>       </formatter>
>>>>>
>>>>>       <!-- Configure the standard encoder -->
>>>>>       <encoder name="html"
>>>>>                class="solr.highlight.HtmlEncoder" />
>>>>>
>>>>>       <!-- Configure the standard fragListBuilder -->
>>>>>       <fragListBuilder name="simple"
>>>>>                        class="solr.highlight.SimpleFragListBuilder"/>
>>>>>
>>>>>       <!-- Configure the single fragListBuilder -->
>>>>>       <fragListBuilder name="single"
>>>>>                        class="solr.highlight.SingleFragListBuilder"/>
>>>>>
>>>>>       <!-- Configure the weighted fragListBuilder -->
>>>>>       <fragListBuilder name="weighted"
>>>>>                        default="true"
>>>>>                        class="solr.highlight.WeightedFragListBuilder"/>
>>>>>
>>>>>       <!-- default tag FragmentsBuilder -->
>>>>>       <fragmentsBuilder name="default"
>>>>>                         default="true"
>>>>>
>>>>> class="solr.highlight.ScoreOrderFragmentsBuilder">
>>>>>         <!--
>>>>>         <lst name="defaults">
>>>>>           <str name="hl.multiValuedSeparatorChar">/</str>
>>>>>         </lst>
>>>>>         -->
>>>>>       </fragmentsBuilder>
>>>>>
>>>>>       <!-- multi-colored tag FragmentsBuilder -->
>>>>>       <fragmentsBuilder name="colored"
>>>>>
>>>>> class="solr.highlight.ScoreOrderFragmentsBuilder">
>>>>>         <lst name="defaults">
>>>>>           <str name="hl.tag.pre"><![CDATA[
>>>>>                <b style="background:yellow">,<b
>>>>> style="background:lawgreen">,
>>>>>                <b style="background:aquamarine">,<b
>>>>> style="background:magenta">,
>>>>>                <b style="background:palegreen">,<b
>>>>> style="background:coral">,
>>>>>                <b style="background:wheat">,<b
>>>>> style="background:khaki">,
>>>>>                <b style="background:lime">,<b
>>>>> style="background:deepskyblue">]]></str>
>>>>>           <str name="hl.tag.post"><![CDATA[</b>]]></str>
>>>>>         </lst>
>>>>>       </fragmentsBuilder>
>>>>>
>>>>>       <boundaryScanner name="default"
>>>>>                        default="true"
>>>>>                        class="solr.highlight.SimpleBoundaryScanner">
>>>>>         <lst name="defaults">
>>>>>           <str name="hl.bs.maxScan">10</str>
>>>>>           <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
>>>>>         </lst>
>>>>>       </boundaryScanner>
>>>>>
>>>>>       <boundaryScanner name="breakIterator"
>>>>>
>>>>> class="solr.highlight.BreakIteratorBoundaryScanner">
>>>>>         <lst name="defaults">
>>>>>           <!-- type should be one of CHARACTER, WORD(default), LINE
>>>>> and SENTENCE -->
>>>>>           <str name="hl.bs.type">WORD</str>
>>>>>           <!-- language and country are used when constructing Locale
>>>>> object.  -->
>>>>>           <!-- And the Locale object will be used when getting
>>>>> instance of BreakIterator -->
>>>>>           <str name="hl.bs.language">en</str>
>>>>>           <str name="hl.bs.country">US</str>
>>>>>         </lst>
>>>>>       </boundaryScanner>
>>>>>     </highlighting>
>>>>
>>>>
>>>
>>>
>>
>
> --
> One man's "magic" is another man's engineering. "Supernatural" is a null
> word.
> Robert A. Heinlein
>
