lucene-dev mailing list archives

From "Shyam Bhaskaran (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-3110) Search result comes up with truncated words at the start of highlighted fragment
Date Wed, 08 Feb 2012 07:34:59 GMT

     [ https://issues.apache.org/jira/browse/SOLR-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shyam Bhaskaran updated SOLR-3110:
----------------------------------

    Description: 
Words are truncated at the start of the displayed highlighter fragment.
The following boundary scanner setting was introduced in the solrconfig.xml file:

<str name="hl.bs.chars">.,!? '&#9;''&#10;''&#13;'</str> 

If I change the setting to <str name="hl.bs.chars">.,!?</str>

the issue goes away, but another issue comes up: the highlighted
search fragment does not start from the beginning of the sentence.

Below is the complete list of settings we are using for the boundary scanner.

   <boundaryScanner name="simple" class="solr.highlight.SimpleBoundaryScanner" default="true">
     <lst name="defaults">
       <str name="hl.bs.maxScan">200</str>
       <str name="hl.bs.chars">.,!? '&#9;''&#10;''&#13;'</str>
     </lst>
   </boundaryScanner>

   <boundaryScanner name="breakIterator" class="solr.highlight.BreakIteratorBoundaryScanner">
     <lst name="defaults">
       <str name="hl.bs.type">SENTENCE</str>
       <str name="hl.bs.language">en</str>
       <str name="hl.bs.country">US</str>
     </lst>
   </boundaryScanner>
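
Editorial note: the two settings above, hl.bs.maxScan and hl.bs.chars, control how far and for which characters the simple boundary scanner looks when it adjusts a fragment's start. The Java sketch below is a simplified model of such a backward scan, written only to illustrate how the two values interact. The class name BoundaryScanSketch, its method names, and the sample text are hypothetical, and the scan rule is an assumption about the general approach, not a copy of Solr's implementation. It does not attempt to reproduce or diagnose the reported bug.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical, simplified model of a backward boundary scan (not Solr's actual class). */
final class BoundaryScanSketch {

    /** Treats every character of the configured string as a boundary character. */
    static Set<Character> toBoundarySet(String chars) {
        Set<Character> set = new HashSet<>();
        for (char c : chars.toCharArray()) {
            set.add(c);
        }
        return set;
    }

    /**
     * Walks backward from start, inspecting at most maxScan characters.
     * Returns the offset just after the nearest boundary character,
     * 0 if the scan reaches the beginning of the text, and start
     * unchanged if no boundary is found within maxScan.
     */
    static int findStartOffset(CharSequence text, int start, int maxScan,
                               Set<Character> boundary) {
        if (start < 1 || start > text.length()) {
            return start; // nothing sensible to scan
        }
        int offset = start;
        for (int count = maxScan; offset > 0 && count > 0; count--) {
            if (boundary.contains(text.charAt(offset - 1))) {
                return offset;
            }
            offset--;
        }
        return offset == 0 ? 0 : start;
    }

    public static void main(String[] args) {
        String text = "First sentence here. The quick brown fox jumps.";
        int start = text.indexOf("brown") + 2; // assumed raw offset inside "brown"

        Set<Character> withSpace = toBoundarySet(".,!? \t\n\r");
        Set<Character> punctuationOnly = toBoundarySet(".,!?");

        // Whitespace counts as a boundary: the scan stops at the nearest word start.
        System.out.println(text.substring(findStartOffset(text, start, 200, withSpace)));
        // Punctuation only: the scan runs back to just after the previous period.
        System.out.println(text.substring(findStartOffset(text, start, 200, punctuationOnly)));
        // Scan budget too small to reach any boundary: the raw offset is kept.
        System.out.println(text.substring(findStartOffset(text, start, 5, punctuationOnly)));
    }
}
```

In this model, a boundary set that includes whitespace snaps the start to the nearest word, a punctuation-only set snaps it to the previous sentence end, and a scan that runs out of budget leaves the raw offset untouched, which is one way a fragment can begin mid-word. Note also that under the model's one-character-per-boundary assumption, the apostrophes in the updated hl.bs.chars value are literal characters and would themselves count as boundaries.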



  was:
Words are truncated at the start of the displayed highlighter fragment.
The following boundary scanner setting was introduced in the solrconfig.xml file:

<str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str> 

If I change the setting to <str name="hl.bs.chars">.,!?</str>

the issue goes away, but another issue comes up: the highlighted
search fragment does not start from the beginning of the sentence.

Below is the complete list of settings we are using for the boundary scanner.

   <boundaryScanner name="simple" class="solr.highlight.SimpleBoundaryScanner" default="true">
     <lst name="defaults">
       <str name="hl.bs.maxScan">200</str>
       <str name="hl.bs.chars">.,!? &#9;&#10;&#13;</str>
     </lst>
   </boundaryScanner>

   <boundaryScanner name="breakIterator" class="solr.highlight.BreakIteratorBoundaryScanner">
     <lst name="defaults">
       <str name="hl.bs.type">SENTENCE</str>
       <str name="hl.bs.language">en</str>
       <str name="hl.bs.country">US</str>
     </lst>
   </boundaryScanner>



    
> Search result comes up with truncated words at the start of highlighted fragment
> --------------------------------------------------------------------------------
>
>                 Key: SOLR-3110
>                 URL: https://issues.apache.org/jira/browse/SOLR-3110
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter
>    Affects Versions: 4.0
>         Environment: java Tomcat Solaris
>            Reporter: Shyam Bhaskaran
>              Labels: FastVectorHighlighter, boundaryScanner, highlighting, solr


