lucene-solr-user mailing list archives

From Zheng Lin Edwin Yeo <edwinye...@gmail.com>
Subject Re: Highlighting content field problem when using JiebaTokenizerFactory
Date Tue, 20 Oct 2015 03:11:39 GMT
Hi Scott,

Thank you for your reply.

I've tried setting those, and also tried changing to the FastVector
Highlighter, but it isn't working either. I got the same highlighting
results as before.
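
For reference, this is roughly how I set them, a minimal sketch of the
requestHandler defaults (the handler and field names are from my own setup
and may differ from yours):

```xml
<!-- Hypothetical /select handler showing the hl.bs.* settings I tried.
     Handler name and hl.fl value are from my own configuration. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="hl">true</str>
    <str name="hl.fl">content</str>
    <str name="hl.useFastVectorHighlighter">true</str>
    <str name="hl.bs.type">WORD</str>
    <str name="hl.bs.language">zh</str>
    <str name="hl.bs.country">CN</str>
  </lst>
</requestHandler>
```

The content field has termVectors, termPositions and termOffsets enabled,
as the FastVector Highlighter requires.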

Regards,
Edwin


On 19 October 2015 at 23:56, Scott Stults <sstults@opensourceconnections.com
> wrote:

> Edwin,
>
> Try setting hl.bs.language and hl.bs.country in your request or
> requestHandler:
>
>
> https://cwiki.apache.org/confluence/display/solr/FastVector+Highlighter#FastVectorHighlighter-UsingBoundaryScannerswiththeFastVectorHighlighter
>
>
> -Scott
>
> On Tue, Oct 13, 2015 at 5:04 AM, Zheng Lin Edwin Yeo <edwinyeozl@gmail.com
> >
> wrote:
>
> > Hi,
> >
> > I'm trying to use the JiebaTokenizerFactory to index Chinese characters
> > in Solr. It works fine with the segmentation when I'm using the
> > Analysis function on the Solr Admin UI.
> >
> > However, when I tried to do the highlighting in Solr, it is not
> > highlighting in the correct place. For example, when I search for
> > 自然环境与企业本身, it highlights
> > 认<em>为自然环</em><em>境</em><em>与企</em><em>业本</em>身的
> >
> > Even when I search for an English word like responsibility, it
> > highlights <em> responsibilit</em>y.
> >
> > Basically, the highlighting goes off by 1 character/space consistently.
> >
> > This problem only happens in the content field, and not in any other
> > fields. Does anyone know what could be causing the issue?
> >
> > I'm using jieba-analysis-1.0.0, Solr 5.3.0 and Lucene 5.3.0.
> >
> >
> > Regards,
> > Edwin
> >
>
>
>
> --
> Scott Stults | Founder & Solutions Architect | OpenSource Connections, LLC
> | 434.409.2780
> http://www.opensourceconnections.com
>
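
P.S. For anyone hitting the same thing: the symptom looks exactly like token
start/end offsets that are shifted by one character. A minimal Python sketch
(purely illustrative, not Solr code) of how a one-character offset shift
produces the misplaced tags:

```python
def highlight(text, offsets):
    """Wrap each (start, end) character span of text in <em> tags.

    Spans are applied right-to-left so earlier insertions do not
    shift the positions of later spans.
    """
    for start, end in sorted(offsets, reverse=True):
        text = text[:start] + "<em>" + text[start:end] + "</em>" + text[end:]
    return text

text = "my responsibility here"
correct = (3, 17)   # actual offsets of "responsibility" in the text

# With correct offsets the whole word is wrapped:
print(highlight(text, [correct]))
# my <em>responsibility</em> here

# With offsets shifted left by one, the tags land a character early,
# exactly like the broken output in the content field:
shifted = (correct[0] - 1, correct[1] - 1)
print(highlight(text, [shifted]))
# my<em> responsibilit</em>y here
```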
