lucene-dev mailing list archives

From "Cassandra Targett (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (SOLR-8334) Highlighting content field problem when using JiebaTokenizerFactory
Date Thu, 29 Sep 2016 20:25:21 GMT

     [ https://issues.apache.org/jira/browse/SOLR-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cassandra Targett closed SOLR-8334.
-----------------------------------
    Resolution: Invalid

The JiebaTokenizerFactory is not maintained by the Lucene/Solr community. Instead, it appears
to come from this project: https://github.com/sing1ee/jieba-solr. It may be more helpful to
ask in that project.

> Highlighting content field problem when using JiebaTokenizerFactory
> -------------------------------------------------------------------
>
>                 Key: SOLR-8334
>                 URL: https://issues.apache.org/jira/browse/SOLR-8334
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter, search
>    Affects Versions: 5.3
>         Environment: Windows 8.1, Solr 5.3, ZooKeeper 3.4.6, jieba-analysis-1.0.0
>            Reporter: Yeo Zheng Lin
>              Labels: patch
>         Attachments: JiebaSegmenter.java
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> When I use the JiebaTokenizerFactory to index Chinese text in Solr, segmentation works
> fine in the Analysis function of the Solr Admin UI.
> However, when I use highlighting in Solr, the highlight does not land in the correct
> place. For example, when I search for 自然环境与企业本身, it highlights 认<em>为自然环</em><em>境</em><em>与企</em><em>业本</em>身的
> Even when I search for an English word such as responsibility, it highlights <em> responsibilit<em>y.
> The highlighting is consistently off by one character/space.
> This problem only happens in the content field, not in any other field.
> I've made a minor modification to the code in JiebaSegmenter.java, and the highlighting
> seems to be fine now.
> I created another int called offset2 in the process() method:
> int offset2 = 0;
> After that, I changed offset to offset2 in this part of the code in the process() method.
> The changes are in the attachment below.
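
The off-by-one symptom described above can be reproduced outside Solr. The following is a
minimal sketch, not the JiebaSegmenter code and not the attached patch (which is not
reproduced in this message). It assumes only that a highlighter wraps the span between a
token's start and end offsets in <em> tags. The class name, the highlight() helper and
the sample text are hypothetical and chosen for illustration.

```java
public class OffsetHighlightDemo {

    // Hypothetical helper: wraps text[start, end) in <em> tags, which is
    // roughly how a highlighter consumes a token's start/end offsets.
    static String highlight(String text, int start, int end) {
        return text.substring(0, start)
                + "<em>" + text.substring(start, end) + "</em>"
                + text.substring(end);
    }

    public static void main(String[] args) {
        // Sample text is an assumption, not taken from the reporter's index.
        String text = "social responsibility";
        String term = "responsibility";

        int start = text.indexOf(term);   // correct start offset of the token
        int end = start + term.length();  // correct end offset (exclusive)

        // Correct offsets: the tags enclose exactly the matched word.
        System.out.println(highlight(text, start, end));

        // Offsets shifted left by one: the tags swallow the preceding space
        // and drop the last letter, like the behaviour reported in the issue.
        System.out.println(highlight(text, start - 1, end - 1));
    }
}
```

With correct offsets the tags enclose the whole word. With both offsets shifted by one, the
opening tag moves in front of the space and the final "y" falls outside the closing tag.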



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

