lucene-solr-user mailing list archives

From Jae Joo <jaejo...@gmail.com>
Subject PatternReplaceCharFilterFactory and Position
Date Tue, 14 Jul 2015 20:38:59 GMT
I am having an issue with the "start" and "end" positions of a token.
Here is the CharFilterFactory definition.

<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="&lt;/?
*ce(bold|sup|inf|hsp|vsp|italic)[^>]*>" replacement="X"/>


Then the input data is

<ce:sup loc=\"post\">1</ce:sup>

In the Analysis page, the token is reported as:

text  raw_bytes  start  end  positionLength  type  position
1     [31]       21     31   1               word  1

Should the "end" position be 22 rather than 31? It breaks highlighting.
HTMLStripCharFilterFactory works properly on the same input.
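
To show where the expected value of 22 comes from, here is a minimal Python sketch that computes the offsets of the text outside the tags in the original input. It runs outside Solr and does not reproduce how Lucene corrects offsets; the simplified tag pattern is an assumption made for illustration and is not the pattern from the charFilter above.

```python
import re

# Input string exactly as posted, including the literal backslashes.
text = r'<ce:sup loc=\"post\">1</ce:sup>'

# Hypothetical simplified tag pattern (an assumption, not the Solr config).
tag = re.compile(r'</?ce:[^>]*>')

# Collect every run of text outside the tags, with its start and end
# offsets measured against the original, unfiltered input.
spans = []
pos = 0
for m in tag.finditer(text):
    if m.start() > pos:
        spans.append((text[pos:m.start()], pos, m.start()))
    pos = m.end()
if pos < len(text):
    spans.append((text[pos:], pos, len(text)))

print(spans)      # [('1', 21, 22)]
print(len(text))  # 31
```

In the original input the token "1" spans offsets 21 to 22, and 31 is the length of the whole input, i.e. the end of the closing tag.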

Any help?


Jae
