lucene-java-user mailing list archives

From Weiwei Wang <ww.wang...@gmail.com>
Subject Re: FastVectorHighlighter StringIndexOutofBounds bug
Date Mon, 23 May 2011 05:36:42 GMT
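1. Source string: 777777777
2. Analysis chain: WhitespaceTokenizer + EGramTokenFilter
3. Highlighter: FastVectorHighlighter
4. Debug info: subInfos=(777((8,11))777((5,8))777((2,5)))/3.0(2,102)

srcIndex is not computed correctly on the second iteration of the outer
for-loop. The sub-infos are listed in descending offset order, so after the
first iteration srcIndex (endOffset - s) is already larger than the next
term's startOffset - s, and src.substring( srcIndex, to.startOffset - s )
receives a begin index greater than its end index.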
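
To make the failure concrete, here is a minimal standalone sketch that reproduces it outside Lucene and shows one possible workaround (visiting the offsets in ascending order). It is not the Lucene source: the class FragmentSketch, the simplified Toffs stand-in, the hard-coded <b> tags, and the values s = 2 and a 9-character fragment source are all assumptions made for illustration, based on the debug line above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Standalone sketch; nothing here is Lucene's own code. */
final class FragmentSketch {

  /** Hypothetical stand-in for a term offset pair (absolute offsets). */
  static final class Toffs {
    final int startOffset;
    final int endOffset;

    Toffs(int startOffset, int endOffset) {
      this.startOffset = startOffset;
      this.endOffset = endOffset;
    }
  }

  /** Same loop shape as the quoted makeFragment, flattened to one offset list. */
  static String naive(List<Toffs> offsets, String src, int s) {
    StringBuilder fragment = new StringBuilder();
    int srcIndex = 0;
    for (Toffs to : offsets) {
      // Throws when srcIndex > to.startOffset - s, i.e. when offsets go backwards.
      fragment.append(src.substring(srcIndex, to.startOffset - s))
          .append("<b>")
          .append(src.substring(to.startOffset - s, to.endOffset - s))
          .append("</b>");
      srcIndex = to.endOffset - s;
    }
    fragment.append(src.substring(srcIndex));
    return fragment.toString();
  }

  /** Possible workaround: copy the offsets and visit them in ascending start order. */
  static String sorted(List<Toffs> offsets, String src, int s) {
    List<Toffs> ordered = new ArrayList<>(offsets);
    ordered.sort(Comparator.comparingInt((Toffs t) -> t.startOffset));
    return naive(ordered, src, s);
  }
}
```

With the offsets (8,11), (5,8), (2,5), naive fails on the second offset, while sorted produces three highlighted terms. Sorting alone does not handle overlapping offsets, so this is only a sketch of the ordering problem, not a complete fix.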

2011/5/23 Weiwei Wang <ww.wang.cs@gmail.com>

> The following code throws a StringIndexOutOfBoundsException when multiple
> matched terms need to be highlighted:
>
> private String makeFragment( WeightedFragInfo fragInfo, String src, int s,
>     String[] preTags, String[] postTags, Encoder encoder ){
>   StringBuilder fragment = new StringBuilder();
>   int srcIndex = 0;
>   for( SubInfo subInfo : fragInfo.subInfos ){
>     for( Toffs to : subInfo.termsOffsets ){
>       fragment
>         .append( encoder.encodeText( src.substring( srcIndex, to.startOffset - s ) ) )
>         .append( getPreTag( preTags, subInfo.seqnum ) )
>         .append( encoder.encodeText( src.substring( to.startOffset - s, to.endOffset - s ) ) )
>         .append( getPostTag( postTags, subInfo.seqnum ) );
>       srcIndex = to.endOffset - s;
>     }
>   }
>   fragment.append( encoder.encodeText( src.substring( srcIndex ) ) );
>   return fragment.toString();
> }
> --
> 王巍巍
> Cell: 18911288489
> MSN: ww.wang.cs@gmail.com
> Blog: http://whisper.eyesay.org
> Weibo: http://t.sina.com/lolorosa