lucene-java-user mailing list archives

From nbuso <nb...@ebi.ac.uk>
Subject Re: Highlighting and InvalidTokenOffsetsException in Lucene 4.0
Date Wed, 28 Nov 2012 10:44:18 GMT
Scott Smith <ssmith <at> mainstreamdata.com> writes:

> 
> I'm migrating code from Lucene 3.5 to 4.0. I have the following code, which
> is supposed to highlight text. I get the exception
> InvalidTokenOffsetsException, and I have no idea what that means. I am using
> a custom analyzer which seems to work for searching/indexing, so I assume it
> will work here (even though it took a couple of "minor" changes to get it to
> compile in 4.0). This code used to work in 3.5.
> 
> Anyone have any ideas?
> 
> Scott
> 
> Code fragment:
> 
>         try
>         {
>             ctf = new CachingTokenFilter(myCustomAnalyzer
>                     .tokenStream(MyFieldName, new StringReader(myText)));
>         }
>         catch (IOException e1)
>         {
>             s_oLog.error("Search:markCommon: Exception creating CachingTokenFilter: " +
>                     e1.getMessage());
>             return null;
>         }
>         String markedString;
>         SimpleHTMLFormatter formatter;
>         try
>         {
>             formatter = new SimpleHTMLFormatter(_zBeginHighlight,
>                     _zEndHighlight);
>             Scorer score = new QueryScorer(q);
>             ht = new Highlighter(formatter, score);
>             ht.setTextFragmenter(new NullFragmenter());
>             markedString = ht.getBestFragment(ctf, myText);
>         }
>         catch (IOException e)
>         {
>             s_oLog.error("Search:markCommon: Unable to highlight string: "
>                     + e.getMessage());
>             return null;
>         }
>         catch(InvalidTokenOffsetsException e2)
>         {
>             s_oLog.error("Search:markCommon: Unable to highlight string2: "
>                     + e2.getMessage());
>             return null;
>         }
> 
> 

Hi Scott,

Did you resolve this? I'm new to Lucene, so I don't know if this will be of real help.

The CachingTokenFilter.reset() method does not call your custom TokenStream's
reset() method. Did you try:

----
TokenStream ts = myCustomAnalyzer
  .tokenStream(MyFieldName, new StringReader(myText));
ts.reset();
ctf = new CachingTokenFilter(ts);
----
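
In case it helps, here is a minimal self-contained sketch of how that reset()
call could be folded back into your fragment. The class name, method name and
the "<b>"/"</b>" markers are placeholders I made up, not your actual
_zBeginHighlight/_zEndHighlight values:

----
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CachingTokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.InvalidTokenOffsetsException;
import org.apache.lucene.search.highlight.NullFragmenter;
import org.apache.lucene.search.highlight.QueryScorer;
import org.apache.lucene.search.highlight.SimpleHTMLFormatter;

public class HighlightSketch {

    // Sketch only: analyzer, fieldName, text and query stand in for your
    // myCustomAnalyzer, MyFieldName, myText and q.
    static String highlight(Analyzer analyzer, String fieldName,
                            String text, Query query)
            throws IOException, InvalidTokenOffsetsException {
        TokenStream ts = analyzer.tokenStream(fieldName, new StringReader(text));
        ts.reset(); // reset the custom stream here; CachingTokenFilter.reset() will not do it
        CachingTokenFilter ctf = new CachingTokenFilter(ts);

        SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("<b>", "</b>");
        Highlighter highlighter = new Highlighter(formatter, new QueryScorer(query));
        highlighter.setTextFragmenter(new NullFragmenter());
        return highlighter.getBestFragment(ctf, text);
    }
}
----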

Probably there is a better way to use CachingTokenFilter.
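
For example, if you don't need the cached stream for anything else, another
option (assuming I'm reading the 4.0 Highlighter API correctly) is to drop
CachingTokenFilter and use the convenience overload that takes the analyzer
directly, replacing the last few lines of the sketch above:

----
// Alternative to the CachingTokenFilter lines in the sketch above:
// Highlighter builds and consumes the TokenStream internally.
return highlighter.getBestFragment(analyzer, fieldName, text);
----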


N.





