lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Libbrecht <p...@activemath.org>
Subject Re: trying to use the highlighter
Date Mon, 06 Sep 2010 06:21:06 GMT
ping!
Any hope for help here?
I'm a bit stuck before deploying a release.

thanks in advance

paul


On 3 sept. 2010, at 14:05, Paul Libbrecht wrote:

> 
> Hello list,
> 
> I'm strugging again with the highlighter. I don't understand why I obtain sporadically
InvalidTokenOffsetsException.
> 
> The mission: given a query, detect which field was matched, among the names of the concepts:
there can be several names for a given concept, also in one language. Concepts are documents
and names are in fields name-xx where xx is the two-letter-language.
> 
> Here's the method I'm using:
> 
>    public String computeMatchedField(int docNum, Document doc, Analyzer analyzer, Query
query) throws IOException {
>        //System.out.println("----- computing matched field for query " + query + " on
document " + doc.get("uri"));
>        query = query.rewrite(this.reader);
>        String found = null;
>        float maxScore = 0;
>        try {
>            for(Field f: (List<Field>) doc.getFields()) {
>                QueryScorer scorer = new QueryScorer(query,reader,f.name());
>                if(!f.name().startsWith("name-")) continue;
>                //System.out.println("Measuring field " + f.name() + ": " + f.stringValue());
>                String text = f.stringValue();
>                TokenStream tokenStream = TokenSources.getAnyTokenStream(reader,docNum,
f.name(), doc, analyzer);
>                SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter();
>                Highlighter highlighter = new Highlighter(htmlFormatter, scorer);
>                TextFragment[] frags = highlighter.getBestTextFragments(tokenStream, text,
false, 1);
>                if(frags==null || frags.length==0) continue;
>                float score = frags[0].getScore();
>                //System.out.println("Score: " + score);
>                if(score > maxScore) {
>                    maxScore = score;
>                    found = frags[0].toString();
>                }
>            }
>        } catch(Exception ex) {ex.printStackTrace();}
>        return found;
>    }
> 
> Unfortunately, I have to catch InvalidTokenOffsetsException which does happen sometimes,
not always.
> When it occurs, it stops the highlighting (the detected field is "null") and also costs
quite some time.
> 
> What am I doing wrong?
> I tried making my own tokenStream with no difference.
> 
> thanks in advance
> 
> paul
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message