lucene-java-user mailing list archives

From "Stephen Greene" <SGre...@metalseconomics.com>
Subject RE: Term offsets for highlighting
Date Thu, 22 Apr 2010 14:52:21 GMT
Hi Koji,

Thank you again for your continued assistance.
The code below is what I used in Lucene 2.4 to highlight terms
(it did not highlight them correctly).
Following up on your previous email: is there a way to access a TermVector
containing only the matched terms, or is my previous approach still the
correct way to proceed?

Best,

Steve


    public HitTermTagger(Scorer pScorer) {
        _mScorer = pScorer;
    }

    public ArrayList<KeyValuePair<Integer,Integer>> tagText(TokenStream ptsTokenStream)
    {
        StringBuilder sbTaggedText = new StringBuilder();

        final Token reusableToken = new Token();
  
        int startOffset = -1;
        int endOffset = -1;
        float score;
        ArrayList<KeyValuePair<Integer,Integer>> results =
                new ArrayList<KeyValuePair<Integer,Integer>>();

        TokenGroup tokenGroup = new TokenGroup();

        //initialize scorer
        _mScorer.startFragment(null);

        try {
            for(Token nextToken = ptsTokenStream.next(reusableToken);
                (nextToken != null);
                nextToken = ptsTokenStream.next(reusableToken))
            {
                //if((tokenGroup.getNumTokens() > 0)&&(tokenGroup.isDistinct(nextToken))) {
                if(nextToken.startOffset() > endOffset) {
                    score = _mScorer.getTokenScore(nextToken);
                    if(score > 0.0) {
                        startOffset = nextToken.startOffset();
                        endOffset = nextToken.endOffset();
                        results.add(new KeyValuePair<Integer,Integer>(startOffset,endOffset));
                    }
                    //tokenGroup.clear();
                }
                //tokenGroup.addToken(nextToken,_mScorer.getTokenScore(nextToken));
            }
        }
-----Original Message-----
From: Koji Sekiguchi [mailto:koji@r.email.ne.jp] 
Sent: Monday, April 19, 2010 9:02 PM
To: java-user@lucene.apache.org
Subject: Re: Term offsets for highlighting

Stephen Greene wrote:
> Hi Koji,
>
> An additional question. Is it possible to access the FieldTermStack from
> the FastVectorHighlighter after it has been populated with matching
> terms from the field?
>
> I think this would provide an ideal solution for this problem, as
> ultimately I am only concerned with returning positional offsets to have
> highlighting tags applied to them in a separate process.
>
> Thank you for your insight,
>
> Steve
>   
Hi Steve,

You cannot access FieldTermStack from FVH, but I think you
can create one yourself. To see how, please refer to
FieldTermStackTest.java. Instantiating FieldTermStack requires a
FieldQuery object, which can be obtained from FVH.

But I don't understand why you need FieldTermStack. Wouldn't just using
Lucene's TermVector with offsets (and positions, if necessary)
solve your problem?
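
To illustrate the term-vector idea, here is a minimal sketch in plain Java.
It assumes the per-term offsets have already been read from a term vector
stored with offsets; the Map used here is a hypothetical stand-in for that
data, and the class and method names (MatchedOffsets, collect) are invented
for illustration and are not part of Lucene. It keeps only the offsets of
terms that appear in the query and returns them in document order:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MatchedOffsets {

    /**
     * Collects {startOffset, endOffset} pairs for query terms only.
     *
     * termOffsets is a hypothetical stand-in for a term vector stored with
     * offsets: term text -> array of {startOffset, endOffset} pairs.
     */
    public static List<int[]> collect(Map<String, int[][]> termOffsets,
                                      Set<String> queryTerms) {
        List<int[]> matched = new ArrayList<int[]>();
        for (String term : queryTerms) {
            int[][] offsets = termOffsets.get(term);
            if (offsets == null) {
                continue; // this query term does not occur in the field
            }
            for (int[] pair : offsets) {
                matched.add(new int[] { pair[0], pair[1] });
            }
        }
        // Sort by start offset (then end offset) so that tags can be
        // applied in document order by a separate process.
        Collections.sort(matched, new Comparator<int[]>() {
            public int compare(int[] a, int[] b) {
                if (a[0] != b[0]) {
                    return Integer.compare(a[0], b[0]);
                }
                return Integer.compare(a[1], b[1]);
            }
        });
        return matched;
    }
}
```

Note that this matches individual terms only. It does not check that the
terms of a phrase query occur next to each other, which is what positions
would be needed for.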

Koji

-- 
http://www.rondhuit.com/en/


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

