lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aviran" <>
Subject RE: Faster highlighting with TermPositionVectors
Date Wed, 03 Nov 2004 21:10:47 GMT
Did anyone tried this class ?

I tried this class but I can't make it to work I indexed a field as new
Field("description", description,true,true,true,true); but when I call 
TokenSources.getTokenStream(_indexReader,i,"description"); I get

In this class the line TermPositionVector tpv=(TermPositionVector)
reader.getTermFreqVector(docId,field); is trying to cast SegmentTermVector
to TermPositionVector. 
Is there anything I'm doing wrong. Should I have indexed the field some
other way to store TermPositionVector ?

BTW: I'm using the latest lucene source from CVS.


-----Original Message-----
From: Bruce Ritchie [] 
Sent: Friday, October 29, 2004 1:15 AM
To: Lucene Users List
Subject: RE: Faster highlighting with TermPositionVectors


> Thanks to the recent changes (see CVS) in TermFreqVector
> support we can now make use of term offset information held 
> in the Lucene index rather than incurring the cost of 
> re-analyzing text to highlight it.
> I have created a  class ( see
> ) which 
> handles creating a TokenStream from the TermPositionVector 
> stored in the database which can then be passed to the highlighter.
> This approach is significantly faster than re-parsing the 
> original text.
> If people are happy with this class I'll add it to the 
> Highlighter sandbox but it may sit better elsewhere in the 
> Lucene code base as a more general purpose utility.
> BTW as part of putting this together I found that the
> TermFreq code throws a null pointer when indexing fields that 
> produce no tokens (ie empty or all stopwords). Otherwise 
> things work very well.

This is great news! While I won't have the time to test this until probably
mid November I do look forward to the speed improvements as the current
highlighting mechanisms (reparsing the text) was just not performant enough
under heavy loads.


Bruce Ritchie

To unsubscribe, e-mail:
For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message