lucene-java-user mailing list archives

From Till Kolter <>
Subject Getting left and right offsets of term search results
Date Fri, 09 Oct 2009 16:11:33 GMT
I am quite new to Lucene, but I have searched the FAQ and consulted
the mailing list archive. I have also stepped through the source code
in a debugger.

I have written an Analyzer that sends a stream through a whole
pipeline of linguistic processing and uses the internal
representation to construct a TokenStream that tokenizes chunks
(semantic units). The TermAttribute string holds the abstract
representation of each unit. For further use (for instance,
highlighting the results in the text), I need access to the
OffsetAttribute that I defined in my TokenStream implementation. As
in StandardTokenizer, I use an OffsetAttribute to store the left and
right offsets of the original chunks.

Now I want to search for all documents containing an
"AdjectivePhrase" (AP), retrieve those APs from the documents, and
highlight all APs in the documents found.
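
A minimal sketch of the intended result, in plain Java and without any
Lucene calls. It assumes each chunk is available as an abstract term
plus its left and right character offsets, which is exactly the
information the question is trying to retrieve from the index. The
class ChunkHighlightSketch, the Chunk type, and the bracket markers are
hypothetical names chosen for illustration only; they are not part of
Lucene or of the Analyzer described above.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkHighlightSketch {

    /** Hypothetical stand-in for one token: abstract term plus character offsets. */
    static final class Chunk {
        final String term; // abstract representation, e.g. "AdjectivePhrase"
        final int start;   // left offset (inclusive) in the original text
        final int end;     // right offset (exclusive) in the original text

        Chunk(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Wraps every chunk whose term equals {@code wanted} in square brackets.
     * Assumes the chunks are sorted by start offset and that the matching
     * chunks do not overlap.
     */
    static String highlight(String text, List<Chunk> chunks, String wanted) {
        StringBuilder out = new StringBuilder();
        int cursor = 0; // end of the text already copied to the output
        for (Chunk c : chunks) {
            if (!c.term.equals(wanted)) {
                continue; // not the kind of chunk we are looking for
            }
            out.append(text, cursor, c.start); // unhighlighted text before the chunk
            out.append('[').append(text, c.start, c.end).append(']');
            cursor = c.end;
        }
        out.append(text, cursor, text.length()); // remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "The very old house was quite small.";
        List<Chunk> chunks = new ArrayList<Chunk>();
        chunks.add(new Chunk("AdjectivePhrase", 4, 12));  // "very old"
        chunks.add(new Chunk("Noun", 13, 18));            // "house"
        chunks.add(new Chunk("Verb", 19, 22));            // "was"
        chunks.add(new Chunk("AdjectivePhrase", 23, 34)); // "quite small"
        System.out.println(highlight(text, chunks, "AdjectivePhrase"));
    }
}
```

The sketch shows why both offsets are needed: with only the left offset
of a chunk, there is no way to know where the highlighted span ends.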

I tried to find results by getting TermPositions with
"Reader.termPositions(term)" and then iterating over the positions,
but the positions only represent the left offset.

Is there another function to get structured results from term queries
over documents, where I can get the whole set of attributes that I
constructed in the TokenStream with addAttribute(Class)? I did not
find such a function, but I guess I don't know all of Lucene's
retrieval methods yet. For my search I used the IndexSearcher.

Till Kolter
