lucene-java-user mailing list archives

From Sean O'Connor <s...@oconeco.com>
Subject Example of Field.TermVector.WITH_POSITIONS_OFFSETS usage?
Date Tue, 23 Aug 2005 21:42:24 GMT
Hello,
    I am trying to work through term positions and how to get them from 
a collection of hits. Does indexing a field with 
Field.TermVector.WITH_POSITIONS_OFFSETS save the start/end character 
offsets of each term in the source text? (I _think_ it does).

    If so, where would I start in trying to make that information 
accessible in a "result set"? I believe it would mean extending a query, 
a scorer, a hit, and/or a weight object. I want to process ALL hits, so 
I think I will need to implement a HitCollector.

    As an example of what I want, if I were looking for the offset 
position of "brown" in a properly indexed field containing "the lazy 
brown fox", I would like to get:
start==9
end==14 (counting characters from zero, with the end one past the last 
character, and assuming my counting is right)
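
    To pin down the counting, here is a minimal plain-Java sketch that 
does not use Lucene at all. It assumes simple whitespace tokenization 
(a real analyzer may split text differently), and the class and method 
names (TermOffsets, offsetsOf) are made up for illustration. It reports 
a zero-based start and an end one past the last character, which is the 
convention Lucene's Token offsets follow. Under that convention "brown" 
spans 9 to 14; counting from one instead would give 10 to 15.

```java
import java.util.ArrayList;
import java.util.List;

public class TermOffsets {

    /**
     * Returns a {start, end} pair for every whitespace-delimited token in
     * text that equals term. start is zero-based; end is exclusive
     * (one past the last character of the token).
     */
    public static List<int[]> offsetsOf(String text, String term) {
        List<int[]> result = new ArrayList<>();
        int i = 0;
        int n = text.length();
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            int start = i;
            // Advance to the end of the current token.
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i > start && text.substring(start, i).equals(term)) {
                result.add(new int[] {start, i});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] span : offsetsOf("the lazy brown fox", "brown")) {
            // prints "start==9 end==14"
            System.out.println("start==" + span[0] + " end==" + span[1]);
        }
    }
}
```

    With an exclusive end, text.substring(start, end) returns the term 
itself, which is convenient for highlighting.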

    Based on Paul Elschot's previous response to a similar question I 
had (which I am still working on), I _think_ I need to extend something 
like the ExactPhraseScorer. While debugging with my IDE (Eclipse) I can 
see that the weight object in the scorer contains a reference to the 
query. The query contains the fields:
    Vector positions (just has ints of term positions in phrase?)
    Vector terms (vector of Term, just field name and field contents?)

    The weight also seems to have an array of TermPositions, which have 
SegmentTermPositions. I thought this was what I wanted, but I don't see 
the proper start/end fields, or anything which seems to be on the right 
track.

    Can anyone point me in the right direction?
Thanks,

Sean


