lucene-java-user mailing list archives

From "John Paul Sondag" <jsond...@uiuc.edu>
Subject Retrieve nearest token based off location in original Text
Date Tue, 03 Jul 2007 17:49:48 GMT
Hi,

I was wondering if it's possible to get a token's offset (its index in the
token stream) based on a character position in the original text.

My problem is that I'm working on my own "Snippet Generator". I give it a
token index (call it t) as input and need to make a snippet of the original
text. I want the snippet to be some number of tokens long (call it n tokens).
To make the snippet easier to read, I want to check whether t is close to the
end of a paragraph (if it is, I'll put more of the snippet before the token
than usual). So I'm scanning the original text forward some number of
characters, looking for a newline or tab. If I find one, I'd like to get the
token just before it (and its offset, call it y). Once I have that offset, I
know I have y - t tokens after my token, so I can put n-(y-t) tokens before
my token and successfully make my snippet.
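
To make the steps above concrete, here is a rough sketch in plain Java. It
is only an illustration under assumptions: it supposes the character start
offset of every token was recorded, in strictly increasing order, while the
text was analyzed. SnippetWindow, tokenAtOrBefore, window and maxScan are
hypothetical names, not Lucene API. The sketch also counts token t itself as
one of the n tokens, so it ends up placing n - 1 - (y - t) tokens before t,
and it assumes a centred window when no paragraph break is nearby.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene. Assumes startOffsets holds the
// character start offset of each token, strictly increasing, and n >= 1.
final class SnippetWindow {

    // Index of the last token that starts at or before charPos, or -1 if none.
    static int tokenAtOrBefore(int[] startOffsets, int charPos) {
        int i = Arrays.binarySearch(startOffsets, charPos);
        // On a miss, binarySearch returns -(insertionPoint) - 1.
        return i >= 0 ? i : -i - 2;
    }

    // Returns {first, last} token indices (inclusive) of an n-token snippet
    // that contains token t and does not run past a nearby newline or tab.
    static int[] window(String text, int[] startOffsets, int t, int n, int maxScan) {
        int lastToken = startOffsets.length - 1;
        int from = startOffsets[t];
        int to = Math.min(text.length(), from + maxScan);

        // Assumed default when no break is found: centre the window on t.
        int after = (n - 1) / 2;

        // Scan forward from token t for the first newline or tab.
        for (int pos = from; pos < to; pos++) {
            char c = text.charAt(pos);
            if (c == '\n' || c == '\t') {
                int y = tokenAtOrBefore(startOffsets, pos); // token before the break
                after = Math.min(after, y - t);             // y - t tokens follow t
                break;
            }
        }

        int end = Math.min(lastToken, t + after);
        int start = Math.max(0, end - (n - 1)); // the remaining tokens go before t
        return new int[] {start, end};
    }
}
```

The binary search is what maps a character position back to a token index.
The window is clamped at both ends of the token list, so a snippet near the
start of the text can come out shorter than n tokens.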

Thanks in advance!

--JP
