lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "John Paul Sondag" <>
Subject Re: Standard Tokenizer Question
Date Wed, 27 Jun 2007 02:04:20 GMT
That solves getting the actual text but Token.  My other problem is that
Token also has "startOffset" and "endOffset" fields.    Standard Token has
"startColumn/Line" and "endColumn/Line" but I was not exactly sure how to
use these.  Could you possibly give me a small example of using these?  I
think I would be able to if I a line is always the same length
(startLine-1(line_length) + startColumn)



On 6/26/07, <> wrote:
> Token.termText() perhaps is the same as st.getToken(y).image
> Andy
> -----Original Message-----
> From: [] On Behalf Of John
> Paul Sondag
> Sent: Wednesday, June 27, 2007 9:32 AM
> To:
> Subject: Standard Tokenizer Question
> Hey,
> I Think this is where I ask this.
> I'm pretty new to this so this is probably a dumb question.
> I'm using the StandardTokenizer class to turn a file into tokens.  I
> then
> need to be able to later skip to a specific token in the file sent to me
> from another source.  So say my StandardTokenizer is called st and I'm
> told
> to get token y.  At first I was using st.getToken(y).  But this was
> returning an object of type  "StandardToken", I really would like to
> just
> have it return type Token (because it has more useful functions for me
> like
> termText()).  Right now I just call  y times to get the y'th
> token, which is horribly inefficient but I can't find any way to
> a) go right to the token I want
> b)  Have it be type Token and not Standard Token.
> --JP
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message