lucene-java-user mailing list archives

From Liu_An...@emc.com
Subject RE: Standard Tokenizer Question
Date Wed, 27 Jun 2007 02:17:58 GMT
In fact, it should be:
StandardToken.image = Token.termText,
StandardToken.beginColumn = Token.startOffset,
StandardToken.endColumn = Token.endOffset.

For reference, see StandardTokenizer.java, around line 73.

Andy
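
The field mapping above can be sketched in plain Java. This is only an illustration: ParserToken and SimpleToken are hypothetical stand-ins that mirror the field names mentioned in this thread, not the actual Lucene classes, and TokenMappingSketch is an invented name. Collecting the converted tokens in a list is one possible way to reach the y'th token by index instead of stepping through the stream each time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the parser-level token (NOT the Lucene class).
final class ParserToken {
    final String image;
    final int beginColumn;
    final int endColumn;

    ParserToken(String image, int beginColumn, int endColumn) {
        this.image = image;
        this.beginColumn = beginColumn;
        this.endColumn = endColumn;
    }
}

// Hypothetical stand-in for the analysis-level token (NOT the Lucene class).
final class SimpleToken {
    final String termText;
    final int startOffset;
    final int endOffset;

    SimpleToken(String termText, int startOffset, int endOffset) {
        this.termText = termText;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

final class TokenMappingSketch {
    // Applies the mapping from the message above:
    // image -> termText, beginColumn -> startOffset, endColumn -> endOffset.
    static SimpleToken convert(ParserToken t) {
        return new SimpleToken(t.image, t.beginColumn, t.endColumn);
    }

    // Converts every token once so that later lookups are a plain list access.
    static List<SimpleToken> convertAll(List<ParserToken> parsed) {
        List<SimpleToken> out = new ArrayList<>(parsed.size());
        for (ParserToken t : parsed) {
            out.add(convert(t));
        }
        return out;
    }

    public static void main(String[] args) {
        List<ParserToken> parsed = new ArrayList<>();
        parsed.add(new ParserToken("hello", 0, 5));
        parsed.add(new ParserToken("world", 6, 11));

        List<SimpleToken> tokens = convertAll(parsed);
        SimpleToken second = tokens.get(1); // direct access by index
        System.out.println(second.termText + " " + second.startOffset + " " + second.endOffset);
    }
}
```

The trade-off is memory: the whole token list is kept around, which may or may not be acceptable for large files.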


-----Original Message-----
From: jpsondag@gmail.com [mailto:jpsondag@gmail.com] On Behalf Of John
Paul Sondag
Sent: Wednesday, June 27, 2007 10:04 AM
To: java-user@lucene.apache.org
Subject: Re: Standard Tokenizer Question

That solves getting the actual text.  My other problem is that Token
also has "startOffset" and "endOffset" fields.  StandardToken has
"startColumn/Line" and "endColumn/Line", but I was not exactly sure how
to use these.  Could you possibly give me a small example of using
these?  I think I would be able to if a line is always the same length:
(startLine - 1) * line_length + startColumn

Thanks!

--JP
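
When lines are not all the same length, a general way to turn a line/column pair into an absolute character offset is to record where each line starts. The sketch below is plain Java with no Lucene dependency. LineOffsetIndex is a hypothetical helper name, and it assumes 1-based line and column numbers and '\n' line endings; neither assumption comes from this thread. Per the reply at the top of the thread, the column fields of StandardToken are already used as the offsets, so this would only matter for positions that really are line/column based.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: maps (line, column) positions to absolute offsets.
final class LineOffsetIndex {
    // lineStarts.get(i) is the offset of the first character of line i + 1.
    private final List<Integer> lineStarts = new ArrayList<>();

    LineOffsetIndex(String text) {
        lineStarts.add(0); // the first line always starts at offset 0
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                lineStarts.add(i + 1); // the next line starts after the newline
            }
        }
    }

    // Assumes line and column are both 1-based.
    int toOffset(int line, int column) {
        return lineStarts.get(line - 1) + (column - 1);
    }

    public static void main(String[] args) {
        String text = "ab\ncdef\ng";
        LineOffsetIndex index = new LineOffsetIndex(text);
        int offset = index.toOffset(2, 3);
        System.out.println(offset + " -> " + text.charAt(offset));
    }
}
```

With fixed-length lines this reduces to the (startLine - 1) * line_length + startColumn idea from the message above, adjusted for whichever base the columns use and for the newline character at the end of each line.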

On 6/26/07, Liu_Andy2@emc.com <Liu_Andy2@emc.com> wrote:
>
> Token.termText() is perhaps the same as st.getToken(y).image
>
> Andy
> -----Original Message-----
> From: jpsondag@gmail.com [mailto:jpsondag@gmail.com] On Behalf Of John
> Paul Sondag
> Sent: Wednesday, June 27, 2007 9:32 AM
> To: java-user@lucene.apache.org
> Subject: Standard Tokenizer Question
>
> Hey,
>
> I think this is where I ask this.
>
> I'm pretty new to this so this is probably a dumb question.
>
> I'm using the StandardTokenizer class to turn a file into tokens.  I
> then need to be able to later skip to a specific token in the file
> sent to me from another source.  So say my StandardTokenizer is called
> st and I'm told to get token y.  At first I was using st.getToken(y),
> but this was returning an object of type "StandardToken".  I really
> would like to just have it return type Token (because it has more
> useful functions for me, like termText()).  Right now I just call
> st.next() y times to get the y'th token, which is horribly
> inefficient, but I can't find any way to
>
> a) go right to the token I want
> b) have it be type Token and not StandardToken.
>
> --JP
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


