lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "John Paul Sondag" <jsond...@uiuc.edu>
Subject Re: Standard Tokenizer Question
Date Wed, 27 Jun 2007 02:31:31 GMT
So out of curiosity exactly what is startLine and endLine?

--JP

On 6/26/07, Liu_Andy2@emc.com <Liu_Andy2@emc.com> wrote:
>
> In fact,it should be:
> Standardtoken.image=Token.termText,
> Standardtoken.beginColumn= Token.startOffset,
> Standardtoken.endColumn =Token.endOffset,
>
> You can reference StandardTokenizer.java, about line 73.
>
> Andy
>
>
> -----Original Message-----
> From: jpsondag@gmail.com [mailto:jpsondag@gmail.com] On Behalf Of John
> Paul Sondag
> Sent: Wednesday, June 27, 2007 10:04 AM
> To: java-user@lucene.apache.org
> Subject: Re: Standard Tokenizer Question
>
> That solves getting the actual text but Token.  My other problem is that
> Token also has "startOffset" and "endOffset" fields.    Standard Token
> has
> "startColumn/Line" and "endColumn/Line" but I was not exactly sure how
> to
> use these.  Could you possibly give me a small example of using these?
> I
> think I would be able to if I a line is always the same length
> (startLine-1(line_length) + startColumn)
>
> Thanks!
>
> --JP
>
> On 6/26/07, Liu_Andy2@emc.com <Liu_Andy2@emc.com> wrote:
> >
> > Token.termText() perhaps is the same as st.getToken(y).image
> >
> > Andy
> > -----Original Message-----
> > From: jpsondag@gmail.com [mailto:jpsondag@gmail.com] On Behalf Of John
> > Paul Sondag
> > Sent: Wednesday, June 27, 2007 9:32 AM
> > To: java-user@lucene.apache.org
> > Subject: Standard Tokenizer Question
> >
> > Hey,
> >
> > I Think this is where I ask this.
> >
> > I'm pretty new to this so this is probably a dumb question.
> >
> > I'm using the StandardTokenizer class to turn a file into tokens.  I
> > then
> > need to be able to later skip to a specific token in the file sent to
> me
> > from another source.  So say my StandardTokenizer is called st and I'm
> > told
> > to get token y.  At first I was using st.getToken(y).  But this was
> > returning an object of type  "StandardToken", I really would like to
> > just
> > have it return type Token (because it has more useful functions for me
> > like
> > termText()).  Right now I just call st.next()  y times to get the y'th
> > token, which is horribly inefficient but I can't find any way to
> >
> > a) go right to the token I want
> > b)  Have it be type Token and not Standard Token.
> >
> > --JP
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message