lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephane Vaucher <vauch...@cirano.qc.ca>
Subject Re: Type information on Tokens?
Date Wed, 30 Apr 2003 23:20:02 GMT
I've already posted a message concerning meta info and am wondering if 
there might be some interest that there be (in the future) a standard way 
of holding (or encoding in the token) meta-info. I've got to store weights 
given to specific tokens (and perhaps even change the scoring for this. I 
won't go into details here because of my previous post ).

sv


On Tue, 29 Apr 2003, Doug Cutting wrote:

> Armbrust, Daniel C. wrote:
> > So far, I can only think of two ways to accomplish this, 1, is to 
> build it into my tokens, i.e. my tokens would look something like 
> "<noun>patient".  I'm afraid there may be some pit-falls with this 
> approach that I haven't identified yet, however, since I haven't tried 
> it out.
> 
> This should actually work fine, so long as you use the same analyzer on 
> your queries.  Another option would be to put each part of speech in a 
> different Lucene field.  I think the token-prefix option would be 
> preferable, since you probably don't need separate boost and 
> normalization factors for each part of speech.
> 
> Doug
> 
> 
> 
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message