lucene-java-user mailing list archives

From Pierrick Brihaye
Subject Re: derive tokens from single token
Date Mon, 29 Sep 2003 14:14:42 GMT

MOYSE Gilles (Cetelem) wrote:

> Isn't this one safer?

>     // New token?
>     if (receivedToken == null) return null;
>     if (receivedText.length() == 0) {
>         receivedToken =;
>         receivedText.append(receivedToken.termText());
>         positionIncrement = 1;
>     }

I don't think so. The aim of this method is to "substream" the main 
stream :-) i.e. to output several tokens when just one is received (see 
the thread's subject).

In other words, we must not consume a new token until the current token 
is itself entirely consumed, i.e. until receivedText.length() == 0.

Once the current token is consumed, we must return null immediately if 
the next token we receive is null (i.e. EOS). That is why this check 
sits *inside* the successful test for current-token consumption.

I must admit that a string buffer may not be the best way to do this. 
I must also admit that I have to be *very* confident in the 
getNextTruncation() method :-)

Well, my code snippet was meant to demonstrate:

1) how a "substream" can be handled (remember: I tried to extend 
TokenFilter, but all I got was either "oobar" or "obar", depending on 
when 'return' is called)

2) how these tokens will be emitted at the same position, thus 
permitting efficient queries.
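
To make the two points above concrete, here is a minimal, self-contained 
sketch. It is not the original code and does not use the Lucene classes: 
Tok, Substream and SubstreamDemo are hypothetical stand-ins, the main 
stream is modelled as a plain Iterator<String> (so hasNext() plays the 
role of the null/EOS check), and the truncation rule (drop the first 
character) is only an assumption standing in for getNextTruncation().

```java
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical stand-in for a Lucene Token: just text plus a position increment. */
final class Tok {
    final String text;
    final int positionIncrement;

    Tok(String text, int positionIncrement) {
        this.text = text;
        this.positionIncrement = positionIncrement;
    }
}

/** Emits several tokens for each input token, all at the input token's position. */
final class Substream {
    private final Iterator<String> input;   // stand-in for the main stream
    private final StringBuilder receivedText = new StringBuilder();
    private int positionIncrement;

    Substream(Iterator<String> input) {
        this.input = input;
    }

    /** Returns the next derived token, or null at end of stream. */
    Tok next() {
        // Only pull from the main stream once the current token is used up.
        // A loop rather than a single test, so empty input tokens are skipped.
        while (receivedText.length() == 0) {
            if (!input.hasNext()) return null;  // EOS, checked inside the test
            receivedText.append(input.next());
            positionIncrement = 1;              // first derived token advances the position
        }
        Tok out = new Tok(receivedText.toString(), positionIncrement);
        receivedText.deleteCharAt(0);           // assumed truncation rule: drop the first character
        positionIncrement = 0;                  // remaining derived tokens share that position
        return out;
    }
}

final class SubstreamDemo {
    public static void main(String[] args) {
        Substream s = new Substream(Arrays.asList("foobar", "baz").iterator());
        for (Tok t = s.next(); t != null; t = s.next()) {
            System.out.println(t.text + " (+" + t.positionIncrement + ")");
        }
    }
}
```

In this sketch "foobar" yields foobar, oobar, obar, bar, ar and r. Only 
the first of them carries an increment of 1; the others carry 0, so they 
all sit at the position of the original token.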


Pierrick Brihaye, IT specialist
Service régional de l'Inventaire
DRAC Bretagne
