lucene-java-user mailing list archives

From Pierrick Brihaye <pierrick.brih...@culture.gouv.fr>
Subject Re: derive tokens from single token
Date Mon, 29 Sep 2003 14:14:42 GMT
Hi,

MOYSE Gilles (Cetelem) wrote:

> Isn't this version safer?

>     // New token?
>     if (receivedToken == null) return null;
>     if (receivedText.length() == 0) {
>         receivedToken = input.next();
>         receivedText.append(receivedToken.termText());
>         positionIncrement = 1;
>     }

I don't think so. The aim of this method is to "substream" the main
stream :-) i.e. to output several tokens when just one is received (see
the thread's subject).

In other words, we must not consume a new token from the input until the
current token has itself been entirely consumed, i.e. until
receivedText.length() == 0.

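Once the current token has been consumed, we fetch the next one and must
immediately return null if it is null (i.e. end of stream). That's why
the null test belongs *inside* the successful test for current token
consumption. Placed before the fetch, as in your version, it cannot catch
a null returned by input.next(), which termText() would then dereference.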
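
To make the ordering concrete, here is a minimal, self-contained sketch
of the consume-then-check logic described above. It is not the filter
from this thread: SketchToken and SubstreamSketch are hypothetical
stand-ins for Lucene's Token and TokenStream, and "drop the leading
character" is only an assumed truncation rule standing in for
getNextTruncation().

```java
import java.util.Iterator;

/** Hypothetical stand-in for a Lucene token: a term plus a position increment. */
final class SketchToken {
    final String text;
    final int positionIncrement;

    SketchToken(String text, int positionIncrement) {
        this.text = text;
        this.positionIncrement = positionIncrement;
    }
}

/** Derives several tokens from each incoming term, one per call to next(). */
final class SubstreamSketch {
    private final Iterator<String> input;          // stands in for the upstream TokenStream
    private final StringBuilder receivedText = new StringBuilder();
    private int positionIncrement;

    SubstreamSketch(Iterator<String> input) {
        this.input = input;
    }

    /** Returns the next derived token, or null at end of stream. */
    SketchToken next() {
        // Pull from upstream only once the current term is entirely consumed.
        // The loop also skips empty upstream terms.
        while (receivedText.length() == 0) {
            // End-of-stream test sits *inside* the consumption test.
            if (!input.hasNext()) return null;
            receivedText.append(input.next());
            positionIncrement = 1;                 // first derived token advances the position
        }
        SketchToken out = new SketchToken(receivedText.toString(), positionIncrement);
        receivedText.deleteCharAt(0);              // assumed truncation rule: drop leading char
        positionIncrement = 0;                     // later derived tokens stay at the same position
        return out;
    }
}
```

Fed the terms "foo" and "ab", this sketch returns "foo", "oo", "o",
"ab", "b" and then null; only "foo" and "ab" carry an increment of 1.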

I must admit that using a string buffer may not be the best approach.
I must also admit that I have to be *very* confident in the
getNextTruncation() method :-)

Well, my code snippet was meant to demonstrate:

1) how a "substream" can be handled (remember: I tried to extend
TokenFilter, but all I got was either "oobar" or "obar", depending on
when 'return' is called)

2) how these tokens will be emitted at the same position, thus permitting
efficient queries.
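
For point 2, a small sketch of how position increments translate into
positions may help. PositionSketch is a hypothetical helper, not a Lucene
class, and it assumes the usual convention that a token's position is the
running sum of the increments seen so far (starting so that a first
increment of 1 gives position 0).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: groups terms by the position their increments produce. */
final class PositionSketch {

    /** Maps each absolute position to the terms that land on it, in emission order. */
    static Map<Integer, List<String>> positions(String[] terms, int[] increments) {
        Map<Integer, List<String>> byPosition = new LinkedHashMap<>();
        int position = -1; // assumed start, so a first increment of 1 yields position 0
        for (int i = 0; i < terms.length; i++) {
            position += increments[i]; // an increment of 0 keeps the previous position
            byPosition.computeIfAbsent(position, p -> new ArrayList<>()).add(terms[i]);
        }
        return byPosition;
    }
}
```

With the terms "foobar", "oobar", "obar", "baz" and the increments
1, 0, 0, 1, the three derived forms all land on position 0 and "baz" on
position 1, so a phrase or proximity query sees "obar" exactly where
"foobar" stood.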

Cheers,

-- 
Pierrick Brihaye, informaticien
Service régional de l'Inventaire
DRAC Bretagne
mailto:pierrick.brihaye@culture.fr

