lucene-java-user mailing list archives

From Pierrick Brihaye <pierrick.brih...@culture.gouv.fr>
Subject Re: multiple tokens from a single input token
Date Mon, 10 Nov 2003 15:14:48 GMT
Hi,

MOYSE Gilles (Cetelem) wrote:

> I experienced the same problem, and I used the following solution (maybe not
> the best one, but it works, and it is not too slow).
> The problem was to detect synonyms. I used a synonyms file, made up of
> lines like these:
> 	a b c
> 	d e f

Mmmh... 1 for 1. The question was deliberately about 1-to-N tokenization.
Anyway...

> I used a FIFO stack to solve that.

Yes: the "token stack" does the trick. My code was actually a token
stack too, but... less beautiful (and more generic) than the code provided
just a bit later :-)

> When the filter receives a token, it checks whether the stack is empty or
> not. If it is, then it returns the received token. If it is not empty, then
> it returns the popped value from the stack (i.e. the first one that was
> pushed; it's better to use a FIFO stack to keep the correct order).
> When you receive the 'null' token, indicating the end of the stream, you
> continue returning the popped values from your stack until it is empty. Then
> you return 'null'.

That's it.

Please note that the stack must be declared outside of the next()
method, i.e. it is an instance variable of the filter, so that its
contents survive from one call to next() to the following one. Maybe
Peter Keegan missed this point?
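
To make the idea concrete, here is a minimal sketch in plain Java. It
deliberately does not use the Lucene classes: TokenSource, ListSource and
SynonymExpandingFilter are hypothetical names made up for this illustration,
tokens are plain strings, and I assume that the first word of a synonyms line
maps to the remaining words. It is also a slight variant of the description
above: the buffer is drained before the next input token is read, so the end
of stream needs no special handling.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a token stream: next() returns null at end of stream. */
interface TokenSource {
    String next();
}

/** Hypothetical helper that serves tokens from a list, then null. */
class ListSource implements TokenSource {
    private final Iterator<String> it;

    ListSource(List<String> tokens) {
        this.it = tokens.iterator();
    }

    @Override
    public String next() {
        return it.hasNext() ? it.next() : null;
    }
}

/** Emits each input token followed by its synonyms (1-to-N expansion). */
class SynonymExpandingFilter implements TokenSource {
    private final TokenSource input;
    private final Map<String, List<String>> synonyms;

    // Must be a field, not a local variable of next():
    // it has to keep its contents between successive calls.
    private final Deque<String> pending = new ArrayDeque<>();

    SynonymExpandingFilter(TokenSource input, Map<String, List<String>> synonyms) {
        this.input = input;
        this.synonyms = synonyms;
    }

    @Override
    public String next() {
        // Drain buffered tokens first, oldest first (FIFO order).
        if (!pending.isEmpty()) {
            return pending.removeFirst();
        }
        String token = input.next();
        if (token == null) {
            return null; // end of stream; the buffer is already empty here
        }
        // Queue the extra tokens so the following calls return them in order.
        List<String> extra = synonyms.get(token);
        if (extra != null) {
            pending.addAll(extra);
        }
        return token;
    }
}
```

If "pending" were declared inside next(), it would be recreated empty on
every call and the queued synonyms would never be returned.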

Cheers,

-- 
Pierrick Brihaye, informaticien
Service régional de l'Inventaire
DRAC Bretagne
mailto:pierrick.brihaye@culture.gouv.fr

