lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: StandardTokenizer and e-mail
Date Fri, 21 May 2004 23:43:28 GMT
Further on this...

If you are using StandardTokenizer, the token for an e-mail address has 
the type value "<EMAIL>". A custom TokenFilter implementation can check 
for that type, split just those tokens however you like, and pass 
everything else through unchanged.  Take a look at StandardFilter's 
source code for an example of keying off the types emitted by 
StandardTokenizer.
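
A rough sketch of that idea, combined with the setPositionIncrement(0) 
suggestion quoted below, might look like the following. It deliberately 
does not use the Lucene classes: Tok is a hypothetical stand-in for 
Lucene's Token, expand() stands in for the logic a real TokenFilter's 
next() would perform one token at a time, and the "<EMAIL_PART>" type 
and the split on '@' and '.' are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

class EmailSplitSketch {
    // Hypothetical stand-in for a Lucene Token: text, type, position increment.
    static final class Tok {
        final String text;
        final String type;
        final int positionIncrement;

        Tok(String text, String type, int positionIncrement) {
            this.text = text;
            this.type = type;
            this.positionIncrement = positionIncrement;
        }
    }

    // Type string that StandardTokenizer assigns to e-mail tokens (per the thread).
    static final String EMAIL_TYPE = "<EMAIL>";
    // Assumed type label for the extracted pieces; not a Lucene constant.
    static final String PART_TYPE = "<EMAIL_PART>";

    // Passes every token through; after each e-mail token, also emits its
    // pieces with a position increment of 0 so they share its position.
    static List<Tok> expand(List<Tok> input) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : input) {
            out.add(t);
            if (!EMAIL_TYPE.equals(t.type)) {
                continue; // not an e-mail address: nothing more to emit
            }
            for (String part : t.text.split("[@.]")) {
                if (!part.isEmpty()) {
                    out.add(new Tok(part, PART_TYPE, 0));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> in = new ArrayList<>();
        in.add(new Tok("mail", "<ALPHANUM>", 1));
        in.add(new Tok("xyz@company.com", EMAIL_TYPE, 1));
        for (Tok t : expand(in)) {
            System.out.println(t.positionIncrement + " " + t.text);
        }
    }
}
```

In this sketch the full address keeps its own position, and 'xyz', 
'company' and 'com' are stacked on top of it, so a search for any of 
them should match the same spot in the document. A real filter would 
need to buffer the extra pieces and return them from successive next() 
calls.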

	Erik


On May 21, 2004, at 11:50 AM, Otis Gospodnetic wrote:

> Yes, yes.
> Write your own TokenFilter sub-class that overrides next() and extracts
> those other elements/tokens from an email address token and uses
> Token's setPositionIncrement(0) to store the extracted tokens in the
> same position as the original email.
>
> Otis
>
> --- Albert Vila <avp@imente.com> wrote:
>> Hi all,
>>
>> I want to achieve the following: when indexing 'xyz@company.com',
>> I want to index the 'xyz@company.com' token, then the 'xyz' token,
>> the 'company' token, and the 'com' token.
>> This way, you'll be able to find the document by searching for
>> 'xyz@company.com', for 'xyz' only, or for 'company' only.
>>
>> How can I achieve that? Do I need to write my own tokenizer?
>>
>> Thanks
>> Albert
>>
>> -- 
>> Albert Vila
>> Director de proyectos I+D
>> http://www.imente.com
>> 902 933 242
>> [iMente “La información con más beneficios”]
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
