lucene-java-user mailing list archives

From Karl Wettin <karl.wet...@gmail.com>
Subject Re: Changing the Punctuation definition for StandardAnalyzer
Date Thu, 20 Dec 2007 17:58:06 GMT

On 20 Dec 2007, at 18:43, tareque@controldocs.com wrote:

> I am using StandardAnalyzer for my indexes. I don't want whole email
> addresses to be searchable as single terms, and want '@' to be
> treated as punctuation too. My users would rather search by user id
> and/or host name and get back all matching email addresses than
> search by the whole address. And that way, they can still create a
> query that returns email addresses anyway.
>
> How do I make StandardAnalyzer treat '@' as punctuation?

A quick and dirty solution is to introduce a TokenFilter that splits  
any token at @ and add it to the end of the chain of streams in  
StandardAnalyzer#tokenStream.
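
To illustrate the splitting step only, here is a minimal sketch in plain Java. It deliberately does not use the Lucene API: AtSplitFilter and splitAll are hypothetical names, and the Iterator<String> stands in for the upstream token stream. A real TokenFilter would do the same buffering inside its token-producing method and would probably also need to adjust offsets and position increments for the new tokens.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical stand-in for a TokenFilter (not Lucene API): wraps an
 * upstream token source and splits every token at '@'.
 */
class AtSplitFilter implements Iterator<String> {
    private final Iterator<String> input;                      // upstream tokens
    private final Deque<String> pending = new ArrayDeque<>();  // parts not yet emitted

    AtSplitFilter(Iterator<String> input) {
        this.input = input;
    }

    @Override
    public boolean hasNext() {
        fill();
        return !pending.isEmpty();
    }

    @Override
    public String next() {
        fill();
        if (pending.isEmpty()) {
            throw new NoSuchElementException();
        }
        return pending.removeFirst();
    }

    /** Pulls upstream tokens until at least one non-empty part is buffered. */
    private void fill() {
        while (pending.isEmpty() && input.hasNext()) {
            for (String part : input.next().split("@")) {
                if (!part.isEmpty()) {   // skip empties from a leading '@'
                    pending.addLast(part);
                }
            }
        }
    }

    /** Convenience helper: runs the filter over a list and collects the result. */
    static List<String> splitAll(List<String> tokens) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = new AtSplitFilter(tokens.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

With this sketch, the tokens "mail", "karl@example.org", "now" come out as "mail", "karl", "example.org", "now", so both the user id and the host name become separate terms.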

It would probably be much more efficient to modify the lexer
grammar that StandardTokenizer is generated from.

-- 
karl
