lucene-dev mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: StandardAnalyzer & e-mail addresses
Date Wed, 08 Oct 2003 19:53:30 GMT
Christoph,

Thanks for that, but unfortunately it didn't change things.  I guess 
it's time to push JavaCC learning into my to-learn queue :)

	Erik
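
A rough way to compare the two groupings quoted below without rebuilding
the tokenizer is to mirror them as java.util.regex patterns. This is only
a sketch under stated assumptions, not the StandardTokenizer grammar
itself: ALPHANUM is approximated as ASCII letters and digits, the class
and field names are made up for illustration, and the STAR variant (a "*"
instead of "+" on the local part) is a hypothetical alternative that does
not appear in this thread. As in JavaCC, sequence binds tighter than "|"
in these patterns.

```java
import java.util.regex.Pattern;

// Illustrative regex analogue of the quoted <EMAIL> rules (not JavaCC itself).
final class EmailRuleSketch {
    // Assumption: ALPHANUM approximated as one or more ASCII letters/digits.
    static final String A = "[A-Za-z0-9]+";

    // Grouping of the originally quoted rule: only "_" is followed by ALPHANUM,
    // and only "-" is followed by ALPHANUM in the domain part.
    static final Pattern ORIGINAL =
        Pattern.compile(A + "(\\.|-|_" + A + ")+@" + A + "(\\.|-" + A + ")+");

    // Grouping suggested in the reply: any separator is followed by ALPHANUM.
    static final Pattern PARENTHESIZED =
        Pattern.compile(A + "((\\.|-|_)" + A + ")+@" + A + "((\\.|-)" + A + ")+");

    // Hypothetical variant (assumption): separators in the local part are optional.
    static final Pattern STAR =
        Pattern.compile(A + "((\\.|-|_)" + A + ")*@" + A + "((\\.|-)" + A + ")+");

    // True only if the whole input matches the pattern.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = { "xyz@example.com", "x.yz@example.com" };
        for (String s : inputs) {
            System.out.println(s
                + " original=" + matches(ORIGINAL, s)
                + " parenthesized=" + matches(PARENTHESIZED, s)
                + " star=" + matches(STAR, s));
        }
    }
}
```

Under these assumptions, both quoted groupings reject "xyz@example.com",
because each requires at least one separator before the "@". That would be
consistent with the regrouping not changing the result, though it does not
show which other rule produces the "xyz@example" token.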


On Wednesday, October 8, 2003, at 01:31  PM, Christoph Goller wrote:

> I am not a JavaCC expert either. Maybe it's a precedence problem.
> Could you try
>
> | <EMAIL: <ALPHANUM> (("."|"-"|"_") <ALPHANUM>)+ "@" <ALPHANUM>
>           (("."|"-") <ALPHANUM>)+ >
>
>
> Christoph
>
>
> Erik Hatcher wrote:
>> I'm not JavaCC-savvy enough (yet), but it seems there is a flaw in 
>> the StandardTokenizer and its determination of e-mail addresses.
>> If I analyze "xyz@example.com", it splits into two tokens: 
>> "xyz@example" and "com".  Shouldn't this rule:
>>   // email addresses
>> | <EMAIL: <ALPHANUM> ("."|"-"|"_" <ALPHANUM>)+ "@" <ALPHANUM>
>>           ("."|"-" <ALPHANUM>)+ >
>> Be clever enough to keep the .com with it?  Perhaps some other 
>> parsing is taking precedence?
>> Thanks,
>>     Erik
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>
> -- 
> *****************************************************************
> * Dr. Christoph Goller       Tel.:   +49 89 203 45734           *
> * Detego Software GmbH       Mobile: +49 179 1128469            *
> * Keuslinstr. 13             Fax.:   +49 721 151516176          *
> * 80798 München, Germany     Email:  goller@detego-software.de  *
> *****************************************************************
>
>

