geronimo-dev mailing list archives

From Rick McGuire <>
Subject Re: Javamail address parsing (again).
Date Thu, 12 Jan 2006 23:24:12 GMT
Dain Sundstrom wrote:
> On Jan 11, 2006, at 1:17 PM, Bruce Snyder wrote:
>> Is it possible to look at the Sun implementation's source code to
>> distinguish enforced vs. ignored rules?
> That would make the code not clean room.
> I propose we ask Sun for a formal definition of the parser for this 
> class, and in a parallel track make an effort to try to match their 
> bugs.  The code from the second track doesn't have to be perfect, but 
> just good enough.  We simply let our users know that our goal with the 
> "implement.sun.javamail.bugs=true" code is to emulate the sun bugs, 
> and if they find something that produces different results for the 
> same text, we consider it a bug.
I'm becoming less and less convinced this is a good idea.  So far, I've 
found many, many Sun bugs in this code where it produces results that 
conflict with RFC822.  The API documentation refers to relaxed 
parsing rules, which says to me there are addresses that would not be 
valid under RFC822, but javamail will accept them based on the type of 
parsing requested.  I can accept that.

However, the great majority of the problems I've found involve 
internet addresses that RFC822 says ARE valid, but that the 
javamail code does not handle properly.  And there are a few 
situations where it appears the authors just chose to punt and say 
"yeah, whatever". 

It appears that the choice is either to write hacked code that mostly, 
sorta, kinda does what it claims to do, or to write a good parser and 
then triple the size of the code trying to reproduce all of the Sun bugs. 

Working strictly from the RFC822 spec, I had a fairly nice parser 
written that gave very good RFC822 compliance, but things turned 
nightmarish when I discovered the sorts of Sun behaviors I had to insert 
back in.  I think I've completely rewritten this code about 5 times now, 
and am getting pretty close to the Sun "relaxed rules".  Inserting some 
of the real bugs back into the parser might pose similar problems. 

It really appears that this code somewhat "lost its way somewhere".  
It's serving two purposes that are really at odds with each other.  The 
first purpose is to parse any internet address that might appear in a 
received message.  For that purpose, the code needs to accept any valid 
internet address as defined by RFC822.  The Sun code does not currently 
do that, and making the new version "bug compatible" would also not 
achieve that.
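
To make the first purpose concrete, here is a minimal sketch of a
checker for the RFC822 addr-spec grammar (local-part "@" domain).  It
is not the Geronimo code and not the Sun code; the class name
Rfc822AddrSpec and the method isValid are hypothetical names chosen for
illustration.  It also simplifies things: it handles a single bare
addr-spec only, with no comments, folded whitespace, display names,
route addresses, or groups.

```java
// Hypothetical sketch: validates one bare RFC822 addr-spec.
// Assumption: comments, folding whitespace, and phrases are out of scope.
final class Rfc822AddrSpec {
    // RFC822 "specials": these may not appear in an atom.
    private static final String SPECIALS = "()<>@,;:\\\".[]";

    private final String s;
    private int pos = 0;

    private Rfc822AddrSpec(String s) { this.s = s; }

    /** True if the entire input matches: local-part "@" domain. */
    static boolean isValid(String input) {
        Rfc822AddrSpec p = new Rfc822AddrSpec(input);
        return p.localPart() && p.eat('@') && p.domain()
                && p.pos == input.length();
    }

    // local-part = word *("." word)
    private boolean localPart() {
        if (!word()) return false;
        while (eat('.')) {
            if (!word()) return false;
        }
        return true;
    }

    // word = atom / quoted-string
    private boolean word() {
        return peek() == '"' ? delimited('"', '"') : atom();
    }

    // domain = sub-domain *("." sub-domain)
    private boolean domain() {
        if (!subDomain()) return false;
        while (eat('.')) {
            if (!subDomain()) return false;
        }
        return true;
    }

    // sub-domain = domain-ref (an atom) / domain-literal ("[" ... "]")
    private boolean subDomain() {
        return peek() == '[' ? delimited('[', ']') : atom();
    }

    // atom = 1*<any CHAR except specials, SPACE and CTLs>
    private boolean atom() {
        int start = pos;
        while (pos < s.length()) {
            char c = s.charAt(pos);
            if (c <= 32 || c >= 127 || SPECIALS.indexOf(c) >= 0) break;
            pos++;
        }
        return pos > start;
    }

    // Shared scanner for quoted-string and domain-literal: both allow any
    // 7-bit char except CR, "\" and their own delimiters, plus "\" CHAR pairs.
    private boolean delimited(char open, char close) {
        if (!eat(open)) return false;
        while (pos < s.length()) {
            char c = s.charAt(pos++);
            if (c == close) return true;
            if (c > 127 || c == '\r' || c == open) return false;
            if (c == '\\') {                 // quoted-pair: "\" CHAR
                if (pos >= s.length() || s.charAt(pos) > 127) return false;
                pos++;
            }
        }
        return false;                        // unterminated
    }

    private char peek() {
        return pos < s.length() ? s.charAt(pos) : '\0';
    }

    private boolean eat(char c) {
        if (pos < s.length() && s.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }
}
```

With this sketch, a quoted local part such as "john doe"@example.com
and a domain literal such as first.last@[10.0.0.1] are both accepted,
since RFC822 says they are valid, while john doe@example.com and
a..b@example.com are rejected.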

The other purpose of the InternetAddress parser is to process email 
addresses entered into applications and perform some validation on the 
addresses.  This is where the "relaxed rules" come into play; they 
basically allow internet addresses that are not strictly RFC822 
compatible to pass.  Now for those, I'm relatively comfortable that this 
can be made compatible.  It is very difficult, though, when the 
requirement becomes one of being both more and less restrictive at the 
same time, with no good definition of what rules are being used.

> -dain
