james-mime4j-dev mailing list archives

From Robert Burrell Donkin <robertburrelldon...@gmail.com>
Subject Re: JavaCC experience anyone?
Date Thu, 07 May 2009 21:25:02 GMT
On Thu, May 7, 2009 at 10:13 PM, Markus Wiederkehr
<markus.wiederkehr@gmail.com> wrote:
> On Thu, May 7, 2009 at 11:09 PM, Robert Burrell Donkin
> <robertburrelldonkin@gmail.com> wrote:
>> On Mon, May 4, 2009 at 4:43 PM, Markus Wiederkehr
>> <markus.wiederkehr@gmail.com> wrote:
>>> I'd like to address MIME4J-130 but I'm not sure how to do it.
>>> For MIME4J-130 special characters would have to be allowed in the
>>> display-name portion of a "name-addr" or "group"
>>> (http://tools.ietf.org/html/rfc5322#section-3.4). A display-name is a
>>> "phrase" and AddressListParser.jjt defines it as:
>>>  void phrase() :
>>>  {}
>>>  {
>>>  (     <DOTATOM>
>>>  |     <QUOTEDSTRING>
>>>  )+
>>>  }
>>> So this production would have to be relaxed - but only when parsing an
>>> address in its decoded form. The production should stay the same when
>>> parsing an address in its raw transport-encoded form.
>>> The question is, is it possible with JavaCC to say: use production P1
>>> if the parser has been initialized with a certain constructor
>>> parameter, otherwise use production P2. I don't think that a JAVACODE
>>> production (https://javacc.dev.java.net/doc/javaccgrm.html#JAVACODE)
>>> could do the trick because they cannot appear at choice points.
>> no, i don't think that'll work
>>> Is it maybe possible to define two parsers that share some common productions?
>> i don't know
>> i think two parsers will be necessary since one will need to have
>> UNICODE_INPUT=true, the other UNICODE_INPUT=false
> Copying the source file and changing only a single production does not
> seem very elegant.

that i don't dispute

i've had quite a lot of success with JavaCC but i always seem to have
to fiddle around a lot...

> Would it be a suitable option to simply make the current parser more
> lenient and use it for both purposes?

not sure whether needing to set UNICODE_INPUT=true would prevent this

an easy way to test this would be to create a pathological encoded
test case and compare just the effect of setting this option on and off
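
The shape of such an on/off comparison can be sketched in plain Java. This
is a hand-written stand-in, not JavaCC output and not Mime4j code: the class
PhraseChecker, its lenient constructor flag and the method acceptsPhrase are
all hypothetical names. It assumes that the "special characters" in question
are non-ASCII characters, it ignores quoted strings, and it does not
simulate UNICODE_INPUT itself. It only shows one input being checked twice,
once per flag setting.

```java
/**
 * Hypothetical stand-in for a parser whose phrase rule depends on a
 * constructor flag. Not generated by JavaCC and not part of Mime4j.
 */
class PhraseChecker {
    // RFC 5322 "atext" specials, plus '.' because the grammar uses DOTATOM.
    private static final String ATEXT_SPECIALS = "!#$%&'*+-/=?^_`{|}~.";

    // Hypothetical flag: true = relaxed rule for decoded input,
    // false = strict rule for raw transport-encoded input.
    private final boolean lenient;

    PhraseChecker(boolean lenient) {
        this.lenient = lenient;
    }

    /** Returns true if the input is one or more whitespace-separated words that the current mode accepts. */
    boolean acceptsPhrase(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return false; // a phrase needs at least one word
        }
        for (String word : trimmed.split("\\s+")) {
            if (!acceptsWord(word)) {
                return false;
            }
        }
        return true;
    }

    private boolean acceptsWord(String word) {
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            boolean asciiAtext = c < 128
                    && (Character.isLetterOrDigit(c) || ATEXT_SPECIALS.indexOf(c) >= 0);
            if (asciiAtext) {
                continue;
            }
            // Assumption: the relaxed rule only adds non-ASCII characters.
            if (lenient && c >= 128) {
                continue;
            }
            return false;
        }
        return true;
    }
}
```

With this stand-in, a pathological input such as "J\u00fcrgen M\u00fcller"
is rejected by new PhraseChecker(false) and accepted by
new PhraseChecker(true), while "John Doe" is accepted by both. A real test
would feed the same input to two parser builds that differ only in the
option under test.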

- robert
