axis-java-dev mailing list archives

From Andreas Veithen <andreas.veit...@gmail.com>
Subject Re: MIME Headers - Parsing Issues- AXIS2
Date Sat, 11 Jun 2011 17:36:18 GMT
The key is actually the following statement in the first section of RFC 2045:

"All of the header fields defined in this document are subject to the
general syntactic rules for header fields specified in RFC 822."

Multi-line header fields are defined in sections 3.1.1 and 3.2 of RFC
822. See AXIOM-366 for more details.
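
To illustrate the folding rule from RFC 822 section 3.1.1, here is a minimal
Java sketch. It is not the Axiom implementation: the class name
HeaderUnfolder and the method unfold are hypothetical and exist only for
this example. It assumes the header block uses CRLF line endings and treats
a line as a continuation only when it starts with a space or a tab, never
because the previous line ended with ";".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Axiom or Axis2.
class HeaderUnfolder {

    // Splits a raw header block into logical header fields using the
    // RFC 822 folding rule: a line that starts with a space or a tab
    // continues the previous field; any other line starts a new field.
    static List<String> unfold(String rawHeaders) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = null;
        for (String line : rawHeaders.split("\r\n")) {
            if (line.isEmpty()) {
                break; // an empty line ends the header block
            }
            boolean continuation = line.charAt(0) == ' ' || line.charAt(0) == '\t';
            if (continuation && current != null) {
                // Simplification: the fold and its leading whitespace are
                // collapsed into a single space.
                current.append(' ').append(line.trim());
            } else {
                if (current != null) {
                    fields.add(current.toString());
                }
                current = new StringBuilder(line);
            }
        }
        if (current != null) {
            fields.add(current.toString());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Header block of the failing part quoted later in this thread.
        String part = "Content-Type: application/xop+xml;type=\"text/xml\";\r\n"
                + "Content-ID: <0.634419380311847897@example.org>\r\n"
                + "content-transfer-encoding: binary\r\n"
                + "\r\n";
        for (String field : unfold(part)) {
            System.out.println(field);
        }
    }
}
```

Under this rule the failing part yields three separate fields, because the
Content-ID line does not begin with whitespace. A folded header such as the
Content-Type with an indented boundary parameter in Example 2 below is
still joined into one field.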

Andreas

On Tue, Jun 7, 2011 at 21:32, Anil Atyam <aanilkumar@gmail.com> wrote:
> Please see the formal definition below (source: RFC 2045):
>
> The formal definition of these header fields is as follows:
>
>      entity-headers := [ content CRLF ]
>                        [ encoding CRLF ]
>                        [ id CRLF ]
>                        [ description CRLF ]
>                        *( MIME-extension-field CRLF )
>
>      MIME-message-headers := entity-headers
>                              fields
>                              version CRLF
>                              ; The ordering of the header
>                              ; fields implied by this BNF
>                              ; definition should be ignored.
>
>      MIME-part-headers := entity-headers
>                           [ fields ]
>                           ; Any field not beginning with
>                           ; "content-" can have no defined
>                           ; meaning and may be ignored.
>                           ; The ordering of the header
>                           ; fields implied by this BNF
>                           ; definition should be ignored.
>
>    The syntax of the various specific MIME header fields will be
>    described in the following sections.
>
> Nowhere does the specification say that ";" is a valid continuation
> character. CRLF terminates each entity-header, as given in the
> entity-headers syntax.
>
> Please let me know if I can help further.
>
> Thanks
> Anil Kumar
>
> On Fri, Jun 3, 2011 at 5:18 PM, Andreas Veithen <andreas.veithen@gmail.com>
> wrote:
>>
>> On Fri, May 27, 2011 at 17:43, Anil Atyam <aanilkumar@gmail.com> wrote:
>> >
>> > Thanks Andreas.
>> >
>> > I would like to propose the following after reading RFC 2045 through
>> > RFC 2049. I think the proposed parsing algorithm would supplement the
>> > existing parsing process. As the programmers of AXIS2, you and your team
>> > have the final say on this. This is only a proposal.
>> >
>> > Every header in a MIME part must be terminated by <CR><LF>. In our ESB
>> > MIME case, the Content-Type header is terminated with <CR><LF>, but with
>> > a semicolon at the end.
>> > As I understand the spec, <CR><LF> is the key to identifying the next
>> > header. This is exactly what the readHeaders() method does. There is
>> > an else block that checks for a parameter ending in ";" and treats the
>> > next byte as part of the same header, regardless of whether it continues
>> > on the same line or on the next line.
>> >
>> > What I would like to propose: because every header in a MIME part always
>> > starts with "Content-", it may be a good idea to check whether the bytes
>> > following <CR><LF> start with "Content-". If so, treat the line as a new
>> > header rather than as a continuation of the existing header, especially
>> > if the continuation is terminated by <CR><LF>.
>> > In this case, if the first header ends its line with a semicolon and the
>> > new line doesn't start with "Content-", treat it as a continuation.
>> > Otherwise, treat it as a new header and process it accordingly.
>>
>> What part of the specs suggests that an application should use this
>> kind of parsing logic to process the MIME headers?
>>
>> >
>> > I have found a few example MIME messages where a content-type is
>> > terminated with a semicolon. (In the first example,
>> > content-transfer-encoding must have been parsed as part of content-type,
>> > and content-id is parsed separately.) I have no idea which version of
>> > the MIME spec those examples were meant to follow.
>> >
>> > Example 1
>> >
>> > http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/index.jsp?topic=/com.ibm.websphere.wsfep.multiplatform.doc/info/ae/ae/twbs_enablemtom.html
>> >
>> >  other transport headers ...
>> > Content-Type: multipart/related;
>> > boundary=MIMEBoundaryurn_uuid_0FE43E4D025F0BF3DC11582467646812;
>> > type="application/xop+xml"; start="
>> > <0.urn:uuid:0FE43E4D025F0BF3DC11582467646813@apache.org>";
>> > start-info="text/xml"; charset=UTF-8
>> >
>> > --MIMEBoundaryurn_uuid_0FE43E4D025F0BF3DC11582467646812
>> > content-type: application/xop+xml; charset=UTF-8; type="text/xml";
>> > content-transfer-encoding: binary
>> > content-id:
>> >    <0.urn:uuid:0FE43E4D025F0BF3DC11582467646813@apache.org>
>> >
>> > <?xml version="1.0" encoding="UTF-8"?>
>> >          <soapenv:Envelope
>> > xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
>> >             <soapenv:Header/>
>> >             <soapenv:Body>
>> >                <sendImage
>> > xmlns="http://org/apache/axis2/jaxws/sample/mtom">
>> >                   <input>
>> >                      <imageData>
>> >                         <xop:Include
>> > xmlns:xop="http://www.w3.org/2004/08/xop/include"
>> > href="cid:1.urn:uuid:0FE43E4D025F0BF3DC11582467646811@apache.org"/>
>> >                      </imageData>
>> >                   </input>
>> >                </sendImage>
>> >             </soapenv:Body>
>> >          </soapenv:Envelope>
>> > --MIMEBoundaryurn_uuid_0FE43E4D025F0BF3DC11582467646812
>> > content-type: text/plain
>> > content-transfer-encoding: binary
>> > content-id:
>> >          <1.urn:uuid:0FE43E4D025F0BF3DC11582467646811@apache.org>
>> >
>> > … binary data goes here …
>> > --MIMEBoundaryurn_uuid_0FE43E4D025F0BF3DC11582467646812--
>> >
>> > Example 2:
>> >
>> > http://msdn.microsoft.com/en-us/library/ms526560(v=exchg.10).aspx
>> >
>> >
>> >
>> > From: John Doe <example@example.com>
>> > MIME-Version: 1.0
>> > Content-Type: multipart/mixed;
>> >         boundary="XXXXboundary text"
>> >
>> > This is a multipart message in MIME format.
>> >
>> > --XXXXboundary text
>> > Content-Type: text/plain
>> >
>> > this is the body text
>> >
>> > --XXXXboundary text
>> > Content-Type: text/plain;
>> > Content-Disposition: attachment;
>> >         filename="test.txt"
>> >
>> > this is the attachment text
>> >
>> > --XXXXboundary text--
>> >
>> > On Thu, May 26, 2011 at 4:00 PM, Andreas Veithen
>> > <andreas.veithen@gmail.com>
>> > wrote:
>> >>
>> >> Strict interpretation of the specs would actually suggest that a
>> >> semicolon at the end of the content type is not valid. I think that
>> >> RFC 2045 applies here, which defines the syntax of the Content-Type
>> >> header as follows (see section 5.1):
>> >>
>> >> content := "Content-Type" ":" type "/" subtype
>> >>                *(";" parameter)
>> >>
>> >> This means that a semicolon must be followed by a parameter. Therefore
>> >> I would say that the ESB is not entirely compliant with the MIME
>> >> specification.
>> >>
>> >> On the other hand, the code in
>> >> org.apache.axiom.attachments.impl.PartFactory interprets a semicolon
>> >> at the end of a header as a continuation character. This code was
>> >> actually introduced by AXIOM-257 [1] and went into Axiom 1.2.8 (Do you
>> >> use an Axis2 version that ships with Axiom 1.2.8 or above?). What I
>> >> don't see is which part of the specs actually defines the semicolon as
>> >> a continuation character. Somebody else recently came up with an issue
>> >> related to multi-line headers [2], and this may somehow be related.
>> >>
>> >> Andreas
>> >>
>> >> [1] https://issues.apache.org/jira/browse/AXIOM-257
>> >> [2] http://markmail.org/thread/guj44ez5wdnxjobc
>> >>
>> >> On Thu, May 26, 2011 at 16:10, Anil Atyam <aanilkumar@gmail.com> wrote:
>> >> > AXIS2 Committers,
>> >> >
>> >> >
>> >> > It appears that there may be a bug in parsing MIME body parts. I have
>> >> > downloaded the source code and added extra logging. Please see the
>> >> > details below.
>> >> >
>> >> > Brief History:
>> >> > The first MIME header (not working) comes from an ESB server, which in
>> >> > turn invokes a web service on an IIS server (.NET Framework): JBOSS
>> >> > powered by AXIS2 --> ESB --> IIS
>> >> > The second MIME header (working) comes directly from the IIS server:
>> >> > JBOSS powered by AXIS2 --> IIS (.NET Framework)
>> >> >
>> >> > I would be glad if you could confirm this by running the first MIME
>> >> > header through the tools you have, to see whether you can obtain the
>> >> > content type. There is NO commercial product involved in this
>> >> > particular prototype scenario. I am using AXIS2 downloaded from the
>> >> > APACHE website along with the source code.
>> >> >
>> >> > I do not know whether this is a bug in the AXIS2 parser or whether the
>> >> > content type must not be terminated by a semicolon. I would appreciate
>> >> > your expert comments. Feel free to ignore this question if it is out of
>> >> > scope for open source support.
>> >> >
>> >> >
>> >> > NOT Working MIME Header (extra logging added in the getPart() method).
>> >> > Please notice that content-type ends with a semicolon. I think the
>> >> > parser assumes Content-ID is part of content-type, and hence
>> >> > part.getContentID() returned NULL (highlighted in RED)
>> >> >
>> >> >
>> >> > [5/25/11 16:33:51:472 EDT] 00000029 Attachments   I   *** getPart :
>> >> > return
>> >> > part-> part.getContentID(): null part.getContentType():
>> >> > application/xop+xml;charset=utf-8;type="text/xml";Content-ID:
>> >> > <0.634419380311847897@example.org> part.toString():
>> >> > org.apache.axiom.attachments.impl.PartOnMemoryEnhanced@443c443c
>> >> >
>> >> >
>> >> >
>> >> > ----MIMEBoundary634419380311847897
>> >> >
>> >> > Content-Type: application/xop+xml;type="text/xml";
>> >> >
>> >> > Content-ID: <0.634419380311847897@example.org>
>> >> >
>> >> > content-transfer-encoding: binary
>> >> >
>> >> >
>> >> >
>> >> > Working MIME Header (Extra logging given in getPart() Method)
>> >> >
>> >> >
>> >> > [5/25/11 16:40:07:195 EDT] 00000029 Attachments   I   *** getPart :
>> >> > return
>> >> > part-> part.getContentID(): <0.634419384069456049@example.org>
>> >> > part.getContentType(): application/xop+xml; charset=utf-8;
>> >> > type="text/xml;
>> >> > charset=utf-8" part.toString():
>> >> > org.apache.axiom.attachments.impl.PartOnMemoryEnhanced@27f427f4
>> >> >
>> >> >
>> >> >
>> >> > ----MIMEBoundary634419384069456049
>> >> >
>> >> > content-id: <0.634419384069456049@example.org>
>> >> >
>> >> > content-type: application/xop+xml; charset=utf-8; type="text/xml;
>> >> > charset=utf-8"
>> >> >
>> >> > content-transfer-encoding: binary
>> >> >
>> >> >
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-dev-unsubscribe@axis.apache.org
>> >> For additional commands, e-mail: java-dev-help@axis.apache.org
>> >>
>> >
>> >
>> >
>> > --
>> > Thanks,
>> > Anil Atyam,
>> >
>> >
>> >
>>
>>
>
>
>
> --
> Thanks,
> Anil Atyam,
>
>
>
