axis-java-dev mailing list archives

From Anil Atyam <aanilku...@gmail.com>
Subject Re: MIME Headers - Parsing Issues- AXIS2
Date Tue, 07 Jun 2011 19:32:33 GMT
Please see the formal definition below (source: RFC 2045):

The formal definition of these header fields is as follows:

     entity-headers := [ content CRLF ]
                       [ encoding CRLF ]
                       [ id CRLF ]
                       [ description CRLF ]
                       *( MIME-extension-field CRLF )

     MIME-message-headers := entity-headers
                             fields
                             version CRLF
                             ; The ordering of the header
                             ; fields implied by this BNF
                             ; definition should be ignored.

     MIME-part-headers := entity-headers
                          [ fields ]
                          ; Any field not beginning with
                          ; "content-" can have no defined
                          ; meaning and may be ignored.
                          ; The ordering of the header
                          ; fields implied by this BNF
                          ; definition should be ignored.

   The syntax of the various specific MIME header fields will be
   described in the following sections.

Nowhere in the specification does it say that ";" is a valid continuation
character. According to the entity-headers syntax given above, CRLF
terminates each entity header.
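
To illustrate the point, here is a minimal Java sketch of how the header
lines could be split under the RFC 822 folding rule, where a line continues
the previous header only if it begins with a space or a tab. HeaderUnfolder
and unfold() are hypothetical names made up for this example; they are not
part of Axiom or Axis2, and the input is assumed to be already split on CRLF.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not Axiom code): joins folded header lines. */
final class HeaderUnfolder {

    private HeaderUnfolder() {}

    /**
     * @param rawLines header lines already split on CRLF, up to but not
     *                 including the empty line that ends the header block
     * @return one string per logical header
     */
    static List<String> unfold(List<String> rawLines) {
        List<String> headers = new ArrayList<>();
        StringBuilder current = null;
        for (String line : rawLines) {
            // Only leading whitespace marks a continuation line;
            // a trailing ";" on the previous line plays no role.
            boolean continuation = !line.isEmpty()
                    && (line.charAt(0) == ' ' || line.charAt(0) == '\t');
            if (continuation && current != null) {
                current.append(' ').append(line.trim());
            } else {
                if (current != null) {
                    headers.add(current.toString());
                }
                current = new StringBuilder(line);
            }
        }
        if (current != null) {
            headers.add(current.toString());
        }
        return headers;
    }
}
```

With the three ESB headers from my first mail, this sketch returns
Content-Type, Content-ID and content-transfer-encoding as separate headers,
even though Content-Type ends with ";".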

Please let me know if I can help further.

Thanks
Anil Kumar

On Fri, Jun 3, 2011 at 5:18 PM, Andreas Veithen
<andreas.veithen@gmail.com> wrote:

> On Fri, May 27, 2011 at 17:43, Anil Atyam <aanilkumar@gmail.com> wrote:
> >
> > Thanks Andreas.
> >
> > I wanted to propose the following after reading the specifications from
> > RFC 2045 to RFC 2049. I think the proposed parsing algorithm would
> > supplement the existing parsing process. As the developers of AXIS2, you
> > and your team have the final say on this. This is only a proposal.
> >
> > Every header in a MIME part must be terminated by <CR><LF>. In our ESB
> > MIME case, the Content-Type header is terminated with <CR><LF>, but with a
> > semicolon at the end.
> > As I understand the spec, <CR><LF> is the key to identifying the next
> > header. This is exactly what the readHeaders() method does. However, there
> > is an else block that checks for a parameter ending in ";" and treats the
> > next bytes as part of the same header, regardless of whether they continue
> > on the same line or on the next line.
> >
> > Because every header in a MIME part always starts with "Content-", I would
> > like to propose checking whether the bytes following <CR><LF> start with
> > "Content-". If they do, treat them as a new header rather than as a
> > continuation of the existing header, especially if the supposed
> > continuation is itself terminated by <CR><LF>.
> > In other words, if the first header ends its line with a semicolon and the
> > new line doesn't start with "Content-", treat the new line as a
> > continuation. Otherwise, treat it as a new header and process it accordingly.
>
> What part of the specs suggests that an application should use this
> kind of parsing logic to process the MIME headers?
>
> >
> > I have found a few example MIME messages where a content-type is terminated
> > with a semicolon. (In the first example, content-transfer-encoding must have
> > been parsed as part of content-type, and content-id parsed separately.) I
> > have no idea which version of the MIME spec those examples follow.
> > Example 1
> >
> > http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/index.jsp?topic=/com.ibm.websphere.wsfep.multiplatform.doc/info/ae/ae/twbs_enablemtom.html
> >
> >  other transport headers ...
> > Content-Type: multipart/related;
> > boundary=MIMEBoundaryurn_uuid_0FE43E4D025F0BF3DC11582467646812;
> > type="application/xop+xml"; start="
> > <0.urn:uuid:0FE43E4D025F0BF3DC11582467646813@apache.org>";
> > start-info="text/xml"; charset=UTF-8
> >
> > --MIMEBoundaryurn_uuid_0FE43E4D025F0BF3DC11582467646812
> > content-type: application/xop+xml; charset=UTF-8; type="text/xml";
> > content-transfer-encoding: binary
> > content-id:
> >    <0.urn:uuid:0FE43E4D025F0BF3DC11582467646813@apache.org>
> >
> > <?xml version="1.0" encoding="UTF-8"?>
> >          <soapenv:Envelope
> > xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
> >             <soapenv:Header/>
> >             <soapenv:Body>
> >                <sendImage xmlns="http://org/apache/axis2/jaxws/sample/mtom">
> >                   <input>
> >                      <imageData>
> >                         <xop:Include
> > xmlns:xop="http://www.w3.org/2004/08/xop/include"
> > href="cid:1.urn:uuid:0FE43E4D025F0BF3DC11582467646811@apache.org"/>
> >                      </imageData>
> >                   </input>
> >                </sendImage>
> >             </soapenv:Body>
> >          </soapenv:Envelope>
> > --MIMEBoundaryurn_uuid_0FE43E4D025F0BF3DC11582467646812
> > content-type: text/plain
> > content-transfer-encoding: binary
> > content-id:
> >          <1.urn:uuid:0FE43E4D025F0BF3DC11582467646811@apache.org>
> >
> > … binary data goes here …
> > --MIMEBoundaryurn_uuid_0FE43E4D025F0BF3DC11582467646812--
> >
> > Example 2:
> >
> > http://msdn.microsoft.com/en-us/library/ms526560(v=exchg.10).aspx
> >
> >
> >
> > From: John Doe <example@example.com>
> > MIME-Version: 1.0
> > Content-Type: multipart/mixed;
> >         boundary="XXXXboundary text"
> >
> > This is a multipart message in MIME format.
> >
> > --XXXXboundary text
> > Content-Type: text/plain
> >
> > this is the body text
> >
> > --XXXXboundary text
> > Content-Type: text/plain;
> > Content-Disposition: attachment;
> >         filename="test.txt"
> >
> > this is the attachment text
> >
> > --XXXXboundary text--
> >
> > On Thu, May 26, 2011 at 4:00 PM, Andreas Veithen <andreas.veithen@gmail.com>
> > wrote:
> >>
> >> Strict interpretation of the specs would actually suggest that a
> >> semicolon at the end of the content type is not valid. I think that
> >> RFC 2045 applies here, which defines the syntax of the Content-Type
> >> header as follows (see section 5.1):
> >>
> >> content := "Content-Type" ":" type "/" subtype
> >>                *(";" parameter)
> >>
> >> This means that a semicolon must be followed by a parameter. Therefore
> >> I would say that the ESB is not entirely compliant with the MIME
> >> specification.
> >>
> >> On the other hand, the code in
> >> org.apache.axiom.attachments.impl.PartFactory interprets a semicolon
> >> at the end of a header as a continuation character. This code was
> >> actually introduced by AXIOM-257 [1] and went into Axiom 1.2.8 (Do you
> >> use an Axis2 version that ships with Axiom 1.2.8 or above?). What I
> >> don't see is which part of the specs actually defines the semicolon as
> >> a continuation character. Somebody else recently came up with an issue
> >> related to multi-line headers [2], and this may somehow be related.
> >>
> >> Andreas
> >>
> >> [1] https://issues.apache.org/jira/browse/AXIOM-257
> >> [2] http://markmail.org/thread/guj44ez5wdnxjobc
> >>
> >> On Thu, May 26, 2011 at 16:10, Anil Atyam <aanilkumar@gmail.com> wrote:
> >> > AXIS2 Committers,
> >> >
> >> >
> >> > It appears that there may be a bug in the parsing of MIME body parts. I
> >> > have downloaded the source code and added extra logging. Please see the
> >> > details below.
> >> >
> >> > Brief history:
> >> > The first MIME header (not working) comes from an ESB server, which in
> >> > turn invoked a web service on an IIS server (.NET Framework): JBOSS
> >> > powered by AXIS2 --> ESB --> IIS.
> >> > The second MIME header (working) comes directly from the IIS server:
> >> > JBOSS powered by AXIS2 --> IIS (.NET Framework).
> >> >
> >> > I would be glad if you could confirm this by running the first MIME
> >> > header through the tools you have, to see whether you can obtain the
> >> > content type. There is NO commercial product involved in this particular
> >> > prototype scenario. I am using AXIS2 downloaded from the Apache website,
> >> > along with its source code.
> >> >
> >> > I do not know whether this is a bug in the AXIS2 parser or whether the
> >> > content type must not be terminated by a semicolon. I would appreciate
> >> > your expert opinion. Feel free to ignore this question if it is outside
> >> > the scope of open source support.
> >> >
> >> >
> >> > NOT working MIME header (extra logging added in the getPart() method).
> >> > Please notice that the content-type ends with a semicolon. I think the
> >> > parser assumes Content-ID is part of content-type, and hence
> >> > part.getContentID() returned NULL (highlighted in red).
> >> >
> >> >
> >> > [5/25/11 16:33:51:472 EDT] 00000029 Attachments   I   *** getPart :
> >> > return
> >> > part-> part.getContentID(): null part.getContentType():
> >> > application/xop+xml;charset=utf-8;type="text/xml";Content-ID:
> >> > <0.634419380311847897@example.org> part.toString():
> >> > org.apache.axiom.attachments.impl.PartOnMemoryEnhanced@443c443c
> >> >
> >> >
> >> >
> >> > ----MIMEBoundary634419380311847897
> >> >
> >> > Content-Type: application/xop+xml;type="text/xml";
> >> >
> >> > Content-ID: <0.634419380311847897@example.org>
> >> >
> >> > content-transfer-encoding: binary
> >> >
> >> >
> >> >
> >> > Working MIME Header (Extra logging given in getPart() Method)
> >> >
> >> >
> >> > [5/25/11 16:40:07:195 EDT] 00000029 Attachments   I   *** getPart :
> >> > return
> >> > part-> part.getContentID(): <0.634419384069456049@example.org>
> >> > part.getContentType(): application/xop+xml; charset=utf-8;
> >> > type="text/xml;
> >> > charset=utf-8" part.toString():
> >> > org.apache.axiom.attachments.impl.PartOnMemoryEnhanced@27f427f4
> >> >
> >> >
> >> >
> >> > ----MIMEBoundary634419384069456049
> >> >
> >> > content-id: <0.634419384069456049@example.org>
> >> >
> >> > content-type: application/xop+xml; charset=utf-8; type="text/xml;
> >> > charset=utf-8"
> >> >
> >> > content-transfer-encoding: binary
> >> >
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscribe@axis.apache.org
> >> For additional commands, e-mail: java-dev-help@axis.apache.org
> >>
> >
> >
> >
> > --
> > Thanks,
> > Anil Atyam,
> >
> >
> >
>
>
>


-- 
Thanks,
Anil Atyam,
