cxf-users mailing list archives

From Sergey Beryozkin <sberyoz...@gmail.com>
Subject Re: Multipart values are not trimed
Date Tue, 06 Nov 2012 15:38:16 GMT
Hi
>>>>
>>>>   Hi,
>>>>>
>>>>> To give the bigger picture, let me explain what I would actually like to
>>>>> build:
>>>>>
>>>>> In a multipart POST request, I'd like to have form params and a file
>>>>> attachment (like the example above). I would like to handle the
>>>>> input stream of the file myself, in order to do things like:
>>>>>     - checking some headers of a given attachment, for example
>>>>>       Content-Length, Content-Disposition, etc.
>>>>>     - consuming the content of the part's input stream to store it
>>>>>       in a temporary file
>>>>>
>>>>> However, in the MessageBodyReader the entityStream looks like it has
>>>>> already been consumed and closed. Debugging reveals that an
>>>>> AttachmentDeserializer already consumed the stream and created an
>>>>> Attachment collection, but my provider wasn't called at that time.
>>>>> If possible, I would like to copy these bytes to another output stream.
>>>>>
>>>> The provider for TemporaryBinaryFile is called later, when individual
>>>> parts are deserialized.
>>>>
>>>>
>>>>> Is it possible, or should I use attachments? I'd like to avoid technical
>>>>> code in the resource as much as possible, and have a reference to a
>>>>> TemporaryBinaryFile.
>>>>>
>>>>>
>>>> You can use org.apache.cxf.jaxrs.ext.multipart.Attachment instead of
>>>> TemporaryBinaryFile, check Content-Type and Content-Disposition, and then
>>>> do 'attachment.getObject(TemporaryBinaryFile.class)':
>>>>
>>>>
>>>> post(@Multipart("someid") Attachment attachment) {
>>>>      attachment.getContentType();
>>>>      attachment.getContentDisposition();
>>>>      attachment.getObject(TemporaryBinaryFile.class);
>>>> }
>>>>
>>>> Actually, you can optimize it slightly by adding a 'type' parameter to
>>>> @Multipart(value = "someid", type = "text/plain")
>>>>
>>>>
>>> Ok, thx for that :)
>>> Do you think it will be possible to stream directly the content of the
>>> attachment to another outputstream ? The attachment can have a large size
>>> like 20 MB maybe more, I'd like to keep memory consumption as low as
>>> possible.
>>>
>> CXF will internally manage saving the stream to the temp folder if the
>> part is large.
>>
>> You can do
>>
>> attachment.getObject(InputStream.class),
>>
>> in which case you will have to deal with InputStream directly or you can
>> do it within your own TemporaryBinaryFile MBR when you do
>>
>> attachment.getObject(TemporaryBinaryFile.class)
>>
>
> Fantastic :)
> I would have preferred to avoid dealing with technical code directly,
> so I will probably keep a reference to the InputStream in a
> renamed StreamableBinaryFile.
>
> Is it possible to have the size of the attachment in a safer way than this
> (if the Content-Length isn't present) ?
>
> ((AttachmentDataSource)
> attachment.getDataHandler().getDataSource()).cache.size()
>
> Note that the cache field would be accessed via reflection.
>

I think the better option, assuming you'd like to enforce a certain
limit, is to use the attachment-max-size property:

http://cxf.apache.org/docs/security.html#Security-Multiparts
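
If the part is read through attachment.getObject(InputStream.class) as
discussed above, the size can also be obtained without reflection by
counting bytes while copying. The following is only a minimal sketch using
java.io and java.nio.file: BoundedCopy, copy and toTempFile are hypothetical
names, not CXF APIs, and the limit here is enforced by the helper itself
rather than by the attachment-max-size property.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper, not part of CXF: copies a stream while counting bytes. */
final class BoundedCopy {

    private BoundedCopy() {
    }

    /**
     * Copies 'in' to 'out' through a small fixed buffer and returns the number
     * of bytes copied. Throws IOException as soon as more than maxBytes bytes
     * have been read, so memory use stays bounded by the buffer size.
     */
    static long copy(InputStream in, OutputStream out, long maxBytes) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
            if (total > maxBytes) {
                throw new IOException("Part exceeds the limit of " + maxBytes + " bytes");
            }
            out.write(buffer, 0, read);
        }
        return total;
    }

    /**
     * Streams 'in' into a new temporary file and returns its path.
     * The file is deleted again if the copy fails or the limit is exceeded.
     */
    static Path toTempFile(InputStream in, long maxBytes) throws IOException {
        Path target = Files.createTempFile("part-", ".bin");
        try (OutputStream out = Files.newOutputStream(target)) {
            copy(in, out, maxBytes);
        } catch (IOException e) {
            // The output stream is already closed when this catch block runs.
            Files.deleteIfExists(target);
            throw e;
        }
        return target;
    }
}
```

A resource method could pass the part's InputStream to toTempFile and use
Files.size on the returned path, so the Content-Length header is not needed.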

>>>>>>
>>>>>>>
>>>>>>> I'm crafting a resource that should accept multipart POST request.
>>>>>>>
>>>>>>> Here's the method :
>>>>>>>
>>>>>>> ================================================
>>>>>>>       @POST
>>>>>>>       @Produces({MediaType.APPLICATION_JSON})
>>>>>>>       @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>>>>       public MetaData archive(@FormParam("title") String title,
>>>>>>>                               @FormParam("revision") String revision,
>>>>>>>                               @Multipart("archive") TemporaryBinaryFile temporaryBinaryFile) {
>>>>>>> ================================================
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Also I tried with @Multipart instead of @FormParam
>>>>>>>
>>>>>>> ================================================
>>>>>>>       @POST
>>>>>>>       @Produces({MediaType.APPLICATION_JSON})
>>>>>>>       @Consumes(MediaType.MULTIPART_FORM_DATA)
>>>>>>>       public DocumentMetaData archive(@Multipart(value = "title", required = false) @FormParam("title") String title,
>>>>>>>                                       @Multipart(value = "revision", required = false) String revision,
>>>>>>>                                       @Multipart("archive") TemporaryBinaryFile temporaryBinaryFile) {
>>>>>>>
>>>>>>>
>>>>>> You have @FormParam and @Multipart attached to 'title', drop @FormParam,
>>>>>> I think it only works because 'title' is a simple parameter.
>>>>>>
>>>>>>
>>>>>>
>>>>> Yes, I wrongly copied/modified the code in the mail, however I tested
>>>>> both setups separately.
>>>>> Anyway, as you advised, I will only use Multipart now.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>> ================================================
>>>>>>
>>>>>>
>>>>>>> And here is the raw request :
>>>>>>> ================================================
>>>>>>> Address: http://localhost:8080/api/v1.0/document/archive
>>>>>>> Encoding: ISO-8859-1
>>>>>>> Http-Method: POST
>>>>>>> Content-Type: multipart/form-data;boundary=partie
>>>>>>> Headers: {Accept=[*/*], accept-charset=[ISO-8859-1,utf-8;q=0.7,*;q=0.3],
>>>>>>> accept-encoding=[gzip,deflate,sdch], Content-Length=[301],
>>>>>>> content-type=[multipart/form-data;boundary=partie]}
>>>>>>>
>>>>>>> Payload:
>>>>>>> --partie
>>>>>>> Content-Disposition: form-data; name="title"
>>>>>>> Content-ID: title
>>>>>>>
>>>>>>> the.title
>>>>>>> --partie
>>>>>>> Content-Disposition: form-data; name="revision"
>>>>>>> Content-ID: revision
>>>>>>>
>>>>>>> some.revision
>>>>>>> --partie
>>>>>>> Content-Disposition: form-data; name="archive"; filename="file.txt"
>>>>>>> Content-Type: text/plain
>>>>>>>
>>>>>>> I've got a woman, way over town...
>>>>>>> --partie
>>>>>>> ================================================
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> However the title and revision values are incorrect because they end
>>>>>>> with a newline character '\n'. Hence these parameters are rejected by
>>>>>>> my validator (which uses Message.getContent).
>>>>>>>
>>>>>>> I don't think this is normal behavior, but I might be wrong about the
>>>>>>> specs or about my request. Note that I had to add the Content-ID
>>>>>>> header when using the Multipart annotation.
>>>>>>>
>>>>>>>
>>>>>> What CXF version is it ? Content-Disposition 'name' is definitely
>>>>>> checked too.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> Also I found the part of the code that should check the
>>>>> Content-Disposition. However, I found that the first letter 'C'
>>>>> disappeared, and the key in the attachment headers is now
>>>>> 'ontent-Disposition', which complicates things further and probably
>>>>> explains why I needed a Content-ID header in each part. The first
>>>>> part's Content-Disposition header is always decoded correctly, though.
>>>>> Adding another new line after the boundary fixes it, but that looks
>>>>> like a workaround, and I'd rather not impose this on the API users :/
>>>>>
>>>>> I couldn't figure out yet where the code is consuming the additional
>>>>> char. I just know that at some point the LazyAttachmentCollection has
>>>>> the remaining attachment (AttachmentImpl), and its first header is wrong.
>>>>>
>>>>>
>>>> I think it is a bug in the code that posts the multipart, I recall
>>>> exactly the same issue being reported when RESTClient was used
>>>>
>>>>
>>> Isn't it this issue ? https://issues.apache.org/jira/browse/CXF-2704
>>>
>>
>> Looks like so, but I also do recall the same issue with RESTClient payloads
>>
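
One plausible cause, not verified here against either client, is a request
body whose lines end with a bare LF. RFC 2046 requires a CRLF after each
boundary delimiter and treats the CRLF before the next delimiter as part of
that delimiter, so a parser that skips two characters after the boundary
would swallow the 'C' of 'Content-Disposition' and leave a trailing '\n' on
each value. The sketch below builds a body with CRLF throughout and ends it
with the closing delimiter. MultipartBodyBuilder is a hypothetical
client-side helper and only handles plain text fields.

```java
import java.util.Map;

/** Hypothetical client-side helper: builds a multipart/form-data body with CRLF line endings. */
final class MultipartBodyBuilder {

    private static final String CRLF = "\r\n";

    private MultipartBodyBuilder() {
    }

    /**
     * Builds a body made of simple text fields. Field names and values are
     * assumed to be plain ASCII without quotes or line breaks.
     */
    static String build(String boundary, Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            // Delimiter line, terminated by CRLF.
            body.append("--").append(boundary).append(CRLF);
            body.append("Content-Disposition: form-data; name=\"")
                .append(field.getKey()).append('"').append(CRLF);
            // An empty line ends the part headers.
            body.append(CRLF);
            // This CRLF belongs to the next delimiter, not to the value.
            body.append(field.getValue()).append(CRLF);
        }
        // Closing delimiter: the boundary followed by two hyphens.
        body.append("--").append(boundary).append("--").append(CRLF);
        return body.toString();
    }
}
```

Called with the boundary "partie" and the title and revision fields from the
request above, this yields a body in which every line ends with CRLF and the
last line is "--partie--".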
>>
>>>
>>>
>>>>> About Content-Disposition name, it is checked only if there is no
>>>>> Content-ID, however it seems that at some point the default Content-ID
>>>>> "root.message@cxf.apache.org" is added, which defeats the purpose of
>>>>> the following code.
>>>>>
>>>>>        private static boolean matchAttachmentId(Attachment at, Multipart mid,
>>>>>                                                 MediaType multipartType) {
>>>>>            if (at.getContentId().equals(mid.value())) {
>>>>>                return true;
>>>>>            }
>>>>>            ContentDisposition cd = at.getContentDisposition();
>>>>>            if (cd != null && mid.value().equals(cd.getParameter("name"))) {
>>>>>                return true;
>>>>>            }
>>>>>            return false;
>>>>>        }
>>>>>
>>>> The default Content-ID is added on the output, it is not added during the
>>>> read...
>>>>
>>>>
>>> I'm not 100% sure how everything works, but at some point
>>> MultipartProvider.readFrom is called from
>>> JAXRSUtils.readFromMessageBodyReader, which indirectly calls the above
>>> code :
>>>
>>>       public Object readFrom(Class<Object> c, Type t, Annotation[] anns,
>>>                              MediaType mt,
>>>                              MultivaluedMap<String, String> headers,
>>>                              InputStream is) throws IOException, WebApplicationException {
>>>
>>> // ...
>>>
>>>           Multipart id = AnnotationUtils.getAnnotation(anns, Multipart.class);
>>>           Attachment multipart = AttachmentUtils.getMultipart(c, id, mt, infos);
>>>
>>>           if (multipart != null) {
>>>               return fromAttachment(multipart, c, t, anns);
>>>           } else if (id != null && !id.required()) {
>>>
>>> // ...
>>>
>>>       }
>>>
>>>
>>>
>>>       public static Attachment getMultipart(Class<Object> c,
>>>                                             Multipart id,
>>>                                             MediaType mt,
>>>                                             List<Attachment> infos) throws IOException {
>>>
>>>           if (id != null) {
>>>               for (Attachment a : infos) {
>>>                   if (matchAttachmentId(a, id, mt)) {
>>>                       checkMediaTypes(a.getContentType(), id.type());
>>>                       return a;
>>>                   }
>>>               }
>>> // ...
>>>       }
>>>
>>> I'm not sure of the implications, but it might be possible to fix this
>>> with the following code :
>>>
>>>       private static boolean matchAttachmentId(Attachment at, Multipart mid,
>>>                                                MediaType multipartType) {
>>>           ContentDisposition cd = at.getContentDisposition();
>>>           boolean matchContentDispositionName = cd != null
>>>               && mid.value().equals(cd.getParameter("name"));
>>>           boolean matchContentId = at.getContentId().equals(mid.value());
>>>
>>>           return matchContentId || matchContentDispositionName;
>>>       }
>>>
>>>
>> What exactly are you proposing to fix, though ?
>>
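
For comparison, the two variants can be modelled without CXF. In the sketch
below plain strings stand in for the Attachment and Multipart types (an
assumption made only to keep it self-contained; AttachmentIdMatch is a
hypothetical name), and a null dispositionName stands for a part with no
Content-Disposition. Both variants return the same result for every input,
and in both a non-matching Content-ID still falls through to the 'name'
check.

```java
/** Stand-alone model of the two matching variants; strings replace the CXF types. */
final class AttachmentIdMatch {

    private AttachmentIdMatch() {
    }

    /** Mirrors the existing logic: Content-ID first, then the Content-Disposition 'name'. */
    static boolean original(String contentId, String dispositionName, String expected) {
        if (contentId.equals(expected)) {
            return true;
        }
        if (dispositionName != null && expected.equals(dispositionName)) {
            return true;
        }
        return false;
    }

    /** Mirrors the proposed rewrite: both checks are computed up front, then combined. */
    static boolean proposed(String contentId, String dispositionName, String expected) {
        boolean matchDispositionName = dispositionName != null && expected.equals(dispositionName);
        boolean matchContentId = contentId.equals(expected);
        return matchContentId || matchDispositionName;
    }
}
```

The only behavioural difference is that the rewrite always evaluates the
'name' check, whereas the existing code skips it when the Content-ID matches.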
>
> Damn, forgive me, I stayed too long at work last night and missed
> things, and it seems that affected my mail this morning as well! I was
> misled by the fact that the first letter of the first header in the second
> and following attachments is missing, hence in my case Content-Disposition
> isn't parsed by CXF.
>
> Anyway the above code works correctly. ....shame on me !
>
>
> Again, thank you very much, I owe you a beer or two !

No problems at all :-), thanks for stressing the code :-)

Cheers, Sergey

>
>   Cheers
> -- Brice
>


-- 
Sergey Beryozkin

Talend Community Coders
http://coders.talend.com/

Blog: http://sberyozkin.blogspot.com
