cxf-users mailing list archives

From "Mustafa Sezgin" <msez...@aconex.com>
Subject RE: CXF and large XML request/responses : streaming support?
Date Tue, 05 Jan 2010 00:38:09 GMT
Chunking could be a solution, as long as a write to the
CacheAndWriteOutputStream (which in turn causes a write to the HTTP
endpoint) causes a chunk to be sent. I will look into configuring Jetty to
use chunking. You say this can be done via CXF; is there any documentation
on this? Hopefully Dan or Eoghan can provide some answers :)
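
For reference, the "one write, one chunk" behaviour asked about above can be pictured with a minimal sketch of HTTP/1.1 chunked framing. This is only an illustration under stated assumptions: the class ChunkPerWriteOutputStream and its frame helper are hypothetical, and nothing here is taken from Jetty or CXF, which implement chunking internally and typically buffer before framing.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of HTTP/1.1 chunked framing: every write() is
// emitted as its own chunk (hex length, CRLF, data, CRLF) and flushed.
final class ChunkPerWriteOutputStream extends OutputStream {
    private static final byte[] CRLF = {'\r', '\n'};
    private final OutputStream wire;

    ChunkPerWriteOutputStream(OutputStream wire) {
        this.wire = wire;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return; // a zero-length chunk would terminate the body
        }
        wire.write(Integer.toHexString(len).getBytes(StandardCharsets.US_ASCII));
        wire.write(CRLF);
        wire.write(b, off, len);
        wire.write(CRLF);
        wire.flush(); // push this chunk out instead of waiting for close()
    }

    @Override
    public void close() throws IOException {
        // The last chunk has size 0 and is followed by an empty trailer section.
        wire.write("0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        wire.flush();
        wire.close();
    }

    // Convenience for trying it out: frames each part as one chunk, in memory.
    static String frame(String... parts) {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (ChunkPerWriteOutputStream out = new ChunkPerWriteOutputStream(wire)) {
            for (String part : parts) {
                out.write(part.getBytes(StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(wire.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Calling ChunkPerWriteOutputStream.frame("<a>", "hello") produces two data chunks followed by the terminating zero-length chunk. A real container decides chunk boundaries from its own output buffer, so a write on the application side does not necessarily map to one chunk on the wire.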

Thanks

-----Original Message-----
From: Sergey Beryozkin [mailto:sberyozk@progress.com] 
Sent: Monday, 4 January 2010 10:22 PM
To: users@cxf.apache.org
Subject: RE: CXF and large XML request/responses : streaming support?


Hi Mustafa

Happy New Year to you too :-)
Dan should be online today, so I'm hoping he will clarify. I'm off today
but will be online for a few days later this week.
I'm just wondering whether this is something the underlying container can
be configured to do: to stream back the data immediately after the
CacheAndWriteOutputStream has been given some data through its
HTTP-response-connected stream?

Is it really HTTP chunking that we are after here? CXF can be used to
configure Jetty to do it, and Tomcat should be configurable as well.
I hope Dan or Eoghan can help here.

cheers, Sergey    
 
Hi Sergey,

Happy new year and all :)
I don't see how returning a StreamingOutput object will help in our
instance. We do that at the moment for sending binary files back to the
client; however, these files are already on disk. Our problem mainly
revolves around the fact that when we have a large number of objects in
memory to marshal, the marshalling itself uses up more memory, resulting
in a constant barrage of major GCs. What we would like to do is essentially
be able to stream back the marshalled output as it is produced. So ideally,
we would somehow configure JAXB so that as it marshals an object it sends
the XML produced and then continues with the next object to marshal. This
would relieve some of the memory pressure currently being produced on our
app servers.
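
To make the idea concrete, here is a minimal sketch of writing one element at a time and flushing after each, so that XML already produced can be handed on before the next object is touched. It uses only the JDK's StAX XMLStreamWriter as a stand-in for JAXB; the class IncrementalXmlWriter, its methods, and the items/item element names are hypothetical and not part of CXF. With JAXB, the three write calls in the loop could presumably be replaced by Marshaller.marshal(object, writer) with the Marshaller.JAXB_FRAGMENT property set.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.stream.IntStream;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper: writes one <item> per record and flushes after each,
// so this code never holds more than one record's XML at a time.
final class IncrementalXmlWriter {

    static void writeItems(Iterator<String> items, OutputStream out)
            throws XMLStreamException {
        XMLStreamWriter writer =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
        writer.writeStartDocument("UTF-8", "1.0");
        writer.writeStartElement("items");
        while (items.hasNext()) {
            writer.writeStartElement("item");
            writer.writeCharacters(items.next());
            writer.writeEndElement();
            writer.flush(); // hand the bytes to the underlying stream now
        }
        writer.writeEndElement();
        writer.writeEndDocument();
        writer.close();
    }

    // Convenience for trying it out against an in-memory stream.
    static String render(int count) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Iterator<String> records =
                IntStream.range(0, count).mapToObj(i -> "record-" + i).iterator();
        try {
            writeItems(records, out);
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Because the records come from an Iterator, they can be loaded lazily rather than all held in memory. Whether the flushed bytes actually reach the client at that point still depends on any buffering in the transport underneath.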

I think the ideal solution would be something where we can configure the
CXF runtime to stream back responses as the marshalling is occurring rather
than start the response streaming after all of the objects have been
marshalled. This functionality would not be required for all methods, only
some specific ones, so returning the to-be-marshalled object in a wrapper
object could possibly also help us apply this functionality to only a
subset of our service methods rather than all.

I think another option may be to use some sort of output stream which
writes the marshalled XML to disk and then sends that back down the wire
once the marshalling is complete, thus not putting any pressure on memory.

Now having said that, I have done some testing. It turns out that
CXF may already be doing what we want (thus the memory pressure may be
caused by something in our app and not CXF/marshalling). To confirm this,
hopefully you can answer a few questions, Sergey.

It seems that when a request is being processed by the outbound
interceptors, a CacheAndWriteOutputStream is used. This seems to have two
output streams it writes to: one is the HTTP endpoint
(AbstractHttpDestination.WrappedOutputstream), and the other is an
internal output stream, initially a memory-based buffer but then
converted into a file output stream, which gets created if the amount of
data being written is over a certain threshold. This is good. I have
verified that if a large amount of data is written, the temp file is
created and the rest of the generated XML is written there. My question
remains, though, as to what happens when the write occurs on the HTTP
endpoint. As XML is being generated and CacheAndWriteOutputStream.write is
called (which calls flowThroughStream.write, with flowThroughStream being an
instance of AbstractHttpDestination.WrappedOutputstream), does this actually
get sent down the wire? It seems that I only get data visible on the client
end (via a browser) when the MessageSenderEndingInterceptor closes the
output stream (CacheAndWriteOutputStream), which in turn does the flush and
close on the AbstractHttpDestination.WrappedOutputstream and the file
output stream.
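
The behaviour described above can be pictured with a small tee-style stream. This is a sketch under assumptions and not CXF's CacheAndWriteOutputStream: the class TeeCachingOutputStream, its threshold handling, and the demoSpills helper are all hypothetical. It shows that each write is forwarded to the flow-through stream straight away, while the cached copy starts in memory and moves to a temp file once a threshold is crossed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; this is NOT the CXF implementation.
final class TeeCachingOutputStream extends OutputStream {
    private final OutputStream flowThrough; // e.g. the HTTP response stream
    private final int threshold;            // bytes kept in memory before spilling
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream cache = memory;    // current cache sink (memory or file)
    private Path tempFile;                  // null until the threshold is crossed
    private long cached;

    TeeCachingOutputStream(OutputStream flowThrough, int threshold) {
        this.flowThrough = flowThrough;
        this.threshold = threshold;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Forwarded on every write; the target may still buffer internally.
        flowThrough.write(b, off, len);
        if (tempFile == null && cached + len > threshold) {
            spillToDisk();
        }
        cache.write(b, off, len);
        cached += len;
    }

    private void spillToDisk() throws IOException {
        tempFile = Files.createTempFile("tee-cache", ".tmp");
        cache = Files.newOutputStream(tempFile);
        memory.writeTo(cache); // move what was buffered so far into the file
        memory = null;
    }

    @Override
    public void flush() throws IOException {
        flowThrough.flush();
        cache.flush();
    }

    @Override
    public void close() throws IOException {
        flush();
        flowThrough.close();
        cache.close();
    }

    boolean isSpilled() {
        return tempFile != null;
    }

    // Writes totalBytes in 16-byte pieces and reports whether the cache spilled.
    static boolean demoSpills(int totalBytes, int threshold) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        TeeCachingOutputStream tee = new TeeCachingOutputStream(sink, threshold);
        try {
            byte[] piece = new byte[16];
            for (int written = 0; written < totalBytes; written += piece.length) {
                tee.write(piece, 0, Math.min(piece.length, totalBytes - written));
            }
            tee.close();
            if (sink.size() != totalBytes) {
                throw new IllegalStateException("flow-through stream missed bytes");
            }
            if (tee.tempFile != null) {
                Files.deleteIfExists(tee.tempFile); // clean up the demo temp file
            }
            return tee.isSpilled();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the cache never delays the flow-through write, so if data only shows up on the client at close time, the buffering would have to be happening in the flow-through stream or the container below it rather than in the tee itself.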

Is this analysis correct, Sergey? Or have I missed a vital bit of info
somewhere? BTW, this is all done with enableStreaming = false, so I have
not registered my own JAXBElementProvider.

Thanks

Mustafa



-----Original Message-----
From: Sergey Beryozkin [mailto:sberyozk@progress.com]
Sent: Thursday, 24 December 2009 3:30 AM
To: users@cxf.apache.org
Subject: Re: CXF and large XML request/responses : streaming support?

Hi

>I have the need to stream large XML responses back to the client using
> JAX-RS. We have a large number of objects (potentially up to a million)
> which need to be marshalled and the response returned. Is there support for
> streaming XML responses while objects are being marshalled in CXF at the
> moment? We are currently seeing some large degradation in performance at
> times when these large numbers of entities are being marshalled.
>
> I basically return the objects which need to be marshalled from our
> service methods. What would need to change for me to be able to make use
> of the streaming support?

I can think of a few options. I do believe the CXF runtime has all that is
needed to do effective streaming back to the client, but
I will need to ask Dan for some clarifications; some updates might need to
be applied to CXF JAXRS.

As far as JAXRS itself is concerned, you might want to return an
instance of StreamingOutput from a method, or declare a JAXP Source
and actually return an instance of CXF StaxSource.
If it is JAXB that you use, then you may want to try explicitly registering
JAXBElementProvider and setting an "enableStreaming"
boolean property on it, in which case the provider will create an
XMLStreamWriter and pass it to the JAXB Marshaller. This option looks
similar to explicitly returning an instance of StaxSource.

Another option is to return a multipart-formatted response; please see:

http://cwiki.apache.org/CXF20DOC/jax-rs.html#JAX-RS-Writingattachments

Another option which might be worth evaluating is to return a list of
links back to a client (embedded in some minimal custom XML
instance) so that a client can fetch data from different links in parallel,
which might improve the overall experience. A similar
option is to update the interface to support pagination.

Let me know which option you think is best for your project,
and then we can focus on ensuring that option is supported
well by CXF JAXRS.

thanks, Sergey



>
> -----Original Message-----
> From: Sergey Beryozkin [mailto:sberyozk@progress.com]
> Sent: Friday, 9 October 2009 11:12 PM
> To: users@cxf.apache.org
> Cc: rsmith
> Subject: Re: CXF and large XML request/responses : streaming support?
>
> Hi
>
>> It is interesting, especially the StAX support. I'm not familiar with the
>> recent build of CXF; on this matter, would it also be available for the
>> JAX-RS support?
>
> I missed it... I think in the case of JAXRS, declaring a method accepting
> (JAXP) Source will work once
> I update a SourceProvider to check if an XMLStreamReader is available on
> the message (or create a new one if it is a multipart request)
> and then wrap it in StaxSource and just pass it on. This will be done for
> 2.3; if you need it working now then I can help you with
> creating a custom SourceProvider... The existing MultipartProvider will
> just delegate to it.
>
> thanks, Sergey
>
>>
>> Anyway great framework :)
>>
>>
>> On Thu, Oct 8, 2009 at 19:29, Daniel Kulp <dkulp@apache.org> wrote:
>>
>>>
>>> Right now, with a JAX-WS provider, there is SOME support for this, but
>>> it's far from ideal.   This is an area I'll be working in next week
>>> (resolving customer issues) and I'll see if I can add some enhancements
>>> easily enough.
>>>
>>> Basically, right now, if you do Provider<Source>, you would get a
>>> DOMSource in (thus, the incoming message would not be streamed), but you
>>> could return a StreamSource or SAXSource or similar that we would use to
>>> copy stuff out.    If you did Provider<StreamSource> or
>>> Provider<SAXSource>, we pull the full message into a cached stream
>>> (which, for large messages, would output to temp files on disk) and
>>> return that to you.   Thus, the whole thing isn't in memory, but it does
>>> result in the temp files and such.
>>>
>>> Part of what I hope to do next week is enable:
>>> Provider<XMLStreamReader>
>>> and/or
>>> Provider<StaxSource>
>>> which would allow full streaming in most cases.
>>>
>>> Dan
>>>
>>>
>>>
>>> On Wed October 7 2009 12:37:50 am rsmith wrote:
>>> > I'm trying to find out if CXF supports full streaming of input and
>>> > output messages for the SOAP transport.
>>> >
>>> > I have a service that will be receiving a large input XML payload, and
>>> > will be generating a response with a large XML payload.  I can process
>>> > the input XML incrementally, generating the response as the input is
>>> > processed.
>>> >
>>> > Is there a way to implement a service in CXF streaming at all levels
>>> > (XML parsing, data binding, generating response), avoiding holding the
>>> > full document in memory at any time?
>>> >
>>> > I found several threads on the mailing list, some of which make it
>>> > sound like it's supported.  This message gave me the impression it may
>>> > not currently be supported though:
>>> >
>>> > http://www.nabble.com/Re%3A-Configuring-streaming-web-services%3A-error-on-the-call-to-invoke-p24187339.html
>>> >
>>> > Some of the other threads:
>>> >
>>> > http://www.nabble.com/Looking-for-a-solution-for-Large-XML-Messages---streaming-and-JAXWS-td20451942.html#a20451942
>>> >
>>> > http://www.nabble.com/Recommended-way-to-have-a-web-method-stream-results-back-to-client--td22856243.html#a22864087
>>> >  http://www.nabble.com/SAXSource-td24411461.html#a24411461
>>> >
>>> > Thanks in advance
>>> >
>>>
>>> --
>>> Daniel Kulp
>>> dkulp@apache.org
>>> http://www.dankulp.com/blog
>>>
>>
>>
>>
>> -- 
>> Bryce
>>
>


