chemistry-dev mailing list archives

From Florian Müller <f...@apache.org>
Subject Re: dotCMIS sendChunked very slow
Date Thu, 12 Nov 2015 12:23:21 GMT
Hi AJ,

Thanks for your investigations!
I will look into it and have created a JIRA issue [1] to track it.

- Florian



[1] https://issues.apache.org/jira/browse/CMIS-955



> OK, also editing atompub-writer.cs, AtomEntryWriter.Write() method to
> use a 64k buffered output stream helps that scenario (the normal
> CreateDocument scenario).
> 
>     using (BufferedStream bs = new BufferedStream(outStream, 64 * 1024))
>     {
>         using (XmlWriter writer = XmlWriter.Create(bs, xmlWriterSettings))
>         {
>             // start doc
>             ...
> 
> We are running some additional tests, but adding these (increased)
> buffers seems to dramatically increase performance in all of our test
> cases so far.
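The shape of that AtomEntryWriter edit can be demonstrated with plain BCL types (a self-contained sketch, not the DotCMIS source; the element names here are illustrative): a 64k BufferedStream sits between the XmlWriter and the output stream, so the writer's many small writes reach the network in large blocks.

```csharp
using System;
using System.IO;
using System.Xml;

class BufferedXmlDemo
{
    static void Main()
    {
        using (var output = new MemoryStream())
        {
            // Same shape as the AtomEntryWriter edit: a 64k BufferedStream
            // between the XmlWriter and the (in a real run, network) stream.
            using (var bs = new BufferedStream(output, 64 * 1024))
            using (var writer = XmlWriter.Create(bs, new XmlWriterSettings()))
            {
                writer.WriteStartElement("entry", "http://www.w3.org/2005/Atom");
                writer.WriteElementString("title", "demo");
                writer.WriteEndElement();
            }
            // BufferedStream.Dispose closed the MemoryStream; ToArray still works.
            Console.WriteLine(output.ToArray().Length > 0 ? "ok" : "empty");
        }
    }
}
```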
> 
> Please do let me know if I can provide any further details for the
> benefit of DotCMIS or PortCMIS!
> 
> -AJ
> 
> 
> 
> On 11/11/2015 2:34 PM, AJ Weber wrote:
>> So that edit works well with the empty-document-then-set-content 
>> method.  It does not change the behavior of trying to send the 
>> ContentStream as part of the CreateDocument method. :(
>> 
>> Wireshark consistently shows the chunk size when using the "one step 
>> create document" as 6143 bytes.  I'm trying to figure out how to 
>> increase that chunk size, but haven't found it yet.  Can you think of 
>> where I would increase the chunk size/buffer being streamed to the 
>> HttpRequest?
>> 
>> Thanks again,
>> AJ
>> 
>> On 11/11/2015 2:20 PM, AJ Weber wrote:
>>> Thank you for the quick reply! :)
>>> 
>> We monitored the client memory and found the same thing as your first 
>> comment.  It's not scalable.
>>> 
>>> We tried wrapping the ContentStream's Stream (which was a FileStream) 
>>> in a BufferedStream and set the buffer to the same size as the 
>>> AtomPub Writer's (64k).  This had no effect.
>>> 
>>> We then tried the option of creating an empty document and following 
>>> that with a SetContentStream().  I was shocked, but this didn't 
>>> change the speed either.  (I expected the removal of the Base64 
>>> encoding to at least help a little.)
>>> 
>>> The last option isn't an option for us at this time, unfortunately.
>>> 
>>> ******
>>> Ah, we made one final change that appears to have helped!  In 
>>> atompub.cs, I made the following edit:
>>> 
>>>     HttpUtils.Output output = delegate(Stream stream)
>>>     {
>>>         /*
>>>         int b;
>>>         byte[] buffer = new byte[4096];
>>>         while ((b = contentStream.Stream.Read(buffer, 0, buffer.Length)) > 0)
>>>         {
>>>             stream.Write(buffer, 0, b);
>>>         }
>>>         */
>>> 
>>>         // AJW: .NET 4 Stream.CopyTo to simplify and optimize (hopefully?)
>>>         contentStream.Stream.CopyTo(stream, 64 * 1024);
>>> 
>>>         contentStream.Stream.Close();
>>>     };
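For what it's worth, the replaced loop and CopyTo are behaviorally equivalent; the speed-up comes from the larger buffer, not from CopyTo itself. A standalone check (BCL streams only, no DotCMIS types):

```csharp
using System;
using System.IO;

class CopyToDemo
{
    static void Main()
    {
        byte[] data = new byte[200 * 1024];
        new Random(1).NextBytes(data);

        // Old approach: manual 4 KB read loop.
        var loopOut = new MemoryStream();
        var src = new MemoryStream(data);
        int b;
        byte[] buffer = new byte[4096];
        while ((b = src.Read(buffer, 0, buffer.Length)) > 0)
        {
            loopOut.Write(buffer, 0, b);
        }

        // New approach: CopyTo with a 64 KB buffer, as in AJ's edit.
        var copyOut = new MemoryStream();
        new MemoryStream(data).CopyTo(copyOut, 64 * 1024);

        bool same = Convert.ToBase64String(loopOut.ToArray())
                    == Convert.ToBase64String(copyOut.ToArray());
        Console.WriteLine(same ? "identical" : "different");
    }
}
```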
>>> 
>>> As I said, this seems to have finally increased the chunk size going 
>>> across the wire (according to Wireshark) and has sped up our uploads. 
>>> I will confirm with some additional, high-level metrics.
>>> 
>>> Thanks,
>>> AJ
>>> 
>>> 
>>> On 11/11/2015 12:25 PM, Florian Müller wrote:
>>>> Hi AJ,
>>>> 
>>>> If you disable chunking, the whole content is buffered in client 
>>>> memory before it is sent. That works well for small documents but 
>>>> may fail for large ones: you can run out of memory.
>>>> Chunking adds some overhead, but not that much.
>>>> 
>>>> Have you tried wrapping the content stream that you provide to 
>>>> DotCMIS into a BufferedStream (try big buffer sizes)? That may 
>>>> reduce the number of chunks and increase the performance.
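As a self-contained illustration of the wrapping Florian suggests (BCL streams stand in for the FileStream and the DotCMIS internals here), a BufferedStream with a large buffer serves the library's many small reads from memory instead of hitting the underlying stream each time:

```csharp
using System;
using System.IO;

class BufferedWrapDemo
{
    static void Main()
    {
        byte[] payload = new byte[300 * 1024];       // stand-in for the file content
        new Random(42).NextBytes(payload);

        // Source stream, as DotCMIS would receive it (a FileStream in the thread).
        using (var source = new MemoryStream(payload))
        // Wrap it with a big buffer, so downstream 4 KB reads hit memory.
        using (var buffered = new BufferedStream(source, 64 * 1024))
        using (var destination = new MemoryStream())
        {
            buffered.CopyTo(destination, 4 * 1024);  // small chunks, cheap reads
            Console.WriteLine(destination.Length == payload.Length ? "ok" : "mismatch");
        }
    }
}
```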
>>>> 
>>>> If your content is really big, you may want to consider creating an 
>>>> empty document first and then upload the content afterwards with 
>>>> SetContentStream. When the content is sent with CreateDocument() it 
>>>> must be Base64 encoded. SetContentStream() can send the plain bytes. 
>>>> That can make a big performance difference.
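The Base64 point is easy to quantify: the encoding maps every 3 input bytes to 4 output characters, so CreateDocument() sends roughly a third more data than SetContentStream() for the same content. A quick check:

```csharp
using System;

class Base64Overhead
{
    static void Main()
    {
        byte[] raw = new byte[3 * 1024 * 1024];      // 3 MiB of content
        int encodedChars = Convert.ToBase64String(raw).Length;

        // 4 chars per 3 bytes: 3 MiB encodes to 4 MiB, ~33% overhead.
        Console.WriteLine(encodedChars);             // 4194304
    }
}
```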
>>>> 
>>>> If your server also understands the CMIS Browser Binding, you can try 
>>>> PortCMIS [1]. There is no official release yet, but the Browser 
>>>> Binding implementation is working and faster than the AtomPub 
>>>> implementation of DotCMIS.
>>>> 
>>>> 
>>>> - Florian
>>>> 
>>>> 
>>>> [1] https://chemistry.apache.org/dotnet/portcmis.html
>>>> 
>>>> 
>>>> 
>>>>> We have noticed some significant performance problems storing large
>>>>> content to our repository.
>>>>> 
>>>>> After wading through some of the dotCMIS code, we tested setting the
>>>>> HttpWebRequest's SendChunked = false (as counterintuitive as it may
>>>>> seem).
>>>>> 
>>>>> Setting this to false increased content transfer upload speeds by
>>>>> roughly 300% in all tests we have tried!
>>>>> 
>>>>> We are struggling with this scenario now.  It seems entirely
>>>>> backwards to disable chunked encoding, but we can't argue with the
>>>>> performance numbers.  We are testing against a Tomcat 7 based CMIS
>>>>> provider/service (using the HTTP 1.1 NIO connector).  Inserting
>>>>> Fiddler's proxy seems to automatically revert the upload to the
>>>>> HTTP 1.0 standard (no chunking), so it shows us the faster
>>>>> performance.  We are still trying to understand the Wireshark
>>>>> captures better.
>>>>> 
>>>>> Just wondering if anyone else has tested this or has any different
>>>>> results on different CMIS providers?
>>>>> 
>>>>> Are there any known tweaks on the Tomcat side to better facilitate
>>>>> the chunked transfer uploads?
>>>>> 
>>>>> What are the impacts of disabling chunking at the dotCMIS client
>>>>> side?
>>>>> 
>>>>> Thanks for any insight.
>>>>> 
>>>>> -AJ
>>>> 
>>> 
>> 

