hc-httpclient-users mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject Re: Cannot saturate LAN connection with HttpClient
Date Sun, 24 May 2015 10:17:52 GMT
On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
> On 2015-05-23 at 22:29, Oleg Kalnichevski wrote:
> > On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
> >> Hi,
> >>
> >> we are experiencing a (slight) performance problem with HttpClient 4.4.1
> >> while downloading big files from a remote server in the corporate intranet.
> >>
> >> A simple test client:
> >> HttpClientBuilder builder = HttpClientBuilder.create();
> >> try (CloseableHttpClient client = builder.build()) {
> >>     HttpGet get = new HttpGet("...");
> >>     long start = System.nanoTime();
> >>     HttpResponse response = client.execute(get);
> >>     HttpEntity entity = response.getEntity();
> >>
> >>     File file = File.createTempFile("prefix", null);
> >>     // Close the stream so everything is flushed before the size is read
> >>     try (OutputStream os = new FileOutputStream(file)) {
> >>         entity.writeTo(os);
> >>     }
> >>     long stop = System.nanoTime();
> >>     long contentLength = file.length();
> >>
> >>     long diff = stop - start;
> >>     System.out.printf("Duration: %d ms%n", TimeUnit.NANOSECONDS.toMillis(diff));
> >>     System.out.printf("Size: %d%n", contentLength);
> >>
> >>     float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
> >>
> >>     System.out.printf("Speed: %.2f MB/s%n", speed);
> >> }
> >>
> >> After at least 10 repetitions I see that the 182 MB file is downloaded
> >> within 24 000 ms at about 8 MB/s max. I cannot top that.
> >>
> >> I have tried this over and over again with curl and see that curl is
> >> able to saturate the entire LAN connection (100 Mbit/s).
> >>
> >> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
> >>
> >> Any idea what the bottleneck might be?
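
As a sanity check on the figures quoted above, the units can be worked through in a few lines. This is only a sketch: the class and method names are made up, and it assumes that "182 MB" means 182 * 10^6 bytes (the exact file size is not given), which matches the 10^6-based MB/s the test client prints. Under that assumption the average comes out at roughly 7.6 MB/s, or about 61 Mbit/s, and the reported peak of 8 MB/s is 64 Mbit/s of a 100 Mbit/s (12.5 MB/s) link, before any protocol overhead.

```java
// Hypothetical helper for illustration only; not part of the test client.
final class Throughput {

    // Same unit convention as the test client: 1 MB = 10^6 bytes.
    static double megabytesPerSecond(long bytes, long nanos) {
        return (bytes / 1e6) / (nanos / 1e9);
    }

    // 1 byte = 8 bits, so MB/s * 8 = Mbit/s.
    static double megabitsPerSecond(double megabytesPerSecond) {
        return megabytesPerSecond * 8.0;
    }

    public static void main(String[] args) {
        // Assumption: "182 MB" is 182 * 10^6 bytes; 24 000 ms is 24 * 10^9 ns.
        double average = megabytesPerSecond(182_000_000L, 24_000_000_000L);
        System.out.printf("Average: %.2f MB/s = %.1f Mbit/s%n",
                average, megabitsPerSecond(average));
        System.out.printf("Reported peak: 8 MB/s = %.0f Mbit/s%n",
                megabitsPerSecond(8.0));
        System.out.printf("Line rate: 100 Mbit/s = %.1f MB/s%n", 100 / 8.0);
    }
}
```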
> 
> Thanks for the quick response.
> 
> > (1) Curl should be using zero copy file transfer which Java blocking i/o
> > does not support. HttpAsyncClient on the other hand supports zero copy
> > file transfer and generally tends to perform better when writing content
> > out directly to the disk.
> 
> I did try this [1] example and my heap exploded. After increasing it to
> -Xmx1024M, it did saturate the entire connection.
> 

This sounds wrong. The example you referenced [1] does not use zero copy
(with zero copy there should be no heap memory allocation at all).

This example demonstrates how to use zero copy file transfer:

http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
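
For readers who cannot open the link, the general idea can be illustrated with plain JDK NIO. This is a sketch and not the code from the linked example: `ChannelToFile`, `drainTo` and the chunk size are hypothetical. Whether the transfer really avoids intermediate copies depends on the type of the source channel and on the operating system. The point is only that data moves from a channel into a `FileChannel` without the application allocating its own byte array.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper; not part of HttpClient or HttpAsyncClient.
final class ChannelToFile {

    // Upper bound requested per transferFrom call (an arbitrary choice).
    private static final long CHUNK = 1L << 20;

    /** Drains a blocking source channel into the target file; returns bytes written. */
    static long drainTo(ReadableByteChannel source, Path target) throws IOException {
        try (FileChannel out = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long n;
            // With a blocking source, a return value of 0 is treated as end of stream.
            while ((n = out.transferFrom(source, position, CHUNK)) > 0) {
                position += n;
            }
            return position;
        }
    }
}
```

In a real client the source would be the channel carrying the response body, and the heap usage should stay flat regardless of the file size.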

> > (2) Use larger socket / intermediate buffers. Default buffer size used
> > by Entity implementations is most likely suboptimal.
> 
> That did not make any difference. I have changed:
> 
> 1. Socket receive size
> 2. Employed a buffered input stream
> 3. Manually copied the stream to a file
> 
> I have varied the buffer size from 2^14 to 2^20 bytes. To no avail.
> Regardless of this, your tip with zero copy helped me a lot.
> 
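
For reference, a manual copy with a configurable buffer, as described in the list above, might look roughly like this. It is a guess at the shape of such code, not what was actually run: `StreamCopy`, `copy` and the `bufferSize` parameter are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only.
final class StreamCopy {

    /** Copies 'in' to 'out' through a buffer of the given size; returns bytes copied. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // read() returns -1 at end of stream; it may return fewer bytes than requested.
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }
}
```

With HttpClient, `in` would be the stream returned by `entity.getContent()` and `out` a `FileOutputStream`, with `bufferSize` varied between `1 << 14` and `1 << 20` as in the tests described above.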
> Unfortunately, this is just one small piece in a performance degradation
> chain a colleague has figured out. HttpClient acts as an intermediary in
> a webapp which receives a request via REST from a client, processes it,
> and opens up the stream to the huge files from a remote server. Without
> caching the files to disk, I am passing the Entity#getContent stream
> back to the client. The degradation is about 75 %.
> 
> After rethinking your tips, I just checked the servers I am pulling
> data from. One is slow, the other one is fast. Transfer speed when piping
> the streams from the fast server remains at 8 MB/s, which is what I wanted,
> after I identified an issue with my custom HttpResponseInputStream.
> 
> I modified my code to use the async client and it seems to pipe at
> maximum LAN speed, though it looks weird with curl now. Curl blocks for
> 15 seconds and then within a second the entire stream is written to disk.
> 

It all sounds very bizarre. I see no reason why HttpAsyncClient without
zero copy transfer should do any better than HttpClient in this
scenario.

These are micro-benchmark workers that I use to compare relative
performance of the clients
 
http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
  

Could you please run them against your test URI?

Oleg 


