hc-httpclient-users mailing list archives

From Michael Osipov <micha...@apache.org>
Subject Re: Cannot saturate LAN connection with HttpClient
Date Sun, 24 May 2015 11:02:40 GMT
On 2015-05-24 at 12:17, Oleg Kalnichevski wrote:
> On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
>> On 2015-05-23 at 22:29, Oleg Kalnichevski wrote:
>>> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
>>>> Hi,
>>>>
>>>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
>>>> while downloading big files from a remote server in the corporate intranet.
>>>>
>>>> A simple test client:
>>>> HttpClientBuilder builder = HttpClientBuilder.create();
>>>> try (CloseableHttpClient client = builder.build()) {
>>>>      HttpGet get = new HttpGet("...");
>>>>      long start = System.nanoTime();
>>>>      HttpResponse response = client.execute(get);
>>>>      HttpEntity entity = response.getEntity();
>>>>
>>>>      File file = File.createTempFile("prefix", null);
>>>>      // Close the file stream once the entity has been written out
>>>>      try (OutputStream os = new FileOutputStream(file)) {
>>>>          entity.writeTo(os);
>>>>      }
>>>>      long stop = System.nanoTime();
>>>>      long contentLength = file.length();
>>>>
>>>>      long diff = stop - start;
>>>>      System.out.printf("Duration: %d ms%n", TimeUnit.NANOSECONDS.toMillis(diff));
>>>>      System.out.printf("Size: %d%n", contentLength);
>>>>
>>>>      float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
>>>>
>>>>      System.out.printf("Speed: %.2f MB/s%n", speed);
>>>> }
>>>>
>>>> After at least 10 repetitions I see that the 182 MB file is downloaded
>>>> within 24 000 ms at about 8 MB/s max. I cannot top that.
>>>>
>>>> I have tried this over and over again with curl and see that curl is
>>>> able to saturate the entire LAN connection (100 Mbit/s).
>>>>
>>>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
>>>>
>>>> Any idea what the bottleneck might be?
>>
>> Thanks for the quick response.
>>
>>> (1) Curl should be using zero copy file transfer which Java blocking i/o
>>> does not support. HttpAsyncClient on the other hand supports zero copy
>>> file transfer and generally tends to perform better when writing content
>>> out directly to the disk.
>>
>> I did try this [1] example and my heap exploded. After increasing it to
>> -Xmx1024M, it did saturate the entire connection.
>>
>
> This sounds wrong. The example below does not use zero copy (with zero
> copy there should be no heap memory allocation at all).
>
> This example demonstrates how to use zero copy file transfer
>
> http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java

I have seen this example but there is no ZeroCopyGet. I haven't found
any example which explicitly says to use zero-copy for GETs. The example
from [1] did work, but only with the heap explosion. What did I do wrong here?
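
For context on what "zero copy" refers to at the JDK level: the linked ZeroCopyHttpExchange example pairs a ZeroCopyPost request producer with a ZeroCopyConsumer response consumer, and it is the consumer side that writes a response body to a file. The sketch below is not the HttpAsyncClient API. It is a minimal, JDK-only illustration of the underlying primitive, FileChannel.transferFrom, which moves bytes from a channel into a file without a user-managed heap buffer. The class name ChannelTransferSketch, the method transferToFile, and the 1 MiB chunk size are hypothetical choices for illustration. Whether any copies are actually avoided depends on the channel types and the platform; for a channel wrapped around a plain InputStream, as in main here, the JDK still goes through an internal buffer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of HttpClient or HttpAsyncClient.
final class ChannelTransferSketch {

    // Upper bound on the bytes requested per transferFrom call (arbitrary choice).
    private static final long CHUNK = 1L << 20;

    /** Drains a blocking source channel into the target file; returns the bytes written. */
    static long transferToFile(ReadableByteChannel source, Path target) throws IOException {
        try (FileChannel file = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // For a blocking source, a return value of 0 means the source is exhausted.
            while ((transferred = file.transferFrom(source, position, CHUNK)) > 0) {
                position += transferred;
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ChannelTransferSketch <url> <target-file>");
            return;
        }
        // Assumption: a plain URL stream stands in for the HTTP response body.
        try (InputStream in = URI.create(args[0]).toURL().openStream();
             ReadableByteChannel source = Channels.newChannel(in)) {
            long bytes = transferToFile(source, Paths.get(args[1]));
            System.out.printf("Wrote %d bytes%n", bytes);
        }
    }
}
```

The point of the sketch is only that the application code never allocates a buffer proportional to the response size, which is why a heap blow-up suggests the body was being buffered in memory rather than transferred this way.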

>>> (2) Use larger socket / intermediate buffers. Default buffer size used
>>> by Entity implementations is most likely suboptimal.
>>
>> That did not make any difference. I have changed:
>>
>> 1. Socket receive size
>> 2. Employed a buffered input stream
>> 3. Manually copied the stream to a file
>>
>> I have varied the buffer size from 2^14 to 2^20 bytes. To no avail.
>> Regardless of this, your tip with zero copy helped me a lot.
>>
>> Unfortunately, this is just a little piece in a performance degradation
>> chain a colleague has figured out. HttpClient acts as an intermediary in
>> a webapp which receives a request via REST from a client, processes that
>> and opens up the stream to the huge files from a remote server. Without
>> caching the files to disk, I am passing the Entity#getContent stream
>> back to the client. The degradation is about 75 %.
>>
>> After rethinking your tips, I just checked the servers I am pulling
>> data from. One is slow, the other one is fast. The transfer speed when
>> piping the streams from the fast server remains at 8 MB/s, which is what
>> I wanted, after I identified an issue with my custom
>> HttpResponseInputStream.
>>
>> I modified my code to use the async client and it seems to pipe at
>> maximum LAN speed, though it looks weird with curl now. Curl blocks for
>> 15 seconds and then, within a second, the entire stream is written to disk.
>>
>
> It all sounds very bizarre. I see no reason why HttpAsyncClient without
> zero copy transfer should do any better than HttpClient in this
> scenario.

So you are saying something is probably wrong with my client setup?
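
One way to check whether the copy loop itself is the limiting factor is to time a manual stream copy at several buffer sizes, as in the 2^14 to 2^20 experiment mentioned earlier in this thread. The following is a minimal sketch using only the JDK; the class name CopyThroughputSketch, its method names, and the in-memory payload standing in for the HTTP response body are assumptions for illustration. For reference, 182 MB in 24 000 ms works out to roughly 7.6 MB/s, while a 100 Mbit/s link tops out at 12.5 MB/s before protocol overhead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only; not part of HttpClient.
final class CopyThroughputSketch {

    /** Copies everything from in to out through a buffer of the given size; returns bytes copied. */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    /** Converts a byte count and a duration in nanoseconds to MB/s (1 MB = 10^6 bytes). */
    static double megabytesPerSecond(long bytes, long nanos) {
        return (bytes / 1e6) / (nanos / 1e9);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a 16 MiB in-memory payload replaces the real response stream.
        byte[] payload = new byte[16 * 1024 * 1024];
        for (int shift = 14; shift <= 20; shift += 2) {
            int bufferSize = 1 << shift;
            long start = System.nanoTime();
            long copied = copy(new ByteArrayInputStream(payload),
                    new ByteArrayOutputStream(payload.length), bufferSize);
            long elapsed = System.nanoTime() - start;
            System.out.printf("buffer 2^%d: %d bytes, %.2f MB/s%n",
                    shift, copied, megabytesPerSecond(copied, elapsed));
        }
    }
}
```

To apply this to a real transfer, the ByteArrayInputStream would be replaced with the stream from Entity#getContent and the output with a FileOutputStream. If the numbers barely change across buffer sizes, the bottleneck is likely elsewhere, for example in the server or the network path.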

> These are micro-benchmark workers that I use to compare relative
> performance of the clients
>
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpClient.java
> http://svn.apache.org/repos/asf/httpcomponents/benchmark/httpclient/trunk/src/main/java/org/apache/http/client/benchmark/ApacheHttpAsyncClient.java
>
> Could you please run them against your test URI?

Yes, I will run them. I can only run the GET requests for now. In order 
to run them I have to modify the code and add custom HTTP headers which 
the ConfigParser does not provide at the moment.

I'll get back to you.

Michael


