hc-httpclient-users mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject Re: Cannot saturate LAN connection with HttpClient
Date Sun, 24 May 2015 12:25:49 GMT
On Sun, 2015-05-24 at 13:02 +0200, Michael Osipov wrote:
> On 2015-05-24 at 12:17, Oleg Kalnichevski wrote:
> > On Sun, 2015-05-24 at 00:29 +0200, Michael Osipov wrote:
> >> On 2015-05-23 at 22:29, Oleg Kalnichevski wrote:
> >>> On Sat, 2015-05-23 at 22:09 +0200, Michael Osipov wrote:
> >>>> Hi,
> >>>>
> >>>> we are experiencing a (slight) performance problem with HttpClient 4.4.1
> >>>> while downloading big files from a remote server in the corporate intranet.
> >>>>
> >>>> A simple test client:
> >>>> HttpClientBuilder builder = HttpClientBuilder.create();
> >>>> try (CloseableHttpClient client = builder.build()) {
> >>>>      HttpGet get = new HttpGet("...");
> >>>>      long start = System.nanoTime();
> >>>>      HttpResponse response = client.execute(get);
> >>>>      HttpEntity entity = response.getEntity();
> >>>>
> >>>>      File file = File.createTempFile("prefix", null);
> >>>>      OutputStream os = new FileOutputStream(file);
> >>>>      entity.writeTo(os);
> >>>>      long stop = System.nanoTime();
> >>>>      long contentLength = file.length();
> >>>>
> >>>>      long diff = stop - start;
> >>>>      System.out.printf("Duration: %d ms%n", TimeUnit.NANOSECONDS.toMillis(diff));
> >>>>      System.out.printf("Size: %d%n", contentLength);
> >>>>
> >>>>      float speed = contentLength / (float) diff * (1_000_000_000 / 1_000_000);
> >>>>
> >>>>      System.out.printf("Speed: %.2f MB/s%n", speed);
> >>>> }
> >>>>
> >>>> After at least 10 repetitions I see that the 182 MB file is downloaded
> >>>> within 24 000 ms at about 8 MB/s max. I cannot top that.
> >>>>
> >>>> I have tried this over and over again with curl and see that curl is
> >>>> able to saturate the entire LAN connection (100 Mbit/s).
> >>>>
> >>>> My tests are done on Windows 7 64 bit, JDK 7u67 32 bit.
> >>>>
> >>>> Any idea what the bottleneck might be?
> >>
> >> Thanks for the quick response.
> >>
> >>> (1) Curl should be using zero copy file transfer which Java blocking i/o
> >>> does not support. HttpAsyncClient on the other hand supports zero copy
> >>> file transfer and generally tends to perform better when writing content
> >>> out directly to the disk.
> >>
> >> I did try the example from [1] and my heap exploded. After increasing it to
> >> -Xmx1024M, it did saturate the entire connection.
> >>
> >
> > This sounds wrong. The example you tried [1] does not use zero copy (with
> > zero copy there should be no heap memory allocation at all).
> >
> > This example demonstrates how to use zero copy file transfer
> >
> > http://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
> 
> I have seen this example but there is no ZeroCopyGet. I haven't found
> any example which explicitly shows how to use zero copy for GETs. The example
> from [1] did work, but with the heap explosion. What did I do wrong here?
> 

Zero copy can be employed only if a message encloses an entity.
Therefore there is no such thing as ZeroCopyGet in HC. One can execute a
normal GET request and use a ZeroCopyConsumer to stream content out
directly to a file without any intermediate buffering in memory.
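Something along these lines (just a sketch; the URL, temp file and status
handling are placeholders):

CloseableHttpAsyncClient client = HttpAsyncClients.createDefault();
client.start();
try {
    File download = File.createTempFile("download", null);
    // ZeroCopyConsumer streams the response body directly into the file
    // without buffering it in heap memory
    ZeroCopyConsumer<File> consumer = new ZeroCopyConsumer<File>(download) {

        @Override
        protected File process(
                HttpResponse response, File file, ContentType contentType) throws Exception {
            if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK) {
                throw new ClientProtocolException("Download failed: " + response.getStatusLine());
            }
            return file;
        }

    };
    Future<File> future = client.execute(
            HttpAsyncMethods.createGet("http://myhost/bigfile"), consumer, null);
    File result = future.get();
    System.out.println(result.length() + " bytes written to " + result);
} finally {
    client.close();
}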


> >>> (2) Use larger socket / intermediate buffers. Default buffer size used
> >>> by Entity implementations is most likely suboptimal.
> >>
> >> That did not make any difference. I have tried the following:
> >>
> >> 1. Socket receive size
> >> 2. Employed a buffered input stream
> >> 3. Manually copied the stream to a file
> >>
> >> I have varied the buffer size from 2^14 to 2^20 bytes, to no avail.
> >> Regardless of this, your tip with zero copy helped me a lot.
> >>
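For reference, these are the client-level knobs I meant in (2); just a
sketch, the sizes are arbitrary:

SocketConfig socketConfig = SocketConfig.custom()
        .setRcvBufSize(64 * 1024)        // socket receive buffer
        .build();
ConnectionConfig connectionConfig = ConnectionConfig.custom()
        .setBufferSize(64 * 1024)        // connection's session in/out buffer
        .build();
CloseableHttpClient client = HttpClientBuilder.create()
        .setDefaultSocketConfig(socketConfig)
        .setDefaultConnectionConfig(connectionConfig)
        .build();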
> >> Unfortunately, this is just a little piece in a performance degradation
> >> chain a colleague has figured out. HttpClient acts as an intermediary in
> >> a webapp which receives a request via REST from a client, processes it
> >> and opens a stream to the huge files on a remote server. Without
> >> caching the files to disk, I am passing the Entity#getContent stream
> >> back to the client. The degradation is about 75 %.
> >>
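The pass-through described above would look roughly like this, I presume (a
hypothetical servlet-style sketch; the names and URL are placeholders, and
'client' is assumed to be a shared CloseableHttpClient instance):

protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    HttpGet get = new HttpGet("http://remotehost/bigfile");
    try (CloseableHttpResponse remote = client.execute(get)) {
        HttpEntity entity = remote.getEntity();
        long len = entity.getContentLength();
        if (len >= 0 && len <= Integer.MAX_VALUE) {
            resp.setContentLength((int) len);
        }
        // Stream the remote entity straight to the caller without caching to disk;
        // writeTo copies with the entity's default internal buffer
        entity.writeTo(resp.getOutputStream());
    }
}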
> >> After rethinking your tips, I just checked the servers I am pulling
> >> data from. One is slow, the other one is fast. The transfer speed when piping
> >> the streams from the fast server remains at 8 MB/s, which is what I wanted
> >> after I identified an issue with my custom HttpResponseInputStream.
> >>
> >> I modified my code to use the async client and it seems to pipe at
> >> maximum LAN speed, though it looks weird with curl now: curl blocks for
> >> 15 seconds and then the entire stream is written to disk within a second.
> >>
> >
> > It all sounds very bizarre. I see no reason why HttpAsyncClient without
> > zero copy transfer should do any better than HttpClient in this
> > scenario.
> 
> So you are saying something is probably wrong with my client setup?
> 

I think it is not unlikely.

Oleg




