Date: Sat, 10 Nov 2018 22:38:00 +0000 (UTC)
From: "Michael Osipov (JIRA)"
To: issues@maven.apache.org
Subject: [jira] [Commented] (WAGON-537) Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy

    [ https://issues.apache.org/jira/browse/WAGON-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682640#comment-16682640 ]

Michael Osipov commented on WAGON-537:
--------------------------------------

What a great improvement! I made some simple tests from my Windows box to my Nexus instance in my LAN over a gigabit link:

{noformat}
Before:
Uploaded to nexus-mika: http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.220953-5-big.bin (5.0 GB at 5.7 MB/s)
Downloaded from nexus-mika: http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.220953-5-big.bin (5.0 GB at 11 MB/s)

After:
Uploaded to nexus-mika: http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.215857-3-big.bin (5.0 GB at 20 MB/s)
Downloaded from nexus-mika: http://mika-ion:8081/nexus/content/repositories/snapshots/test/test-big/0.0.1-SNAPSHOT/test-big-0.0.1-20181110.214908-2-big.bin (5.0 GB at 83 MB/s)
{noformat}

The difference is insane! I have pushed a slightly modified branch. Any idea why upload is so much slower than download?
> Maven transfer speed of large artifacts is slow due to unsuitable buffer strategy
> ---------------------------------------------------------------------------------
>
>                 Key: WAGON-537
>                 URL: https://issues.apache.org/jira/browse/WAGON-537
>             Project: Maven Wagon
>          Issue Type: Improvement
>          Components: wagon-http, wagon-provider-api
>    Affects Versions: 3.2.0
>         Environment: Windows 10, JDK 1.8, Nexus artifact store, 100 MB/s network connection.
>            Reporter: Olaf Otto
>            Assignee: Michael Osipov
>            Priority: Major
>              Labels: perfomance
>             Fix For: 3.2.1
>
>         Attachments: wagon-issue.png
>
>
> We are using Maven for build process automation with Docker. This sometimes involves uploading and downloading artifacts a few gigabytes in size. Here, Maven's transfer speed is consistently and reproducibly slow. For instance, an artifact 7.5 GB in size took almost two hours to transfer in spite of a 100 MB/s connection and a correspondingly fast, reproducible download speed from the remote Nexus artifact repository when using a browser to download. The same is true when uploading such an artifact.
> I have investigated the issue using JProfiler. The result shows an issue in AbstractWagon's transfer( Resource resource, InputStream input, OutputStream output, int requestType, long maxSize ) method used for remote artifacts, and the same issue in AbstractHttpClientWagon#writeTo(OutputStream).
> Here, the input stream is read in a loop using a 4 KB buffer. Whenever data is received, it is pushed to downstream listeners via fireTransferProgress. These listeners (or rather consumers) perform expensive tasks.
> Now, the underlying InputStream implementation used in transfer will return from calls to read(buffer, offset, length) as soon as *some* data is available. That is, fireTransferProgress may well be invoked with an average number of bytes less than half the buffer capacity (this varies with the underlying network and hardware architecture). Consequently, fireTransferProgress is invoked *millions of times* for large files. As this is a blocking operation, the time spent in fireTransferProgress dominates and drastically slows down transfers by at least one order of magnitude.
> !wagon-issue.png!
> In our case, we found download time increased from a theoretical optimum of ~80 seconds to more than 3200 seconds.
> From an architectural perspective, I would not want to make the consumers / listeners invoked via fireTransferProgress aware of their potential impact on download speed, but rather refactor the transfer method such that it uses a buffer strategy reducing the number of fireTransferProgress invocations. This should be done with regard to the expected file size of the transfer, such that fireTransferProgress is invoked often enough but not too frequently.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
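[Editor's note] The buffer strategy described in the quoted report can be sketched as follows. This is a minimal, hypothetical illustration, not the actual WAGON-537 patch: the name fireTransferProgress comes from the issue, while the class name, the sizing heuristic (aiming for on the order of a thousand progress events per transfer, clamped between 4 KB and 512 KB), and the fill-the-buffer-before-notifying loop are assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BufferedTransferSketch {

    // Hypothetical bounds; the real implementation may choose differently.
    static final int MIN_BUFFER = 4 * 1024;
    static final int MAX_BUFFER = 512 * 1024;

    /** Pick a buffer size proportional to the expected transfer size. */
    static int bufferSize(long expectedSize) {
        if (expectedSize <= 0) {
            return MIN_BUFFER; // size unknown: fall back to the small buffer
        }
        // Aim for roughly 1000 progress events over the whole transfer.
        long candidate = expectedSize / 1000;
        return (int) Math.max(MIN_BUFFER, Math.min(MAX_BUFFER, candidate));
    }

    /** Copy in to out, firing one progress event per *full* buffer. */
    static long transfer(InputStream in, OutputStream out, long expectedSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize(expectedSize)];
        long total = 0;
        int offset = 0;
        int n;
        // Keep reading into the same buffer until it is full, so listeners
        // run once per buffer rather than once per (possibly tiny) socket read.
        while ((n = in.read(buffer, offset, buffer.length - offset)) != -1) {
            offset += n;
            total += n;
            if (offset == buffer.length) {
                out.write(buffer, 0, offset);
                fireTransferProgress(buffer, offset); // one event per full buffer
                offset = 0;
            }
        }
        if (offset > 0) { // flush the final, partially filled buffer
            out.write(buffer, 0, offset);
            fireTransferProgress(buffer, offset);
        }
        return total;
    }

    /** Placeholder for the (expensive) listener notification from the issue. */
    static void fireTransferProgress(byte[] data, int length) {
        // Listeners would be notified here.
    }
}
```

With a 5 GB transfer and an average socket read of ~2 KB, the original 4 KB strategy fires millions of events; sizing the buffer from the expected length and notifying only on full buffers caps the event count at a roughly constant number per transfer, independent of file size.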