From: Evgeny Kotkov
Date: Fri, 5 Feb 2016 12:27:41 +0300
Subject: Re: Merging parallel-put to /trunk
To: Stefan Fuhrmann
Cc: Subversion Development
In-Reply-To: <56ACB185.4020506@apache.org>
References: <56A377B4.9090903@alice-dsl.de> <56ACB185.4020506@apache.org>

Stefan Fuhrmann writes:

> The extra temporary space is not a concern: Your server would run out of
> disk space just one equally large revision earlier than it does today.

I wouldn't say it is not a concern at all. For example, a user might be
unable to commit a 4 GB file just because doing so now requires at least
8 GB of free disk space. While it might sound like an edge case, this
could be important for some users.

> Shall I just enable the feature unconditionally?

I'm not sure about this. The feature has a price, and there are cases
where enabling parallel writes has a visible performance impact. Below
are my results for a couple of quick tests:

  (The first two tests should be reproducible, since they were performed
   on an Azure VM; the last one was done on a spinning disk in my
   environment; all tests were executed over the https:// protocol.)

  Importing 2000 files of Subversion's source code:
    22.233 s → 30.546 s (37% slower)

  Importing a 300 MB .zip file:
    36.650 s → 46.255 s (26% slower)

  Importing a 4 GB .iso file:
    159.372 s → 212.559 s (33% slower)

After giving this whole topic a second thought, I wonder whether we are
heading in the right direction. We aim for a faster svn commit over
high-latency networks. In order to achieve that, we are trying to
implement parallel PUTs, beginning from the FS layer. This leaves a
couple of questions:

(1) Why do we start by adding a quite complex FS feature, given that we
    don't know what kind of problems are associated with implementing
    this in ra_serf?

    (Can we actually do it? What can be parallelized while keeping the
     necessary order of operations on the transaction? How do we plug
     that into the commit editor? Also, HTTP/2 is currently not
     officially supported by either httpd or serf.)

(2) Is making parallel PUTs the proper way to speed up commits?

    As far as I know, squashing everything into a single POST would make
    the commit up to 10-20 times faster, depending on the amount of
    changes. Although there are associated challenges, this approach
    doesn't require us to deal with concurrency and doesn't introduce a
    dependency on httpd.

    How much faster is a commit going to be with parallel PUTs? Would it
    be at least twice as fast? Even if so, that would require us to keep
    non-trivial code that is prone to deadlocks and different types of
    race conditions. For instance, transaction.c is quite complex by
    itself and already contains a mechanism to *prevent* concurrent
    writes. Adding a layer that allows concurrent writes *on top of
    that* makes it even more complex.

So, are we sure that we need to implement it this way?


Regards,
Evgeny Kotkov
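
For anyone who wants to re-check the slowdown percentages quoted with the timings, here is a minimal sketch of the arithmetic in Python. The timings are the ones reported in this message; the helper name `slowdown_percent` and the dictionary layout are made up for illustration and are not part of any Subversion tooling.

```python
# Timings in seconds as reported in the message: (before, after).
# "before" is without parallel writes, "after" is with them enabled.
timings = {
    "2000 source files": (22.233, 30.546),
    "300 MB .zip file": (36.650, 46.255),
    "4 GB .iso file": (159.372, 212.559),
}


def slowdown_percent(before, after):
    """Relative increase in wall-clock time, in percent (hypothetical helper)."""
    return (after - before) / before * 100.0


for name, (before, after) in timings.items():
    print(f"{name}: {slowdown_percent(before, after):.0f}% slower")
```

Rounded to whole percent, this gives the same 37%, 26% and 33% figures as stated with the test results.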