Message-ID: <56C6C630.1030302@alice-dsl.de>
Date: Fri, 19 Feb 2016 08:37:20 +0100
From: Stefan Fuhrmann
To: Ivan Zhakov
CC: dev@subversion.apache.org
Subject: Re: svn commit: r1730617 - /subversion/trunk/subversion/libsvn_repos/log.c

On 17.02.2016 15:33, Ivan Zhakov wrote:
> On 16 February 2016 at 00:47, wrote:
>> Author: stefan2
>> Date: Mon Feb 15 21:47:00 2016
>> New Revision: 1730617
>>
>> URL: http://svn.apache.org/viewvc?rev=1730617&view=rev
>> Log:
>> Continue work on the svn_repos_get_logs4 to svn_repos_get_logs5 migration:
>> Switch the last svn_fs_paths_changed2 call to svn_fs_paths_changed3.
>>
>> * subversion/libsvn_repos/log.c
>>   (fs_mergeinfo_changed): No longer fetch the whole changes list. However,
>>    we need to iterate twice for best total performance
>>    and we need to minimize FS iterator lifetimes.
>>
>
> It seems that I would be -1 against this particular change. In the
> current implementation the svn_fs_paths_changed3() is called twice
> that in the worst case will lead to *double read from disk*.

Pick your worst-case behaviour:

(1) Crash the server with OOM.
(2) For each change in a revision, perform a 1-step history lookup
    (random I/O, about 2x as much data to read as with (3)).
(3) Read a linear file section twice.

I went with option (3). What is your preference?

> As far as I understand you're relying to the fact that the second call
> will hit the FSFS/FSX cache. But there will be a significant
> performance degradation comparing to the 1.9 implementation in the
> case of cache miss.

On a system without OS-side file cache, 'log -g' on the repository
root would take at most twice as long as in 1.9. Other operations are
not affected. Running it on any other directory will actually be
faster in many cases with 1.10 because the new history traversal code
no longer reconstructs all directories up the tree for each revision.

So, hardly anything people will complain about.

> As we are adding more and more of such code, more and more users
> become faced with the significant performance regression (see [1] and
> other cases).

If you consider [1] a significant performance regression, please
follow up on it and review the two relevant backports for 1.9.

> Do you intend to resolve this problem in the future commits? I have
> some obvious solutions in mind, but maybe you also know something
> about this.

The only reasonable alternative is to pick option (2) and hope
that the performance regression in real-life is acceptable.

> [1] http://svn.haxx.se/users/archive-2015-12/0060.shtml

-- Stefan^2.
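
For illustration only, the trade-off between options (1) and (3) can be sketched outside of Subversion. The Python below is a minimal model, not the code in log.c: the `Change` tuple, `make_change_source` and both `mergeinfo_paths_*` functions are hypothetical names invented for this sketch, and "reading from disk" is modelled by a counter. What each of the two passes does in the real fs_mergeinfo_changed is not stated in this thread, so the split used here (a detection pass, then a collection pass) is an assumption.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical change record: (path, touches_mergeinfo). Not a Subversion type.
Change = Tuple[str, bool]


def make_change_source(
    changes: List[Change],
) -> Tuple[Callable[[], Iterator[Change]], Dict[str, int]]:
    """Return a factory that re-reads the change list from the start on each
    call, plus a counter of how many records were read in total."""
    stats = {"records_read": 0}

    def open_iterator() -> Iterator[Change]:
        for change in changes:
            stats["records_read"] += 1  # stands in for one record read from disk
            yield change

    return open_iterator, stats


def mergeinfo_paths_materialized(open_iterator) -> Tuple[List[str], int]:
    """Option (1) style: fetch the whole list, then filter it.
    Reads each record once, but holds every record in memory at the same time."""
    all_changes = list(open_iterator())
    paths = [path for path, touches in all_changes if touches]
    return paths, len(all_changes)  # second value: records held at once


def mergeinfo_paths_two_pass(open_iterator) -> Tuple[List[str], int]:
    """Option (3) style: stream the list twice and keep only what is needed.
    Pass 1 checks whether any change is relevant; pass 2 collects the paths."""
    if not any(touches for _path, touches in open_iterator()):
        return [], 0
    paths = [path for path, touches in open_iterator() if touches]
    return paths, len(paths)  # only the matching paths are ever held


if __name__ == "__main__":
    # Worst case for the two-pass variant: the only relevant change is last.
    changes = [("/a", False), ("/b", False), ("/c", False), ("/d", False), ("/e", True)]

    source, stats = make_change_source(changes)
    print(mergeinfo_paths_two_pass(source), stats)

    source, stats = make_change_source(changes)
    print(mergeinfo_paths_materialized(source), stats)
```

In this model the two-pass variant never reads a record more than twice, which matches the "at most twice as long" bound given above, while its memory use is bounded by the number of matching paths rather than by the size of the revision.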