Date: Wed, 7 Oct 2015 22:38:21 +0300
From: Konstantin Boudnik
To: dev@ignite.apache.org
Subject: Re: IGFS concurrency issue
Message-ID: <20151007193821.GC5947@tpx>
References: <20151007035739.GA16172@boudnik.org>

On Wed, Oct 07, 2015 at 09:11AM, Vladimir Ozerov wrote:
> Cos,
>
> Yes, no long-time locking is expected here.

Sorry, I must still be dense from the jet lag: could you give a bit more
detail?

Thanks in advance!
  Cos

> On Wed, Oct 7, 2015 at 6:57 AM, Konstantin Boudnik wrote:
>
> > IIRC NN should be locking on these ops anyway, shouldn't it?
> > The situation is no different if multiple clients are doing these
> > operations near-simultaneously. Unless I missed something here...
> >
> > On Thu, Sep 24, 2015 at 11:28AM, Sergi Vladykin wrote:
> > > Maybe just check that they are not parent-child within the tx?
> > >
> > > Sergi
> > >
> > > Igniters,
> > >
> > > We revealed a concurrency problem in IGFS and I would like to discuss
> > > possible solutions to it.
> > >
> > > Consider the following file system structure:
> > > root
> > > |-- A
> > > |   |-- B
> > > |   |   |-- C
> > > |   |-- D
> > >
> > > ... two concurrent operations in different threads:
> > > T1: move(/A/B, /A/D);
> > > T2: move(/A/D, /A/B/C);
> > >
> > > ... and how IGFS handles it now:
> > > T1: verify that "/A/B" and "/A/D" exist, that they are not child-parent
> > > to each other, etc. -> OK.
> > > T2: do the same for "/A/D" and "/A/B/C" -> OK.
> > > T1: get IDs of "/A", "/A/B" and "/A/D" to lock them later inside tx.
> > > T2: get IDs of "/A", "/A/D", "/A/B" and "/A/B/C" to lock them later
> > > inside tx.
> > >
> > > T1: Start pessimistic tx, lock IDs of "/A", "/A/B", "/A/D", perform
> > > move -> OK.
> > > root
> > > |-- A
> > > |   |-- D
> > > |   |   |-- B
> > > |   |   |   |-- C
> > >
> > > T2: Start pessimistic tx, lock IDs of "/A", "/A/D", "/A/B" and "/A/B/C"
> > > (*directory structure already changed at this time!*), perform move -> OK.
> > > root
> > > |-- A
> > > B
> > > |-- D
> > > |   |-- C
> > > |   |   |-- B (loop!)
> > >
> > > The file system is corrupted. Folders B, C and D are not reachable from
> > > root.
> > >
> > > To fix this we now additionally check whether the directory structure is
> > > still valid *inside the transaction*. It works, no more corruptions. But
> > > it requires taking locks on the whole paths *including root*. So move,
> > > delete and mkdirs operations *can no longer be concurrent*.
> > >
> > > Probably there is a way to relax this while still ensuring consistency,
> > > but I do not see how.
> > > One idea is to store the real path inside each entry. This way we will
> > > be able to ensure that it is still at a valid location without blocking
> > > parents, so concurrency will be restored. But we will have to propagate
> > > structural changes to children. E.g. a move of a folder with 100 items
> > > will lead to an update of >100 cache entries. Not so good.
> > >
> > > Any other ideas?
> > >
> > > Vladimir.
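
The interleaving described in the quoted message can be replayed in a small toy model. The sketch below is not IGFS code: the parent-pointer dictionary and the names `can_move`, `apply_move`, `reachable_from_root` and `run` are hypothetical, chosen only for illustration. It assumes both moves are validated against the initial structure and then applied one after the other, first without and then with the parent-child check repeated at apply time (standing in for the check done inside the transaction).

```python
# Toy model of the T1/T2 interleaving. Each entry stores only its parent's
# name. Everything here is hypothetical and is not the IGFS implementation.

def make_tree():
    # root -> A -> {B -> C, D}
    return {"root": None, "A": "root", "B": "A", "C": "B", "D": "A"}


def is_ancestor(parents, maybe_ancestor, node):
    """True if maybe_ancestor is node itself or lies on node's parent chain."""
    seen = set()
    while node is not None and node not in seen:
        if node == maybe_ancestor:
            return True
        seen.add(node)
        node = parents[node]
    return False


def reachable_from_root(parents, node):
    """True if following parent links from node reaches "root" (no loop)."""
    return is_ancestor(parents, "root", node)


def can_move(parents, src, dst):
    # Moving src under dst is illegal when dst is src or lies inside src's subtree.
    return not is_ancestor(parents, src, dst)


def apply_move(parents, src, dst, recheck):
    # With recheck=True the parent-child check is repeated at apply time,
    # standing in for validation performed inside the transaction.
    if recheck and not can_move(parents, src, dst):
        return False
    parents[src] = dst
    return True


def run(recheck):
    parents = make_tree()
    # Both operations validate against the same initial structure.
    assert can_move(parents, "B", "D")  # T1: move(/A/B, /A/D)
    assert can_move(parents, "D", "C")  # T2: move(/A/D, /A/B/C)
    t1 = apply_move(parents, "B", "D", recheck)
    t2 = apply_move(parents, "D", "C", recheck)
    return parents, t1, t2


if __name__ == "__main__":
    for recheck in (False, True):
        parents, t1, t2 = run(recheck)
        lost = [n for n in parents if not reachable_from_root(parents, n)]
        print(f"recheck={recheck}: T1 applied={t1}, T2 applied={t2}, unreachable={lost}")
```

Without the repeated check both moves are applied and B, C and D end up in a loop detached from root; with it, T2 is rejected and the tree stays intact. The model runs the two operations sequentially, so it says nothing about which locks a real implementation would need to hold while the check runs, which is the open question in the thread.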