From: Andy Seaborne
Date: Thu, 14 Mar 2013 13:43:26 +0000
To: dev@clerezza.apache.org
Subject: Re: ConcurrentModicationException on TDB storage provider (SingleDataset)
Message-ID: <5141D3FE.3040805@apache.org>
In-Reply-To: <51419AE4.6070600@xup.nl>
References: <5140A9CD.1010208@xup.nl> <51418F5B.60002@xup.nl> <51419AE4.6070600@xup.nl>

On 14/03/13 09:39, Minto van der Sluis wrote:
> Rupert,
>
> Thanks for the additional explanation.
>
> Regards,
>
> Minto
>
> On 14-3-2013 10:31, Rupert Westenthaler wrote:
>> Hi Minto
>>
>> I am traveling this week and do not have time to work on this until
>> the weekend, but I will have a look into this.
>>
>> Let me try to explain my concern again and make it clearer:
>>
>> The Jena TDB named graphs are held in a single quad store table (SPOC
>> - Subject Predicate Object Context). On the Clerezza side you have
>> TripleCollections (SPO) with a name (C). What that means is that all
>> Clerezza TripleCollections provided by the same
>> SingleTdbDatasetTcProvider share the same SPOC table, meaning that
>> a change to any of those TripleCollections will cause a modification
>> in the Jena TDB backend.
>> This means that iterators of all TripleCollections need to acquire a
>> read lock on the SPOC table (and not only on the SPO section
>> represented by the TripleCollection).
>>
>> While Clerezza allows building a LockableMGraphWrapper over an MGraph,
>> this is not sufficient for the SingleTdbDatasetTcProvider, as it will
>> only protect the SPO section and not the SPOC table used by the
>> backend. So changes in other graphs - or the creation of a new graph -
>> are still possible and will cause ConcurrentModificationExceptions as
>> reported.
>>
>> To solve this issue one needs to ensure that a single ReadWrite lock
>> is used for all TripleCollections provided by the
>> SingleTdbDatasetTcProvider, as this will allow users to lock the whole
>> SPOC table of the backend when they perform operations on the Clerezza
>> TripleCollections.

A TDB dataset provides a single Lock you can reuse/wrap so that all the
graph locks are related when needed. The lock returned by
GraphTDB.getLock() is the dataset lock.

Transactions would be better: they give better concurrency (a concurrent
writer and multiple readers).

	Andy

>>
>> best
>> Rupert
>>
>>
>> On Thu, Mar 14, 2013 at 9:50 AM, Minto van der Sluis wrote:
>>> Hi,
>>>
>>> Half of what the 2 of you write is not very clear to me. Probably due to
>>> being a novice when it comes to Clerezza internals.
>>>
>>> Maybe I will start with giving CLEREZZA-726 another try and then check
>>> if I still get these exceptions.
>>>
>>> Regards,
>>>
>>> Minto
>>>
>>> On 13-3-2013 18:35, Reto Bachmann-Gmür wrote:
>>>> On Wed, Mar 13, 2013 at 6:04 PM, Rupert Westenthaler <
>>>> rupert.westenthaler@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I think that this is caused by the fact that if you create a
>>>>> LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
>>>>> you end up in a situation where you have multiple ReadWrite locks on
>>>>> the same quad store (the Jena TDB dataset).
>>>>> This means that acquiring
>>>>> a write lock on one MGraph will not prohibit changes in other graphs -
>>>>> or the creation of new graphs. Because of that you will end up with a
>>>>> ConcurrentModificationException when using iterators over triples
>>>>> (such as when going over SPARQL results).
>>>>>
>>>> True. But where is the graph locked in the first place? It should acquire a
>>>> lock before iterating through the graph; does this happen?
>>>>
>>>> cheers,
>>>> reto
>>>>
>>>>> The solution would be to
>>>>>
>>>>> * create a single ReadWrite lock for the SingleTdbDatasetTcProvider
>>>>> * replace all synchronized(dataset){..} blocks with read/write locks
>>>>> * all methods returning MGraphs need to return LockableMGraph
>>>>> instances that use the ReadWrite lock used by the
>>>>> SingleTdbDatasetTcProvider
>>>>> * users would then need to use the LockableMGraph instance provided by
>>>>> the provider and NOT wrap it with another LockableMGraph instance
>>>>> (e.g. the LockableMGraphWrapper).
>>>>>
>>>>> best
>>>>> Rupert
>>>>>
>>>>>
>>>>> On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis wrote:
>>>>>> Hi Folks,
>>>>>>
>>>>>> I ran into an issue in both the existing SingleTdbDatasetTcProvider and
>>>>>> my customized version (see CLEREZZA-736).
>>>>>>
>>>>>> How to reproduce:
>>>>>> 1) Have some process constantly inject new named graphs (I had a process
>>>>>> injecting 1000 named graphs)
>>>>>> 2) Perform a query while 1 is still running. I used the following query:
>>>>>>
>>>>>> SELECT ?graphName WHERE { GRAPH ?graphName {} } LIMIT 10 OFFSET 0
>>>>>>
>>>>>> 3) Repeat step 2 a number of times (since the error does not always
>>>>>> occur)
>>>>>>
>>>>>> This results in a ConcurrentModificationException (see stacktrace
>>>>>> below). I am not sure whether this is a Clerezza or Jena issue.
>>>>>>
>>>>>> Does anyone have an idea what is causing this? Or, more importantly,
>>>>>> how to fix it?
>>>>>>
>>>>>> Should I create a Jira issue for this?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> --
>>>>>> ir. ing. Minto van der Sluis
>>>>>> Software innovator / renovator
>>>>>> Xup BV
>>>>>>
>>>>>>
>>>>>> Stacktrace:
>>>>>> java.util.ConcurrentModificationException: Iterator: started at 7103, now 7105
>>>>>>     at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
>>>>>>     at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
>>>>>>     at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
>>>>>>     at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:118)
>>>>>>     at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>>>>>>     at com.hp.hpl.jena.tdb.store.GraphTDBBase$ProjectQuadsToTriples.hasNext(GraphTDBBase.java:173)
>>>>>>     at com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
>>>>>>     at org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor$1.hasNext(JenaGraphAdaptor.java:106)
>>>>>>     at org.apache.clerezza.rdf.core.impl.AbstractTripleCollection$1.hasNext(AbstractTripleCollection.java:78)
>>>>>>     at org.apache.clerezza.rdf.core.access.LockingIterator.hasNext(LockingIterator.java:47)
>>>>>>     at org.apache.clerezza.rdf.jena.facade.JenaGraph$1.hasNext(JenaGraph.java:95)
>>>>>>     at com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:151)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterBlockTriples.hasNextBinding(QueryIterBlockTriples.java:64)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>>>>>     at com.hp.hpl.jena.sparql.engine.main.iterator.QueryIterGraph$QueryIterGraphInner.hasNextBinding(QueryIterGraph.java:123)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:59)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
>>>>>>     at com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>>>>>     at com.hp.hpl.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:72)
>>>>>>     at org.apache.clerezza.rdf.jena.sparql.ResultSetWrapper.<init>(ResultSetWrapper.java:39)
>>>>>>     at org.apache.clerezza.rdf.jena.sparql.JenaSparqlEngine.execute(JenaSparqlEngine.java:68)
>>>>>>     at org.apache.clerezza.rdf.core.access.TcManager.executeSparqlQuery(TcManager.java:272)
>>>>>> ...
>>>>>>
>>>>>
>>>>> --
>>>>> | Rupert Westenthaler rupert.westenthaler@gmail.com
>>>>> | Bodenlehenstraße 11 ++43-699-11108907
>>>>> | A-5500 Bischofshofen
>>>>>
>>>
>>> --
>>> ir. ing. Minto van der Sluis
>>> Software innovator / renovator
>>> Xup BV
>>>
>>> Mobile: +31 (0) 626 014541
>>>
>>
>>
>> --
>> | Rupert Westenthaler rupert.westenthaler@gmail.com
>> | Bodenlehenstraße 11 ++43-699-11108907
>> | A-5500 Bischofshofen
>>
>>
>
>
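
The fix proposed in the thread (one ReadWrite lock owned by the provider and shared by every graph it returns) can be sketched in plain Java. This is only an illustration under stated assumptions, not Clerezza or Jena code: SharedLockProvider, GraphView, and the in-memory map standing in for the SPOC table are hypothetical names invented for the example, and only java.util.concurrent is used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a provider whose named graphs share one quad table. */
final class SharedLockProvider {
    // One lock for the whole "SPOC table"; every graph view reuses this instance.
    private final ReadWriteLock datasetLock = new ReentrantReadWriteLock();
    // Toy quad store: graph name -> triples (kept as plain strings for brevity).
    private final Map<String, List<String>> quads = new HashMap<>();

    /** A named-graph view that never creates a lock of its own. */
    final class GraphView {
        private final String name;

        GraphView(String name) { this.name = name; }

        /** Exposes the provider-wide lock so callers can guard longer operations. */
        ReadWriteLock getLock() { return datasetLock; }

        /** Adds a triple (and creates the graph if needed) under the shared write lock. */
        void add(String triple) {
            datasetLock.writeLock().lock();
            try {
                quads.computeIfAbsent(name, k -> new ArrayList<>()).add(triple);
            } finally {
                datasetLock.writeLock().unlock();
            }
        }

        /** Copies this graph's triples while holding the shared read lock. */
        List<String> snapshot() {
            datasetLock.readLock().lock();
            try {
                return new ArrayList<>(quads.getOrDefault(name, new ArrayList<>()));
            } finally {
                datasetLock.readLock().unlock();
            }
        }
    }

    GraphView getGraph(String name) { return new GraphView(name); }

    /** Lists graph names under the read lock, so no graph is created meanwhile. */
    List<String> listGraphNames() {
        datasetLock.readLock().lock();
        try {
            return new ArrayList<>(quads.keySet());
        } finally {
            datasetLock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        SharedLockProvider provider = new SharedLockProvider();
        GraphView a = provider.getGraph("urn:g:a");
        GraphView b = provider.getGraph("urn:g:b");
        a.add("s p o");
        b.add("s2 p2 o2");
        // Both views hand back the very same lock object.
        System.out.println("same lock: " + (a.getLock() == b.getLock()));
        System.out.println("graphs: " + provider.listGraphNames());
    }
}
```

Because every view returns the provider's lock, a reader iterating one graph also blocks writers to all other graphs and the creation of new ones, which is the property the thread says is missing when each MGraph is wrapped in its own LockableMGraphWrapper. With TDB itself, Andy's reply suggests reusing the dataset lock obtained from GraphTDB.getLock() instead of creating a new lock, or moving to transactions for better concurrency.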