clerezza-dev mailing list archives

From Andy Seaborne <a...@apache.org>
Subject Re: Problem with SingleTdbDatasetProvider
Date Wed, 04 Dec 2013 21:40:50 GMT
On 04/12/13 11:23, Reto Bachmann-Gmür wrote:
> On Tue, Dec 3, 2013 at 7:20 PM, Andy Seaborne <andy@apache.org> wrote:
>
>> On 03/12/13 15:13, Reto Bachmann-Gmür wrote:
>>
>>> So here's some code to reproduce the ConcurrentModificationException
>>> directly with Jena (using transactions):
>>>
>>>           String directory = "target/Dataset1";
>>>           Dataset dataset = TDBFactory.createDataset(directory);
>>>           {
>>>               dataset.begin(ReadWrite.WRITE);
>>>               Model foo1 = ModelFactory.createDefaultModel();
>>>               foo1.add(RDFS.Class, RDF.type, RDFS.Class);
>>>               foo1.add(RDFS.Class, RDF.type, RDFS.Resource);
>>>               dataset.addNamedModel(URNXFOO1, foo1);
>>>               dataset.commit();
>>>               dataset.end();
>>>           }
>>>           {
>>>               dataset.begin(ReadWrite.WRITE);
>>>               Model foo2 = ModelFactory.createDefaultModel();
>>>               dataset.addNamedModel(URNXFOO2, foo2);
>>>               dataset.commit();
>>>               dataset.end();
>>>           }
>>>           {
>>>               dataset.begin(ReadWrite.WRITE);
>>>               Model foo1 = dataset.getNamedModel(URNXFOO1);
>>>               Model foo2 = dataset.getNamedModel(URNXFOO2);
>>>               StmtIterator iter = foo1.listStatements();
>>>               while (iter.hasNext()) {
>>>                   foo2.add(iter.nextStatement());
>>>               }
>>>
>>
>> You are modifying the dataset while iterating over it.
>>
>> Model foo2 is just a view of the dataset, not an isolated copy.  'fraid
>> you can't modify the dataset and get a consistent view for the iterator at
>> the same time.  The code just happens to check otherwise there would be
>> non-deterministic behaviour.  (The check isn't perfect BTW.)
>>
>
> I don't see why Jena can't be a bit more selective: an addition to another
> graph cannot possibly affect the current iterator. The above is what
> happens when addAll is invoked on a Clerezza graph. I see that for
> similar methods the Jena implementation copies the graph to be added to
> memory first; it would be good to have a more efficient solution.

If you want independence, put each graph one-by-one in a general dataset.

Adding to a graph affects all the quad indexes, SPOG for example.

The quads of all graphs are mixed together in those indexes.

	Andy
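
To illustrate the point about shared quad indexes, here is a minimal sketch in plain Java. It is not Jena or TDB code: the class SharedQuadStoreSketch and the helpers quad, newStore, find, copyLive, copySnapshot and count are hypothetical names, and a single ArrayList stands in for one quad index. It assumes only that the graph view filters a fail-fast iterator over the shared index, which is enough to show why adding to a second graph can invalidate an iterator over the first, and how taking a snapshot first avoids that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a quad store; this is NOT Jena or TDB code.
class SharedQuadStoreSketch {

    // A quad is [subject, predicate, object, graph]; all graphs share one index.
    static List<String> quad(String s, String p, String o, String g) {
        return Arrays.asList(s, p, o, g);
    }

    // Builds a store holding two quads in graph urn:x-foo:1, as in the thread.
    static List<List<String>> newStore() {
        List<List<String>> index = new ArrayList<>();
        index.add(quad("rdfs:Class", "rdf:type", "rdfs:Class", "urn:x-foo:1"));
        index.add(quad("rdfs:Class", "rdf:type", "rdfs:Resource", "urn:x-foo:1"));
        return index;
    }

    // A graph "view": lazily filters the shared index instead of copying it.
    static Iterator<List<String>> find(List<List<String>> index, String graph) {
        final Iterator<List<String>> all = index.iterator();
        return new Iterator<List<String>>() {
            private List<String> pending = advance();

            private List<String> advance() {
                while (all.hasNext()) {
                    List<String> q = all.next(); // fail-fast check happens here
                    if (q.get(3).equals(graph)) return q;
                }
                return null;
            }

            @Override public boolean hasNext() { return pending != null; }

            @Override public List<String> next() {
                if (pending == null) throw new NoSuchElementException();
                List<String> result = pending;
                pending = advance();
                return result;
            }
        };
    }

    // Adds to graph 'to' while still iterating graph 'from' over the same index.
    static void copyLive(List<List<String>> index, String from, String to) {
        Iterator<List<String>> it = find(index, from);
        while (it.hasNext()) {
            List<String> q = it.next();
            index.add(quad(q.get(0), q.get(1), q.get(2), to));
        }
    }

    // Drains the iterator into memory first, then modifies the index.
    static void copySnapshot(List<List<String>> index, String from, String to) {
        List<List<String>> snapshot = new ArrayList<>();
        Iterator<List<String>> it = find(index, from);
        while (it.hasNext()) snapshot.add(it.next());
        for (List<String> q : snapshot) {
            index.add(quad(q.get(0), q.get(1), q.get(2), to));
        }
    }

    // Counts the quads currently stored in the given graph.
    static int count(List<List<String>> index, String graph) {
        int n = 0;
        for (Iterator<List<String>> it = find(index, graph); it.hasNext(); it.next()) n++;
        return n;
    }

    public static void main(String[] args) {
        try {
            copyLive(newStore(), "urn:x-foo:1", "urn:x-foo:2");
            System.out.println("live copy finished without an exception");
        } catch (ConcurrentModificationException e) {
            System.out.println("live copy failed: " + e);
        }
        List<List<String>> store = newStore();
        copySnapshot(store, "urn:x-foo:1", "urn:x-foo:2");
        System.out.println("quads in urn:x-foo:2: " + count(store, "urn:x-foo:2"));
    }
}
```

The snapshot variant is the in-memory copy discussed in this thread, so it trades memory for safety. Keeping each graph in its own independent storage would avoid the shared index altogether.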

>
>
>>
>> This does it as well - and no transactions.
>>
>>      public static void main1() throws Exception {
>>          String directory = "target/Dataset1";
>>          String URNXFOO1 = "urn:x-foo:1" ;
>>          String URNXFOO2 = "urn:x-foo:2" ;
>>
>>          Dataset dataset = TDBFactory.createDataset();
>>          dataset.getNamedModel(URNXFOO1).add(RDFS.Class, RDF.type,
>> RDFS.Class);
>>          dataset.getNamedModel(URNXFOO1).add(RDFS.Class, RDF.type,
>> RDFS.Resource);
>>
>>
>>          Model foo1 = dataset.getNamedModel(URNXFOO1);
>>          Model foo2 = dataset.getNamedModel(URNXFOO2);
>>          StmtIterator iter = foo1.listStatements();
>>          while (iter.hasNext()) {
>>              foo2.add(iter.nextStatement());
>>          }
>>      }
>>
>
> That's more elegant ;)
>
> Reto
>
>>
>>
>>
>>          Andy
>>
>>
>>>               dataset.commit();
>>>               dataset.end();
>>>           }
>>>
>>>
>>> On Wed, Oct 30, 2013 at 2:42 PM, Reto Bachmann-Gmür <reto@wymiwyg.com> wrote:
>>>
>>>   Hi Minto
>>>>
>>>> Interesting problem you have there.
>>>>
>>>>>
>>>>>
>>>> Thanks ;)
>>>>
>>>>
>>>>> Best is to fix the dreaded ConcurrentModificationException.
>>>>> Occasionally I run into it as well. But most probably it is not that
>>>>> trivial to solve.
>>>>>
>>>>>
>>>> On one hand we had some concurrency issues that caused the exception. I
>>>> fixed (some of) them recently; the tests are now passing.
>>>>
>>>> On the other hand the exception is coming from Jena (as well as from
>>>> Java collections) if the dataset (respectively the collection) is
>>>> modified while iterating over it. In the case of addAll the modification
>>>> of the underlying dataset is necessary, so this is not about
>>>> timing/concurrency.
>>>>
>>>> Cheers,
>>>> Reto
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> The only workaround that I see is to copy the MGraph to a different
>>>>> provider and do the normal addAll(). This other provider does not
>>>>> necessarily need to be in memory. Basically it is 2x addAll(): one to a
>>>>> different provider and one back.
>>>>>
>>>>> My knowledge is too limited to comment on forwarding it to Jena.
>>>>>
>>>>> Hope this is of some help to you.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Minto
>>>>>
>>>>> Reto Bachmann-Gmür schreef op 30-10-2013 12:46:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> I'm having a problem using addAll to add one MGraph from
>>>>>> SingleTdbDatasetProvider to another such MGraph.
>>>>>>
>>>>>> The problem is that the iterator over the added graph will throw a
>>>>>> ConcurrentModificationException as soon as a triple has been added to
>>>>>> the target graph.
>>>>>>
>>>>>> I don't know how to solve this. Copying the graph to be added to memory
>>>>>> doesn't seem to be a compelling solution. Maybe the addAll could be
>>>>>> forwarded to Jena, but this would solve the problem only in some cases,
>>>>>> not if there is any wrapper on the added graph or if union graphs are
>>>>>> used.
>>>>>>
>>>>>> Cheers,
>>>>>> Reto
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> ir. ing. Minto van der Sluis
>>>>> Software innovator / renovator
>>>>> Xup BV
>>>>>
>>>>> Mobiel: +31 (0) 626 014541
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

