jena-users mailing list archives

From Paolo Castagna <castagna.li...@googlemail.com>
Subject Re: Concurrent updates in TDB
Date Sun, 06 Feb 2011 16:00:52 GMT
Andy Seaborne wrote:
> 
> 
> On 03/02/11 22:24, Paolo Castagna wrote:
>> Frank Budinsky wrote:
>>> Hi,
>>>
>>> In a previous exchange Damian told me:
>>>
>>>> You can't write
>>>> to the same TDB store from different processes.
>>>
>>> I'm wondering if there are any safe exceptions? For example, is it safe if
>>> one process always adds/removes/updates statements in named graphs, while
>>> the other process works exclusively in the default graph (i.e., the graphs
>>> being used by the two processes are completely independent)? Is this safe,
>>> or am I treading on thin ice by using the same TDB store for both
>>> processes?
>>>
>>> Thanks,
>>> Frank
>>
>> For me (and us @ Talis) one concern is that the MRSW (i.e. Multiple Readers
>> Single Writer) locking required is really MR xor SW (i.e. exclusive or). So
>> a very long write operation can actually stop others reading until the write
>> has finished.
>>
>> So, if you allow big/long updates, you need to carefully consider
>> alternatives to avoid this problem.
>>
>> A slower(?) (than native TDB) alternative could be TDB-BDB:
>> https://github.com/afs/TDB-BDB
>>
>> Having a system with multiple replicas helps as well.
>>
>> Thinking about other alternative approaches [1] is too scary for me at this
>> time, but it would be good to list and describe them (just in case there
>> are people keen to help on this) or share good papers to read which describe
>> an approach which is compatible with the TDB design.
> 
> Journalled file access.  Small matter of programming (it fits the 
> current design).  Phase trees are also possible (but don't have the same 
> recoverability).

Hi Andy,
I've never done anything like this before. Could you share a few more
details with me/us?

If it is a small matter of programming, I (or others) could help in 
getting this done and it would probably benefit many TDB users.

I've just opened a new feature request; let's discuss/comment there:
https://issues.apache.org/jira/browse/JENA-41

Paolo
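
To make the "journalled file access" idea a bit more concrete, here is a
minimal sketch of a generic write-ahead journal in plain Java. It is not TDB
code and not necessarily what Andy has in mind: the class name JournalSketch,
the "#COMMIT" marker and the in-memory list standing in for the main store
are all assumptions made for illustration. Changes are appended to a journal
file while the application produces them, so readers keep seeing the old
state; only commit() touches the store, and a journal without a commit
marker is discarded on restart.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical write-ahead journal; the "store" is just an in-memory list. */
final class JournalSketch {
    // Assumed marker line; statements must not equal it or contain newlines.
    private static final String COMMIT = "#COMMIT";

    private final Path journal;
    private final List<String> base = new ArrayList<>();

    JournalSketch(Path journal) throws IOException {
        this.journal = journal;
        recover();
    }

    /** Record a pending change in the journal only; readers are not affected. */
    void add(String statement) throws IOException {
        append(statement);
    }

    /** Mark the journal as complete, then apply it to the store in one short step. */
    void commit() throws IOException {
        append(COMMIT);
        applyJournal();
    }

    /** Readers see only committed data. */
    synchronized List<String> read() {
        return new ArrayList<>(base);
    }

    private void append(String line) throws IOException {
        // SYNC asks for each append to reach the storage device before returning.
        Files.write(journal, List.of(line),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND, StandardOpenOption.SYNC);
    }

    private synchronized void applyJournal() throws IOException {
        List<String> lines = Files.readAllLines(journal);
        base.addAll(lines.subList(0, lines.size() - 1)); // everything before the marker
        Files.delete(journal);
    }

    /** On start-up: replay a committed journal, discard an incomplete one. */
    private void recover() throws IOException {
        if (!Files.exists(journal)) {
            return;
        }
        List<String> lines = Files.readAllLines(journal);
        if (!lines.isEmpty() && lines.get(lines.size() - 1).equals(COMMIT)) {
            applyJournal();
        } else {
            Files.delete(journal);
        }
    }
}
```

The point of the sketch is only that a slow application can spend as long as
it likes in add() without holding up readers; the exclusive part shrinks to
the apply step inside commit().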

> 
> Or break the writes up into smaller blocks.
> 
> If it's a long write, then if it's the app that's slow, journalling wins.
> If it's the fact that a lot of data is being written, well, there is rather
> less one can do without partial locking (which is expensive for
> everything else); it's rather hard in RDF to know what's "unrelated".
> 
>     Andy
> 
>>
>> Paolo
>>
>> [1] http://en.wikipedia.org/wiki/Multiversion_concurrency_control
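
The "MR xor SW" behaviour and the "break the writes up into smaller blocks"
suggestion quoted above can be illustrated with a small sketch that uses
java.util.concurrent only. It is not TDB's locking code; ChunkedStore and its
method names are hypothetical, and the list of strings is a stand-in for the
store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical in-memory store guarded by a multiple-reader/single-writer lock. */
final class ChunkedStore {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> statements = new ArrayList<>();

    /** Readers share the read lock, but must wait while a writer holds the write lock. */
    int size() {
        lock.readLock().lock();
        try {
            return statements.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** One long write: readers are locked out until the whole batch is in. */
    void addAll(List<String> batch) {
        lock.writeLock().lock();
        try {
            statements.addAll(batch);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /**
     * The same data written in smaller blocks: the write lock is released
     * between blocks so waiting readers get a chance to run.
     * Returns the number of blocks written.
     */
    int addInChunks(List<String> batch, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        for (int from = 0; from < batch.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, batch.size());
            lock.writeLock().lock();
            try {
                statements.addAll(batch.subList(from, to));
            } finally {
                lock.writeLock().unlock();
            }
            chunks++;
        }
        return chunks;
    }
}
```

The trade-off is that the update is no longer atomic as a whole: a reader
that gets in between two blocks sees a partially applied update.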
