jena-users mailing list archives

From Andy Seaborne <andy.seabo...@epimorphics.com>
Subject Re: Concurrent updates in TDB
Date Thu, 03 Feb 2011 22:46:53 GMT


On 03/02/11 22:24, Paolo Castagna wrote:
> Frank Budinsky wrote:
>> Hi,
>>
>> In a previous exchange Damian told me:
>>
>>> You can't write
>>> to the same TDB store from different processes.
>>
>> I'm wondering if there are any safe exceptions? For example, is it safe if
>> one process always adds/removes/updates statements in named graphs, while
>> the other process works exclusively in the default graph (i.e., the graphs
>> being used by the two processes are completely independent)? Is this safe,
>> or am I treading on thin ice by using the same TDB store for both
>> processes?
>>
>> Thanks,
>> Frank
>
> For me (and us @ Talis) one concern is that the MRSW (i.e. Multiple Readers,
> Single Writer) locking that is necessary is really MR xor SW (i.e. exclusive
> or). So, a very long write operation can actually stop others reading until
> the write has finished.
>
> So, if you allow big/long updates, you need to carefully consider
> alternatives to avoid this problem.
>
> A slower(?) (than native TDB) alternative could be TDB-BDB:
> https://github.com/afs/TDB-BDB
>
> Having a system with multiple replicas helps as well.
>
> Thinking about other alternative approaches [1] is too scary for me at this
> time, but it would be good to list and describe them (just in case there are
> people keen to help on this) or share good papers to read which describe an
> approach which is compatible with the TDB design.

Journalled file access.  Small matter of programming (it fits the 
current design).  Phase trees are also possible (but don't have the same 
recoverability).
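
To make the journalling idea concrete, here is a rough sketch in plain Java. It is not TDB code and not a proposed design: the class name JournalledStore, the "#COMMIT" marker and the one-line-per-record file layout are all made up for illustration. The intended change is first written durably to a journal, then applied to the data file; on restart, a committed journal is replayed and an incomplete one is dropped.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical append-only line store with a redo journal; not TDB's file format.
class JournalledStore {
    private static final String COMMIT = "#COMMIT"; // assumed never to occur as a data line
    private final Path data;
    private final Path journal;

    JournalledStore(Path data, Path journal) throws IOException {
        this.data = data;
        this.journal = journal;
        if (!Files.exists(data)) {
            Files.createFile(data);
        }
        replay(); // recovery: finish or discard whatever a previous run left behind
    }

    // Record the intended change durably first, then apply it to the data file.
    void append(List<String> lines) throws IOException {
        List<String> entry = new ArrayList<>();
        entry.add(Long.toString(Files.size(data))); // data file length before this write
        entry.addAll(lines);
        entry.add(COMMIT);
        Files.write(journal, entry, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE, StandardOpenOption.SYNC);
        replay();
    }

    // Apply a committed journal entry (idempotently) or drop an incomplete one.
    private void replay() throws IOException {
        if (!Files.exists(journal)) {
            return;
        }
        List<String> entry = Files.readAllLines(journal, StandardCharsets.UTF_8);
        boolean committed = entry.size() >= 2 && COMMIT.equals(entry.get(entry.size() - 1));
        if (committed) {
            long base = Long.parseLong(entry.get(0));
            try (FileChannel ch = FileChannel.open(data, StandardOpenOption.WRITE)) {
                ch.truncate(base); // undo any half-applied earlier attempt
                ch.position(base);
                for (String line : entry.subList(1, entry.size() - 1)) {
                    ByteBuffer buf = ByteBuffer.wrap((line + "\n").getBytes(StandardCharsets.UTF_8));
                    while (buf.hasRemaining()) {
                        ch.write(buf);
                    }
                }
                ch.force(true); // flush the data file before the journal is removed
            }
        }
        Files.delete(journal);
    }

    List<String> readAll() throws IOException {
        return Files.readAllLines(data, StandardCharsets.UTF_8);
    }
}
```

Because the journal records where the data file ended before the write, replaying it twice gives the same result as replaying it once, which is what makes recovery after a crash safe in this sketch.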

Or break the writes up into smaller blocks.
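
A minimal sketch of what "smaller blocks" could look like under an MR-xor-SW lock, using only java.util.concurrent rather than any TDB or Jena API. BatchedStore, addAll and the batch size are hypothetical names chosen for the example. The write lock is released between batches, so readers wait for at most one batch instead of the whole update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a store guarded by a single MR-xor-SW lock; not TDB's API.
class BatchedStore {
    // Fair mode, so readers that queue up between batches are not starved by the writer.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private final List<String> statements = new ArrayList<>();
    private int writeLockAcquisitions = 0;

    // Readers share the lock with each other but are excluded while a writer holds it.
    int size() {
        lock.readLock().lock();
        try {
            return statements.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Add everything in batches, releasing the write lock between batches
    // so that readers are blocked for at most one batch at a time.
    void addAll(List<String> incoming, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int start = 0; start < incoming.size(); start += batchSize) {
            int end = Math.min(start + batchSize, incoming.size());
            lock.writeLock().lock();
            try {
                statements.addAll(incoming.subList(start, end));
                writeLockAcquisitions++;
            } finally {
                lock.writeLock().unlock();
            }
        }
    }

    // How many separate times the write lock was taken (for illustration only).
    int writeLockAcquisitions() {
        lock.readLock().lock();
        try {
            return writeLockAcquisitions;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

The trade-off is that the update is no longer atomic: a reader that gets in between two batches sees a partially applied change, so this only suits updates where that is acceptable.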

If it's a long write, then if it's the app that's slow, journalling wins.
If it's the fact that a lot of data is being written, well, there is rather
less one can do without partial locking (which is expensive for
everything else); it's rather hard in RDF to know what's "unrelated".

	Andy

>
> Paolo
>
> [1] http://en.wikipedia.org/wiki/Multiversion_concurrency_control
