jackrabbit-dev mailing list archives

From "Stefan Guggisberg" <stefan.guggisb...@gmail.com>
Subject Re: Google Summer of Code project for Jackrabbit
Date Fri, 26 May 2006 08:56:59 GMT
hi nico

On 5/25/06, Nicolas Toper <ntoper@gmail.com> wrote:
> Just to summarize everything we have said on this issue.
>
> There are two kinds of lock: the jcr.Lock and the
> EDU.oswego.cs.dl.util.concurrent.* locks. The two are largely unrelated.
> Am I correct?
>
> There are no issues with jcr.Lock (we can still read a node).
>
> We need some mutex to avoid inconsistent IO operations using the
> util.concurrent package. I like Tobias's approach of adding a proxyPM. It
> seems easy. But is this solution elegant enough and maintainable in the long
> run? Would it help us later? (I think so, since it would allow delayed writes,
> which open the way for a 2-phase locking algorithm.) I have not been on this
> project long enough to judge :p
>
> Why didn't Jackrabbit go for serializable transactions, by the way? I have
> checked the code and it seems we have all the kinds of locks needed to
> support 2PL (out of scope of the current project, of course).
>
> If we plan to support serializable transactions soon, then case 2 is
> acceptable. Is this the case?
>
> About Tobias's ProxyPM: I am OK to write it, although it is out of scope of
> the initial project. You all seem to really need it, so let's go for it. Jukka?
>
> For a specific workspace, I would still allow read operations from other
> sessions and isolate all write access (this way there will be no conflict).
> I can even persist the modifications using an already existing PM in case
> of a crash. One question though: I cannot guarantee the transaction would
> later be committed without exception. We can choose to ignore this issue or
> add an asynchronous way to warn the session. What are your thoughts on this?
>

we already have this scenario. a session's modifications are potentially
committed asynchronously and the commit can fail for a number of reasons.
that's fine with me.

cheers
stefan

> This means a modification in the core package. Are you all OK with this?
>
>
> By the way, this kind of algorithm is called a pessimistic receiver-based
> message logging algorithm. We use it in distributed systems.
>
>
>
> Thanks for your support and ideas.
> nico
> My blog! http://www.deviant-abstraction.net !!
>
>
>
>
> On 5/25/06, Tobias Bocanegra <tobias.bocanegra@day.com> wrote:
> >
> > i think there is a consensus of what backup levels there can be:
> >
> > 1) read locked workspaces
> > 2) write locked workspaces
> > 3) hot-backup (i.e. "SERIALIZABLE" isolation)
> >
> > in case 1, the entire workspace is completely locked (rw) and no one
> > other than the backup-session can read-access the workspace. this is
> > probably the easiest to implement and the least desirable.
> >
> > in case 2, the entire workspace becomes read-only, i.e. is
> > write-locked. so all other sessions can continue reading the
> > workspace, but are not allowed to write to it. this is also more or
> > less easy to implement, introducing a 'global' lock in the lock
> > manager.
> >
> > in case 3, all sessions can continue operations on the workspace, but
> > the backup-session sees a snapshot view of the workspace. this would
> > be easy, if we had serializable isolated transactions, which we don't
> > :-(
> >
> > for larger production environments, only case 3 is acceptable. the way
> > i see to implement this is to create a
> > 'proxy-persistencemanager' that sits between the
> > shareditemstatemanager and the real persistencemanager. during normal
> > operation, it just passes the changes down to the real pm, but in
> > backup-mode, it keeps its own storage for the changes that occur
> > during the backup. when the backup is finished, it resends all changes
> > down to the real pm. using this mechanism, you have a stable snapshot
> > of the states in the real pm during backup mode. the export would then
> > access the real pm directly.
> >
> > regards, toby
> >
> >
> > On 5/25/06, Nicolas Toper <ntoper@gmail.com> wrote:
> > > Hi David,
> > >
> > > Sorry to have been unclear.
> > >
> > > What I meant is we have two different kinds of backup to perform.
> > >
> > > In one use case I call "regular backup", it is the kind of backup you
> > > perform every night. You do not mind missing content that was just
> > > updated, since you will have it the day after.
> > >
> > > In the other use case I call "exceptional backup", you want to have all
> > > the data because, for instance, you will destroy the repository afterwards.
> > >
> > > Those two differ, I think, in small points. For instance, for "regular
> > > backup", we don't care about transactions started but not committed. In
> > > the second one, we do.
> > >
> > > I propose to support only the first use case. The second one would be
> > > added easily later.
> > >
> > > I don't know how Jackrabbit is used in production environments. Is it
> > > feasible to lock one workspace at a time, or is it too cumbersome for
> > > the customer?
> > >
> > > For instance, if backing up a workspace needs a two-minute workspace
> > > lock, then it can be done without affecting availability (but it
> > > would affect reliability). We need data to estimate if it is needed. Can
> > > you give me the size of a typical workspace please?
> > >
> > > I am OK to record the transaction and commit it after the locking has
> > > occurred, but this means changing the semantics of Jackrabbit (a
> > > transaction initiated when a lock is on would be performed after the lock
> > > is released instead of raising an exception) and I am not sure everybody
> > > would think it is a good idea. We would need to add a transaction log (is
> > > there one already?) and parse transactions to detect conflicts (or capture
> > > exceptions maybe). We would no longer be able to guarantee a transaction
> > > is persistent and it might have an impact on performance. And what about
> > > timeouts when running a transaction?
> > >
> > > Another idea would be: monitor Jackrabbit and launch the backup when we
> > > have a high probability that no transactions are going to be started. But
> > > I think sysadmins already know when load is minimal on their systems.
> > >
> > > Another idea would be, as Miro stated, to use a lower-level strategy
> > > (working at the DB level or directly on the FS). It was actually my
> > > first backup strategy, but Jukka thought we have to be able to use the
> > > tool to migrate from one PM to another.
> > >
> > > Here is my suggestion on the locking strategy: we can extend the backup
> > > tool later if needed. Right now even with a global lock, it is an
> > > improvement compared to the current situation. And I need to release the
> > > project before August 21.
> > >
> > > I would prefer to start with locking one workspace at a time and, if I
> > > still have time, then find a way to work with minimal locking. I will
> > > most probably keep working on Jackrabbit after the Google SoC is over.
> > > Are you OK with this approach?
> > >
> > > We are OK on the restore operation. Good idea for the replace or ignore
> > > option but I would recommend to build it only for existing nodes :p
> > > Properties might be more difficult to handle and not as useful (and it
> > > raises a lot more questions).
> > >
> > > nico
> > > My blog! http://www.deviant-abstraction.net !!
> > >
> > >
> >
> >
>
>

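For reference, the proxy persistence manager toby describes in the quoted text could be sketched roughly as below. This is only an illustration under stated assumptions, not Jackrabbit code: `ItemStore`, `InMemoryStore` and `BackupProxyStore` are hypothetical names, item states are reduced to plain strings, and the real PersistenceManager interface is not used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a persistence manager (NOT Jackrabbit's interface).
interface ItemStore {
    void store(String id, String state);
    String load(String id);
}

// Trivial in-memory "real pm", present only so the sketch can run.
class InMemoryStore implements ItemStore {
    private final Map<String, String> items = new LinkedHashMap<>();
    public synchronized void store(String id, String state) { items.put(id, state); }
    public synchronized String load(String id) { return items.get(id); }
}

// Hypothetical proxy that sits in front of the real store.
class BackupProxyStore implements ItemStore {
    private final ItemStore real;
    // Changes buffered while a backup runs; only the latest state per id is kept.
    private final Map<String, String> pending = new LinkedHashMap<>();
    private boolean backupMode = false;

    BackupProxyStore(ItemStore real) { this.real = real; }

    public synchronized void store(String id, String state) {
        if (backupMode) {
            pending.put(id, state);   // hold the change back during backup
        } else {
            real.store(id, state);    // normal operation: pass straight through
        }
    }

    public synchronized String load(String id) {
        // Regular sessions should still see their own buffered writes.
        if (backupMode && pending.containsKey(id)) {
            return pending.get(id);
        }
        return real.load(id);
    }

    public synchronized void beginBackup() { backupMode = true; }

    public synchronized void endBackup() {
        // Resend everything that arrived during the backup to the real store.
        for (Map.Entry<String, String> e : pending.entrySet()) {
            real.store(e.getKey(), e.getValue());
        }
        pending.clear();
        backupMode = false;
    }
}
```

The export would read from the real store directly between `beginBackup()` and `endBackup()`, so it sees a stable snapshot. The sketch leaves out deletes, crash recovery of the buffered changes, and the case nico raises where replaying a buffered change fails.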