directory-dev mailing list archives

From Alex Karasulu <akaras...@apache.org>
Subject Re: [Informational] OpenLDAP Transactions
Date Fri, 04 Feb 2011 18:39:56 GMT
On Fri, Feb 4, 2011 at 10:47 AM, Howard Chu <hyc@symas.com> wrote:
> Alex Karasulu wrote:
>>
>> Hi there Howard!
>>
>> On Thu, Feb 3, 2011 at 9:56 PM, Howard Chu<hyc@symas.com>  wrote:
>>>
>>> Alex Karasulu wrote:
>>>>
>>>> FYI
>>>>
>>>> Hurray! Our respected friends at OpenLDAP are completing the
>>>> transaction spec. Nice to know of an existing implementation. Here's a
>>>> recent thread that started up on it:
>>>>
>>>>    http://www.openldap.org/lists/openldap-devel/201102/msg00005.html
>>>>
>>>> Would be interesting to see how their implementation of transactions
>>>> combines with syncRepl now in the picture. Specifically, I'm wondering
>>>> if replication will trigger on transaction boundaries, rather than on
>>>> each change in the transaction. Also wondering how change sequence
>>>> numbers will be impacted.
>>>
>>> I'm wondering too! Many open questions with this spec. Though I'll note,
>>> RFC4533 explicitly states (of syncrepl) "This protocol is not intended to
>>> be
>>> used in applications requiring transactional data consistency." (Section
>>> 1.2)
>>
>> I was hoping you guys already figured that all out :-).
>>
>>> If folks are looking for transactional consistency in replication, we
>>> should
>>> probably develop a new spec to address that.
>>
>> Seems so now, thanks for the heads up.
>
> Syncrepl only promises eventual convergence, so there's really no reasonable
> way to expect transactional consistency from it. Consider a replica
> operating in refreshOnly mode, polling once every few minutes - between
> refreshes, it's out of date anyway. When it pulls down a refresh, it will be
> receiving entries one at a time; they could represent completed multi-entry
> transactions or not, and any client querying the replica will see in-between
> state during that refresh.

Yes, it's prone to dirty reads. There might be a way to work around this
and actually obtain proper isolation. However, it still requires
transaction awareness in replication. Let me try to explain below.

At one point, I investigated writing an optimistic local transaction
manager with MVCC right above partitions (analogous to OpenLDAP
backends). This way all partitions gain the MVCC capability without
having to implement it themselves.

With MVCC you gain a versioned DIB, which lends itself to better isolation.
Incidentally, it has other advantages when combined with a long-term
change log, such as snapshotting. Once a correlation is established
between the version numbers, transaction identifiers, and change
sequence numbers (some of which may be combined into the same
number/id), you can obtain complete local transaction isolation.
All writes are applied to the transaction log until the transaction
commits.
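
To make the idea concrete, here is a minimal sketch of such an optimistic
MVCC layer. It is an illustration only, not ApacheDS code: the names
MvccStore and Txn are hypothetical, entries are reduced to a DN mapped to a
string value, deletes are not modeled, and the commit version is assumed to
double as transaction id and change sequence number.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical versioned store: each DN keeps its values keyed by commit version. */
class MvccStore {
    private final Map<String, TreeMap<Long, String>> versions = new HashMap<>();
    private long committedVersion = 0;

    /** A transaction pins the version it started at and buffers its own writes. */
    final class Txn {
        final long snapshot;
        final Map<String, String> writeLog = new HashMap<>();

        Txn(long snapshot) { this.snapshot = snapshot; }

        /** Writes go to the transaction log only; nothing is visible yet. */
        void write(String dn, String value) { writeLog.put(dn, value); }

        /** A transaction sees its own writes, otherwise its pinned snapshot. */
        String read(String dn) {
            if (writeLog.containsKey(dn)) return writeLog.get(dn);
            return readAt(dn, snapshot);
        }
    }

    synchronized Txn begin() { return new Txn(committedVersion); }

    /** Returns the newest value committed at or before the given version. */
    synchronized String readAt(String dn, long version) {
        TreeMap<Long, String> history = versions.get(dn);
        if (history == null) return null;
        Map.Entry<Long, String> e = history.floorEntry(version);
        return e == null ? null : e.getValue();
    }

    synchronized String readLatest(String dn) { return readAt(dn, committedVersion); }

    /** Optimistic commit: refuse if any written DN changed after the snapshot. */
    synchronized boolean commit(Txn txn) {
        for (String dn : txn.writeLog.keySet()) {
            TreeMap<Long, String> history = versions.get(dn);
            if (history != null && !history.isEmpty() && history.lastKey() > txn.snapshot) {
                return false; // conflict: caller rolls back by dropping the Txn
            }
        }
        long next = committedVersion + 1;
        for (Map.Entry<String, String> w : txn.writeLog.entrySet()) {
            versions.computeIfAbsent(w.getKey(), k -> new TreeMap<>()).put(next, w.getValue());
        }
        committedVersion = next; // all writes of the transaction appear at once
        return true;
    }
}
```

A reader calling readLatest() while a transaction is still open sees none of
that transaction's writes, and two transactions that write the same DN from
the same snapshot cannot both commit.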

With respect to vanilla syncrepl, this has some implications. The
server polling for changes (say A) from another server (say B) will
not see intermediate updates during the course of a transaction.
There will be no dirty reads when A reads from B. However, a client C
reading from A can still encounter dirty reads, because server A is
not aware that the changes being pulled down from B must be applied in
a transaction.

> You could conceivably try to make refreshAndPersist transactional - during
> the persist phase, you can send along the transaction controls with the
> entry updates. Since basically the slapd implementation is to queue up all
> operations of the transaction until the final Commit is received, and then
> write all at once, this will impose some noticeable latency between the
> provider and the consumer. The Consumer would then do the same, queuing up
> all the received writes until it also receives the Commit message from the
> provider. So assuming perfect networks and perfect hardware and software,
> you could propagate the transactional state down the line. The changes would

And this gives us the transaction boundary we need to write changes to
the server receiving replication updates. If A receives updates this
way and persists the changes within transactions, then client C
reading from A would not encounter dirty reads.
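
A rough sketch of the consumer side, under stated assumptions: the
onUpdate/onCommit/onAbort callbacks and the transaction id carried with each
update are hypothetical, since RFC 4533 defines no such transaction
signalling. The point is only that queued changes become visible to clients
in one step when the Commit arrives.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical consumer that holds replicated writes until the provider's Commit. */
class QueuingConsumer {
    // State that clients such as C are allowed to read.
    private final Map<String, String> visible = new HashMap<>();
    // Per-transaction queues of {dn, value} pairs not yet visible.
    private final Map<String, List<String[]>> pending = new HashMap<>();

    /** Queue an update that arrived tagged with a transaction id. */
    synchronized void onUpdate(String txnId, String dn, String value) {
        pending.computeIfAbsent(txnId, k -> new ArrayList<>()).add(new String[] {dn, value});
    }

    /** Apply every queued change of the transaction in one step. */
    synchronized void onCommit(String txnId) {
        List<String[]> queued = pending.remove(txnId);
        if (queued == null) return; // unknown or already finished transaction
        for (String[] change : queued) {
            visible.put(change[0], change[1]);
        }
    }

    /** Drop the queue, for example when the link to the provider breaks. */
    synchronized void onAbort(String txnId) { pending.remove(txnId); }

    synchronized String read(String dn) { return visible.get(dn); }
}
```

Because read() and onCommit() share one lock, a client never observes a
half-applied transaction in this sketch; a real server would need the same
guarantee from its storage layer.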

> become visible atomically, but at staggered intervals relative to the
> execution time on the original server. But if any network link is broken,
> the consumer has to catch up again by using a refresh phase, and the
> transactional consistency is lost during that time.

With fully isolated local transactions that roll back on failure,
won't we be safe in this refresh-to-catch-up situation?

> I guess for delta syncrepl, since we record changes in a log and play them
> back in a known order, we could preserve the transactional state as well.

Won't you still need to isolate changes during the course of applying
a transaction while replaying from the log until the point where the
commit completes?

I perused the syncrepl spec, but I'm not yet prepared to reason about
how to change it. I really appreciate your clarifications here.

Right now we're cleaning house, trying to get ApacheDS 2.0 out the
door. Once 2.0 GA is out, I want to fully implement an MVCC
transaction manager and, at that point, figure out how it cooperates
properly with syncrepl.

I want to make sure we speak the same language, and hopefully a
heterogeneous OpenLDAP/ApacheDS cluster can be achieved.

Thanks,
Alex
