directory-dev mailing list archives

From "Alex Karasulu" <akaras...@apache.org>
Subject Re: [ApacheDS][Mitosis] Replication data
Date Thu, 06 Dec 2007 01:59:25 GMT
Hi Martin,

On Dec 5, 2007 7:00 PM, Martin Alderson <equim@planetquake.com> wrote:

>
> Thanks for the responses, all.
>
> Apologies for the delay in getting back to you - having a family problem
> at the moment so have very little spare time.
>

No problem, hope all is well.


> I thought having the replication logs stored in LDAP sounded nice - for
> new replicas we have to send all replicatable entries but after that the
> log LDAP entries can be sent instead.  It would be pretty much the same
> code logic and it just seemed to solve all the problems with a large
> amount of code re-use.  I was worried about possible performance hits
> though and it sounds like you (Alex) don't want to store the logs in
> LDAP for the same reason.
>

I probably did not express myself well enough last time.  I'm 100% for
accessing the replication logs via LDAP, and that's why we would wrap this
store with a Partition.  I think by "stored in LDAP" you mean storing the
logs in the JDBM partition implementation, right?

We can store it in there and it will still be efficient, but not as efficient
as building a custom store for it.  We could at this point customize the
JDBM store so it performs pretty close to a custom implementation.  However,
consider the built-in overheads, like dealing with aliases and a
full-fledged search capability.  There will be a latency cost for that.  But
for starters, why not just use the JDBM partition and optimize
later?  I may also be overdoing it here, so please feel free to slap some
reality into me.

Really though, a custom partition would not be that hard to implement,
especially if we reused some existing code from the JDBM partition.


> My main reasons for suggesting storing the logs in LDAP are:
> 1. So we can have optional attributes in each log entry.  This is needed
> when we "explode" the current message blob so it can be queried
> efficiently.  With JDBM I guess we would have to specify a new table for
> each type of message.


Oh I see, you want to query the log looking for specific attributes by name?


> 2. To reduce the code complexity.  We would have virtually the same code
> for sending whole entries as sending the logs and we would have less
> code for dealing with the data storage in general.
> 3. To reduce the current tight coupling with the backend database.  By
> using LDAP as the abstraction layer we could leverage ApacheDS' existing
> mechanism for specifying the data store.
> 4. To allow an easy way to view the logs.
> 5. It seems to be the most natural fit.  Since we need to store (part
> of) an LDAP entry in the logs, why not store it in LDAP?
>
> I'll take another stab at explaining that: we already have code to store
> LDAP entries in a database, so why would we want to write that again?
>

Sorry I feel I'm misunderstanding you :(.

Are you suggesting using the directoryService handle you get in the
ReplicationInterceptor.init() method to perform log store operations against
replication log entries in the DIT?

I guess we can do that, sure.  I'm guessing you want to use the
directoryService and get a JNDI context to use in your ReplicationStore
implementation.  So you're going to define a replication log schema and
write a JNDI-based ReplicationStore implementation?
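
A minimal sketch of what the JNDI-based approach could look like, using only
the standard javax.naming.directory classes.  The object class and attribute
names (replicationLogEntry, replLogId, replOperation, replEntryUuid,
replNewRdn) and the JndiLogEntries class are hypothetical; no such schema is
defined in this thread.  It also shows the optional-attribute point: a value
that does not apply to an operation is simply left out of the entry.

```java
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.BasicAttribute;
import javax.naming.directory.BasicAttributes;
import javax.naming.directory.DirContext;

// Hypothetical helper; not part of ApacheDS or Mitosis.
final class JndiLogEntries {

    // Builds the attributes of one log entry. newRdn is optional and is
    // only stored when the logged operation actually carries one.
    static Attributes toAttributes(String logId, String operation,
                                   String entryUuid, String newRdn) {
        Attributes attrs = new BasicAttributes(true); // ignore attribute-name case
        Attribute objectClass = new BasicAttribute("objectClass");
        objectClass.add("top");
        objectClass.add("replicationLogEntry"); // assumed object class
        attrs.put(objectClass);
        attrs.put("replLogId", logId);
        attrs.put("replOperation", operation);
        attrs.put("replEntryUuid", entryUuid);
        if (newRdn != null) {
            attrs.put("replNewRdn", newRdn);
        }
        return attrs;
    }

    // Writes the entry below the given log base context. Assumes the log id
    // contains no characters that need escaping in an RDN.
    static void store(DirContext logBase, Attributes attrs) throws NamingException {
        String logId = (String) attrs.get("replLogId").get();
        logBase.createSubcontext("replLogId=" + logId, attrs).close();
    }
}
```

Here store() would be called with a context obtained from the directory
service; how that context is looked up is deliberately left out.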

Oh BTW, the reason I wanted to write a custom store is that the rep
log store is simple and only requires primitive searches.  I also did not
want to add code that re-enters the interceptor chain for writes, although
we do have bypass operations to ignore replication.  I was thinking the store
can then be exposed as a read-only partition if we write a simple partition
wrapper for it.
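
To make the custom-store idea concrete, here is a rough in-memory sketch
of a log store with a single primitive search ("everything after sequence
N") and a read-only view that a thin partition wrapper could sit on.  All
names (LogRecord, ReadOnlyLog, SimpleLogStore) are hypothetical, and a real
store would persist to disk rather than keep records in memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

// One logged change. Field names are illustrative only.
final class LogRecord {
    final long sequence;     // assumed to increase with every logged change
    final String replicaId;  // replica that originated the change
    final String entryUuid;  // UUID of the affected entry
    final String operation;  // e.g. "add", "modify", "delete"

    LogRecord(long sequence, String replicaId, String entryUuid, String operation) {
        this.sequence = sequence;
        this.replicaId = replicaId;
        this.entryUuid = entryUuid;
        this.operation = operation;
    }
}

// The read-only side: this is all a partition wrapper would need to see.
interface ReadOnlyLog {
    List<LogRecord> since(long sequence);
}

final class SimpleLogStore implements ReadOnlyLog {
    // Sorted by sequence number so range reads are cheap.
    private final ConcurrentSkipListMap<Long, LogRecord> records =
            new ConcurrentSkipListMap<>();

    // Writes go straight to the store, not through the interceptor chain.
    void append(LogRecord record) {
        records.put(record.sequence, record);
    }

    // Returns an unmodifiable copy of all records strictly after the given
    // sequence number, in ascending order.
    @Override
    public List<LogRecord> since(long sequence) {
        return Collections.unmodifiableList(
                new ArrayList<>(records.tailMap(sequence, false).values()));
    }
}
```

Handing out only the ReadOnlyLog interface keeps writes on the replication
side while still letting the logs be viewed.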


>
>  > Oh this reminds me that we also need to make sure we're generating
>  > UUIDs all the time even if replication is not enabled.
>
> Yeah, we have a JIRA about this:
> https://issues.apache.org/jira/browse/DIRSERVER-776
>
>
Yeah, from a while back.  It's still not resolved, and the tombstoning is
creating complications.

...


>  > For example you have a delete of a node occur right when you add a
>  > child to it.  The server would probably put the child into some
>  > lost and found area and alert the administrator.  With tombstoning
>  > you can easily resuscitate the deleted parent and move the child
>  > back under it.
>
> Resuscitating a deleted entry seems like something most people wouldn't
> want.


Maybe, I don't know.


> If we are attempting to simulate a single server as much as
> possible (which is my main aim)


Yeah, that should be our main goal, as is cited several times in the
literature.


> then the new child entry should be
> deleted when the peers synchronise.  As you said, we could have an
> optional lost and found area for cases where conflict resolution causes
> data loss like this, along with optional notifications to an
> administrator.
>

Exactly, that's the only reason for this lost and found area: to recover data
which is lost due to automatic conflict resolution.
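
The delete-versus-add conflict discussed above can be sketched with a toy
parent map.  This is only an illustration under simplifying assumptions:
the class name, the "ou=lostAndFound" location and the notification strings
are invented, and only direct children of the deleted entry are handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical resolver; not the Mitosis conflict resolution code.
final class LostAndFoundResolver {
    static final String LOST_AND_FOUND = "ou=lostAndFound"; // assumed location

    private final boolean keepOrphans; // the "optional" part of lost and found
    private final Map<String, String> parentOf = new HashMap<>(); // entry -> parent
    private final List<String> notifications = new ArrayList<>();

    LostAndFoundResolver(boolean keepOrphans) {
        this.keepOrphans = keepOrphans;
    }

    void add(String name, String parent) {
        parentOf.put(name, parent);
    }

    // Applies a delete received from a peer. Children that were added
    // concurrently are either moved to lost and found (with a note for
    // the administrator) or deleted along with the parent.
    void applyReplicatedDelete(String name) {
        parentOf.remove(name);
        Iterator<Map.Entry<String, String>> it = parentOf.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            if (!name.equals(entry.getValue())) {
                continue;
            }
            if (keepOrphans) {
                entry.setValue(LOST_AND_FOUND);
                notifications.add("orphan moved to lost and found: " + entry.getKey());
            } else {
                it.remove();
            }
        }
    }

    String parent(String name) {
        return parentOf.get(name);
    }

    List<String> notifications() {
        return notifications;
    }
}
```

With keepOrphans set to false, the result matches what a single server would
have ended up with; setting it to true trades that for recoverability.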


>
>  >> Also our MMR support is still immature, we don't yet do value-level
>  >> conflict resolution.
>  > Yeah, we have yet to consider that.
>
> We will have this once I have fixed
> https://issues.apache.org/jira/browse/DIRSERVER-894.
>
>
Excellent.

Alex
