directory-dev mailing list archives

From Emmanuel Lecharny <elecha...@apache.org>
Subject Re: [DRS] thoughts about implementation
Date Mon, 02 Feb 2009 10:09:52 GMT
Hi,

On Sun, Feb 1, 2009 at 7:49 PM, Kiran Ayyagari <ayyagarikiran@gmail.com> wrote:
> Hello guys,
>
>    Here is an initial idea about implementation which I have in my mind
>
>    HOWL has a feature called 'marking' in its log file (a.k.a journal). The
> idea is to use this as a checkpoint since
>    the last successful disk write of the DIT data i.e whenever we perform a
> sync we put a mark in the journal.
>    in case of a crash we can retrieve the data from journal from the marked
> position(using howl API),

This is the only way we can restore uncorrupted data: everything
before a checkpoint is correct, everything after a checkpoint is
assumed to be potentially damaged.
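
To make the checkpoint idea concrete, here is a minimal in-memory
sketch. MarkedJournal and its methods are hypothetical names made up
for illustration; this is not the HOWL API, only the behaviour being
discussed: records up to the mark are trusted, records after it are
the ones to re-examine on recovery.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory journal used for illustration; not the HOWL API. */
class MarkedJournal {
    private final List<String> records = new ArrayList<>();
    // Index of the first record appended after the latest mark.
    private int markPosition = 0;

    /** Appends a record and returns its position in the journal. */
    int append(String record) {
        records.add(record);
        return records.size() - 1;
    }

    /** Declares that everything appended so far has safely reached disk. */
    void mark() {
        markPosition = records.size();
    }

    /** Returns a copy of the records appended after the latest mark, in order. */
    List<String> recordsSinceMark() {
        return new ArrayList<>(records.subList(markPosition, records.size()));
    }
}
```

After a crash, only the result of recordsSinceMark() would need to be
replayed or checked against the backend.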

>
>    Currently the syncing of DIT data is a per partition based operation
> unless a call is made to
>    DirectoryService's sync() method which internally calls the sync on
> PartitionNexus.
>
>    IMO this marking of journal should happen in the DirectoryService's
> sync() operation.

As the sync can occur at a fairly long interval (let's say 15 secs
by default), if we depend on it to add a checkpoint, that means we may
lose 15 secs of modifications. It may be seen as acceptable, but
IMHO, the idea is to store logs as fast as possible, and when a
modification is considered done, then add a checkpoint. This will
limit the potential loss of information.

>
>    A change to the partition's sync() method to call DirectoryService's
> sync() which in turn calls (each) partition's
>    commit() (a new method) would help. A special flag like 'isDirty' in the
> partition will allow us to avoid calling
>    commit() on each partition.
>
>    Any other ideas about how best we can maintain a checkpoint/mark after
> *any* sync operation on DIT data?.

When the mod is considered done (even if not yet written into the
backend), we should consider the operation valid, then write it to
the log, and when that is done, add a checkpoint. That should be done
for every modification, whatever it costs.
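
A rough sketch of that ordering, assuming a plain file as the log.
ModificationLog, its length-prefixed record layout and the in-memory
checkpoint offset are all assumptions made for illustration; a real
implementation would persist the checkpoint as well (for instance as
a mark in the journal). Only the FileChannel calls are standard Java.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical per-modification log; the record layout is an assumption. */
class ModificationLog implements AutoCloseable {
    private final FileChannel channel;
    // Offset just past the last record known to be complete on disk.
    private long checkpoint = 0;

    ModificationLog(Path file) {
        try {
            channel = FileChannel.open(file, StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes one modification, forces it to disk, then advances the checkpoint. */
    long log(String modification) {
        byte[] payload = modification.getBytes(StandardCharsets.UTF_8);
        // Record layout: 4-byte length followed by the UTF-8 payload.
        ByteBuffer record = ByteBuffer.allocate(4 + payload.length);
        record.putInt(payload.length).put(payload).flip();
        try {
            long position = checkpoint;
            while (record.hasRemaining()) {
                position += channel.write(record, position);
            }
            channel.force(false);  // the record is durable before the checkpoint moves
            checkpoint = position;
            return checkpoint;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the offset up to which the log is known to be complete. */
    long checkpoint() {
        return checkpoint;
    }

    @Override
    public void close() {
        try {
            channel.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost is one force() per modification, which is the price being
accepted here in exchange for losing at most the operation in flight.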

>
>    Having said this, I have another issue, how do I detect the beginning of
> a corrupted
>    entry in a JDBM file(all the DIT data is stored in these files)
>
>    To put this in other way, if a JDBM file was synced at nth entry and
> server was crashed in the middle of
>    writing n+1th entry I would like to start recovery from the end of nth
> record (a common idea I believe though)
>    (haven't looked at jdbm code yet, but throwing this question anticipating
> a quick answer ;) )

Not an easy question. If you think about BDB, they use a journal
for what they call a 'catastrophic' recovery. Basically
what we need. If we store information in a JDBM database, then
writing into it might corrupt the database. What saves us is that we
can stop writing into this database for a while, save a copy of it,
and restart the operations.

IMO, JDBM should be used to help access the modification operations,
but the master table should not contain the real data, just a
pointer to another file in which modifications are written
sequentially. We just keep an offset in the master table. If
something goes really wrong, we can rebuild the master table and all
the indexes from this sequential file.
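
A small sketch of that layout, using an in-memory buffer as a stand-in
for the sequential file and a HashMap as a stand-in for the JDBM
master table. OffsetIndexedStore, the [id][length][payload] record
layout and all method names are assumptions for illustration, not
existing ApacheDS or JDBM code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical store: a sequential log plus a master table of offsets into it. */
class OffsetIndexedStore {
    // Stand-in for the sequential file; records are [id:8][length:4][payload].
    private final ByteArrayOutputStream sequentialLog = new ByteArrayOutputStream();
    // Stand-in for the master table: entry id -> offset of its latest record.
    private Map<Long, Integer> masterTable = new HashMap<>();

    /** Appends the entry to the log and stores only its offset in the master table. */
    void write(long id, String entry) {
        byte[] payload = entry.getBytes(StandardCharsets.UTF_8);
        int offset = sequentialLog.size();
        ByteBuffer record = ByteBuffer.allocate(12 + payload.length);
        record.putLong(id).putInt(payload.length).put(payload);
        sequentialLog.write(record.array(), 0, record.capacity());
        masterTable.put(id, offset);
    }

    /** Follows the offset from the master table into the log; null if unknown. */
    String read(long id) {
        Integer offset = masterTable.get(id);
        if (offset == null) {
            return null;
        }
        ByteBuffer log = ByteBuffer.wrap(sequentialLog.toByteArray());
        log.position(offset + 8);  // skip the id
        byte[] payload = new byte[log.getInt()];
        log.get(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    /** Discards the master table and rebuilds it by scanning the sequential log. */
    void rebuildMasterTable() {
        Map<Long, Integer> rebuilt = new HashMap<>();
        ByteBuffer log = ByteBuffer.wrap(sequentialLog.toByteArray());
        while (log.remaining() >= 12) {
            int offset = log.position();
            long id = log.getLong();
            int length = log.getInt();
            if (length < 0 || length > log.remaining()) {
                break;  // truncated last record: stop at the last complete one
            }
            log.position(log.position() + length);
            rebuilt.put(id, offset);  // later records for the same id win
        }
        masterTable = rebuilt;
    }

    /** Simulates losing the master table, for demonstration only. */
    void dropMasterTable() {
        masterTable = new HashMap<>();
    }
}
```

Because the log is append-only, a crash in the middle of a write can
only damage the last record, and the scan stops there.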

This can be discussed, I'm just speaking my mind here, not imposing
any solution.

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
