Hi Kiran,

On Sun, Feb 1, 2009 at 1:49 PM, Kiran Ayyagari <ayyagarikiran@gmail.com> wrote:
Hello guys,

   Here is an initial idea about the implementation that I have in mind.

   HOWL has a feature called 'marking' in its log file (a.k.a. journal). The idea is to use this mark as a checkpoint
   recording the last successful disk write of the DIT data, i.e. whenever we perform a sync we put a mark in the journal.
   In case of a crash we can retrieve the data from the journal starting at the marked position (using the HOWL API).
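
   Something like this, just as a sketch (the HOWL Logger methods put() and mark() are from memory, so the
   exact signatures may differ, and the wrapper class here is made up):

    import org.objectweb.howl.log.Logger;

    // Sketch only: a thin wrapper that marks the HOWL journal at sync time.
    public class JournalCheckpoint
    {
        private final Logger journal;   // the HOWL journal
        private long lastKey;           // key of the most recently written record

        public JournalCheckpoint( Logger journal )
        {
            this.journal = journal;
        }

        // Append a serialized change record; HOWL hands back a log key for it.
        public void log( byte[] change ) throws Exception
        {
            lastKey = journal.put( change, false );
        }

        // Called right after a successful sync of the DIT data: everything up
        // to lastKey is on disk now, so the mark can move forward. After a
        // crash only the records written after the mark need to be replayed.
        public void checkpoint() throws Exception
        {
            journal.mark( lastKey, true );   // force the mark itself to disk
        }
    }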

   Currently, syncing of the DIT data is a per-partition operation, unless a call is made to the
   DirectoryService's sync() method, which internally calls sync() on the PartitionNexus.

   IMO this marking of the journal should happen in the DirectoryService's sync() operation.

   Changing the partition's sync() method to call the DirectoryService's sync(), which in turn calls each
   partition's commit() (a new method), would help. A special flag like 'isDirty' in the partition would let us
   avoid calling commit() on partitions that have no pending changes, as sketched below.
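
   Here Partition is just a stand-in interface carrying the proposed isDirty()/commit(), not the real ApacheDS
   one, and the journal handling follows the same sketch as above:

    import java.util.List;
    import org.objectweb.howl.log.Logger;

    // Stand-in for the partition interface with the two proposed additions.
    interface Partition
    {
        boolean isDirty();                // true if there are unsynced changes
        void commit() throws Exception;   // the proposed new flush method
    }

    public class SyncSketch
    {
        private final List<Partition> partitions;   // from the PartitionNexus
        private final Logger journal;                // the HOWL journal
        private long lastKey;   // updated by the journaling code on each write

        public SyncSketch( List<Partition> partitions, Logger journal )
        {
            this.partitions = partitions;
            this.journal = journal;
        }

        // DirectoryService.sync(): flush only the dirty partitions, then mark
        // the journal once everything is safely on disk.
        public synchronized void sync() throws Exception
        {
            for ( Partition partition : partitions )
            {
                if ( partition.isDirty() )
                {
                    partition.commit();
                }
            }

            journal.mark( lastKey, true );
        }
    }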

   Any other ideas about how best we can maintain a checkpoint/mark after *any* sync operation on the DIT data?

   Having said this, I have another issue: how do I detect the beginning of a corrupted
   entry in a JDBM file (all the DIT data is stored in these files)?

The problem with JDBM file corruption is that you lose everything. I don't think a corrupt .db file is recoverable; it needs to be rebuilt.  From what I've seen of user issues caused by corruption, and from past experience, when the file is corrupt the whole file is lost.  It's not just a single record in the db file that is bad, so the entire file needs to be reconstructed.

If the file is an index, this is recoverable.  If it's the master.db, then we have a serious disaster: in this case the entire changelog must be used to rebuild the master.
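
Roughly, the rebuild would have to walk the changelog from the beginning and re-apply every change in order. Something like this, where ChangeEvent, ChangeLogStore and MasterTable are all made-up stand-ins rather than the actual types:

    import java.util.List;

    // All of these types are made-up stand-ins, not the real changelog API.
    enum ChangeType { ADD, MODIFY, DELETE }

    interface ChangeEvent
    {
        ChangeType getType();
        String getId();      // identifier of the affected entry
        byte[] getEntry();   // serialized entry after the change
    }

    interface ChangeLogStore
    {
        List<ChangeEvent> readAll() throws Exception;   // oldest first
    }

    interface MasterTable
    {
        void put( String id, byte[] entry ) throws Exception;
        void delete( String id ) throws Exception;
        void sync() throws Exception;
    }

    public class MasterRebuilder
    {
        // Replay the whole changelog, oldest change first, into a fresh
        // master table, then flush it.
        public static void rebuild( ChangeLogStore changeLog, MasterTable master )
            throws Exception
        {
            for ( ChangeEvent event : changeLog.readAll() )
            {
                switch ( event.getType() )
                {
                    case ADD:
                    case MODIFY:
                        master.put( event.getId(), event.getEntry() );
                        break;
                    case DELETE:
                        master.delete( event.getId() );
                        break;
                }
            }

            master.sync();
        }
    }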

   To put this another way: if a JDBM file was synced at the nth entry and the server crashed in the middle of
   writing the (n+1)th entry, I would like to start recovery from the end of the nth record (a common idea, I believe).
   (I haven't looked at the JDBM code yet, but I'm throwing this question out anticipating a quick answer ;) )

Again, as I said, it's not that simple. I think the JDBM APIs start to fail across the board on corruption, depending on how the corruption impacts access to the BTree. One bad access can cause access to half the entries to fail.

I think your idea would work very well if the journal were well integrated with JDBM at the lowest level.
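
If it were, recovery after a crash could just replay everything after the active mark, along the lines of the sketch below. I'm writing the HOWL ReplayListener usage from memory, so take the method names with a grain of salt, and actually applying a replayed change to the partition is the part left blank:

    import org.objectweb.howl.log.LogException;
    import org.objectweb.howl.log.LogRecord;
    import org.objectweb.howl.log.Logger;
    import org.objectweb.howl.log.ReplayListener;

    public class JournalRecovery implements ReplayListener
    {
        private final Logger journal;

        public JournalRecovery( Logger journal )
        {
            this.journal = journal;
        }

        // Replay every record written after the last successful sync.
        public void recover() throws Exception
        {
            journal.replay( this, journal.getActiveMark() );
        }

        public void onRecord( LogRecord record )
        {
            // Deserializing the record and re-applying the change to the
            // partition would happen here; that is the hard part.
        }

        public void onError( LogException e )
        {
            e.printStackTrace();   // a real implementation would abort recovery
        }

        public LogRecord getLogRecord()
        {
            return new LogRecord( 1024 );   // buffer for replayed records
        }
    }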