From: Alex Karasulu
To: Apache Directory Developers List
Date: Mon, 2 Feb 2009 07:18:15 -0500
Subject: Re: [DRS] thoughts about implementation
In-Reply-To: <4985EEB3.7070406@gmail.com>

Hi Kiran,

On Sun, Feb 1, 2009 at 1:49 PM, Kiran Ayyagari <ayyagarikiran@gmail.com> wrote:
> Hello guys,
>
> Here is an initial idea about the implementation that I have in mind.
>
> HOWL has a feature called 'marking' in its log file (a.k.a. journal). The idea is to use this as a checkpoint recording
> the last successful disk write of the DIT data, i.e. whenever we perform a sync we put a mark in the journal.
> In case of a crash we can retrieve the data from the journal starting at the marked position (using the HOWL API).
>
> Currently the syncing of DIT data is a per-partition operation unless a call is made to
> DirectoryService's sync() method, which internally calls sync on PartitionNexus.
>
> IMO this marking of the journal should happen in the DirectoryService's sync() operation.
>
> A change to the partition's sync() method to call DirectoryService's sync(), which in turn calls (each) partition's
> commit() (a new method), would help. A special flag like 'isDirty' in the partition will allow us to avoid calling
> commit() on each partition.
>
> Any other ideas about how best we can maintain a checkpoint/mark after *any* sync operation on DIT data?
>
> Having said this, I have another issue: how do I detect the beginning of a corrupted
> entry in a JDBM file (all the DIT data is stored in these files)?

The problem with JDBM file corruption is that you lose everything. I don't think the dot.db file is recoverable; it needs to be rebuilt. My impression, from user issues due to corruption and from past experience, is that when the file is corrupt the whole file is lost. It's not a single record in the db file that is bad, so the entire file needs to be reconstructed.

If the file is an index, this is recoverable. If it's the master.db, then we have a serious disaster. In this case the entire changelog must be used to rebuild the master.
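
To make the two recovery cases concrete, here is a rough sketch in Java. It assumes that an index can be regenerated by scanning the master table, and that the changelog holds every change since the partition was created. The class RebuildSketch, the Change record and the dn/cn layout are hypothetical names made up for illustration; none of this is the ApacheDS or JDBM API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: names and record layout are illustrative, not ApacheDS or JDBM APIs. */
final class RebuildSketch {

    /** One changelog record: sets the cn of the entry at dn, or deletes the entry when cn is null. */
    static final class Change {
        final String dn;
        final String cn;

        Change(String dn, String cn) {
            this.dn = dn;
            this.cn = cn;
        }
    }

    /** Replays the whole changelog, oldest record first, into a fresh master table (dn -> cn). */
    static Map<String, String> rebuildMaster(List<Change> changelog) {
        Map<String, String> master = new LinkedHashMap<>();
        for (Change c : changelog) {
            if (c.cn == null) {
                master.remove(c.dn);
            } else {
                master.put(c.dn, c.cn);
            }
        }
        return master;
    }

    /** Regenerates a cn index (cn -> list of dns) by scanning the master table. */
    static Map<String, List<String>> rebuildIndex(Map<String, String> master) {
        Map<String, List<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> e : master.entrySet()) {
            index.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return index;
    }
}
```

A lost index only costs a scan of the master, while a lost master costs a replay of the full changelog followed by a rebuild of every index.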
 

> To put this another way, if a JDBM file was synced at the nth entry and the server crashed in the middle of
> writing the (n+1)th entry, I would like to start recovery from the end of the nth record (a common idea, I believe)
> (haven't looked at the JDBM code yet, but throwing this question out anticipating a quick answer ;) )

Again, like I said, it's not this simple. I think the JDBM APIs start to fail overall on corruption, depending on how the corruption impacts access to the BTree. One bad access can cause access to half the entries to fail.

I think your idea would work very well if the journal were well integrated with JDBM at the lowest level.
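
For reference, the mark-on-sync flow could look roughly like the sketch below. It uses a hypothetical in-memory journal in place of HOWL, and SketchJournal, SketchPartition and SketchDirectoryService are invented names, so none of it reflects the real HOWL or ApacheDS APIs. It also assumes the stored data survives a crash intact, which is exactly what does not hold for a corrupted JDBM file.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for a write-ahead journal; this is NOT the HOWL API. */
final class SketchJournal {
    private final List<String> records = new ArrayList<>();
    private int mark = 0; // index of the first record not covered by a completed sync

    /** Appends a change record before the change is applied to a partition. */
    void append(String change) {
        records.add(change);
    }

    /** Moves the checkpoint mark to the current end of the journal. */
    void markSynced() {
        mark = records.size();
    }

    /** Returns the records written after the last mark, i.e. the ones to replay after a crash. */
    List<String> recordsSinceMark() {
        return new ArrayList<>(records.subList(mark, records.size()));
    }
}

/** Hypothetical partition carrying the 'isDirty' flag from the proposal. */
final class SketchPartition {
    private final List<String> pending = new ArrayList<>(); // changes not yet flushed
    private final List<String> flushed = new ArrayList<>(); // stands in for data on disk
    private boolean dirty = false;

    void apply(String change) {
        pending.add(change);
        dirty = true;
    }

    boolean isDirty() {
        return dirty;
    }

    /** Flushes pending changes and clears the dirty flag. */
    void commit() {
        flushed.addAll(pending);
        pending.clear();
        dirty = false;
    }

    List<String> flushed() {
        return new ArrayList<>(flushed);
    }
}

/** Hypothetical service: sync() commits only dirty partitions, then marks the journal. */
final class SketchDirectoryService {
    final SketchJournal journal = new SketchJournal();
    final List<SketchPartition> partitions = new ArrayList<>();

    void modify(SketchPartition partition, String change) {
        journal.append(change); // journal first, then the partition
        partition.apply(change);
    }

    void sync() {
        for (SketchPartition p : partitions) {
            if (p.isDirty()) {
                p.commit();
            }
        }
        journal.markSynced(); // mark only after every dirty partition has been committed
    }
}
```

After a crash, only recordsSinceMark() would need to be replayed. The mark is placed after all commits, so a crash in the middle of sync() replays changes that some partitions have already flushed, and replay would have to be idempotent.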

Regards,
Alex
