Message-ID: <4A92A3DC.9060602@baxter-it.com>
Date: Mon, 24 Aug 2009 16:29:48 +0200
From: Gabor 'Morc' KORMOS <morc@baxter-it.com>
To: Derby Discussion <derby-user@db.apache.org>
Subject: Re: Corrupted database

  As promised, here's the update on this. I tried to fill the data file up
to the same size as the corrupted file, but it turned out that records do
get deleted from that table. Anyhow, after copying page 0 the corrupted
file sprang back to life and there were records in the table, although I
have no idea whether they were only the records that were active before
the corruption or included deleted records too. It was not necessary to
recover the data, so I did not spend more time on it. I'm sure that with
relatively little effort, and by studying the page/file structure
documented on the website, one could reconstruct page 0 and recover the data.
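
  For anyone repeating this, the page 0 copy itself can be sketched roughly
as below. This is only an illustration under assumptions, not a tested
recovery tool: the function name copy_page0 and the file paths are
hypothetical, it assumes page 0 occupies the first page_size bytes of the
seg0 file, and the 4096 default is just a placeholder that must be replaced
by the real page size of the table. Both databases must be shut down
cleanly, and the target should be an offline copy.

```python
import shutil
from pathlib import Path


def copy_page0(dummy_file, target_file, page_size=4096, backup_suffix=".bak"):
    """Overwrite the first page of target_file with the first page of dummy_file.

    Assumption: page 0 is the first `page_size` bytes of each file, and both
    files use the same page size. A backup of the target is written first.
    """
    dummy = Path(dummy_file)
    target = Path(target_file)

    # Refuse to work on files that are too small to contain a full page 0.
    if dummy.stat().st_size < page_size or target.stat().st_size < page_size:
        raise ValueError("both files must be at least one page long")

    # Keep an untouched copy of the target next to it before modifying it.
    backup = target.with_name(target.name + backup_suffix)
    shutil.copy2(target, backup)

    # Read page 0 from the dummy table's file.
    with open(dummy, "rb") as src:
        page0 = src.read(page_size)

    # Overwrite page 0 in place; "r+b" leaves the rest of the file untouched.
    with open(target, "r+b") as dst:
        dst.seek(0)
        dst.write(page0)

    return backup


# Hypothetical usage (paths are placeholders, not real conglomerate names):
# copy_page0("dummydb/seg0/cXXX.dat", "copy_of_baddb/seg0/cYYY.dat", page_size=4096)
```

  After the copy, boot the copied database and select the data out into a
fresh database rather than continuing to use the patched one.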

  Regards,

  Morc.

On 23/06/2009 11:09, Gabor 'Morc' KORMOS wrote:
>  Hi Mike,
>
>  Brilliant idea! I'll give it a try, and I appreciate your help very
> much! There is one problem: I don't know how many records there are in
> the table, but I'll insert until I reach the same file size. If I
> understand Sonar correctly, this table is never deleted from or updated,
> just inserted into and queried. I'll let you know how it goes, so that
> others also have an answer as to whether this solution works or not.
>
>  Regards,
>
>  Morc.
>
> On 23/06/2009 01:36, Mike Matrigali wrote:
>> Along these lines, if you don't want to hack the code I would try the
>> following (note that I have never tried this, so I have no idea if it
>> will work; I thought about it for a while, looked at the headers, and
>> could not come up with a reason why it would not work). For this you
>> will need to know the seg0 file associated with the table you are trying
>> to recover and the seg0 file you will create for a new dummy table.
>>
>> o Shut down the db you are working with cleanly with shutdown=true, make
>>   an offline copy of it and only work on the copy.
>> o In another db create a new table that has the same ddl as the original
>>   table; I will call this the dummy table. The most important part of
>>   this is that the new table has the same page size as the table you are
>>   trying to recover. If you use the same ddl and you didn't set any page
>>   size properties when you created the original table then the page size
>>   should match. The size and structure of the allocation part of page 0
>>   is different for each of the 4 supported page sizes (2k, 4k, 8k, 32k):
>>   basically there is a fixed header and "the rest" is used for the
>>   allocation page. Thus the bigger the page size, the more pages the
>>   allocation bit map on page 0 controls.
>> o Now insert as many rows as necessary into the dummy table such that
>>   it is as big as the table you are trying to recover. The goal here is
>>   to get the allocation map in page 0 to mark all the pages as allocated
>>   and in use.
>> o Now shut down cleanly (i.e. shutdown=true); if you don't do this then
>>   the changes may only be in the log and not in the seg0 file.
>> o Now, with both dbs shut down cleanly, copy page 0 from the dummy file
>>   over page 0 of the copied database table that you are trying to
>>   recover, then try booting and checking the table.
>>   On a unix system I think this can easily be done with one or two dd
>>   commands; let me know if you need more info. Again, do this only while
>>   the dbs are shut down cleanly, otherwise all sorts of recovery problems
>>   may happen.
>> o And of course after you do this you should run the consistency
>>   checker to see what else may be corrupted; note that the consistency
>>   checker does not check everything. So I usually recommend that the only
>>   safe thing to do when using this kind of data corruption mining is to
>>   select the recovered data out of the bad db and then insert it into a
>>   newly created good db, otherwise the corruptions may lurk around and
>>   bite you later. The checker is not perfect; it mostly does a good job
>>   of checking that the index trees are consistent internally and that
>>   they are consistent with the base tables. For instance, I don't think
>>   it even reads data in the base table that is not needed to check the
>>   indexes.
>>
>> /mikem
>>
>> Bryan Pendleton wrote:
>>>>  Anyone adventurous enough to help me with this problem? Even a 
>>>> yes/no to the question whether it's possible and maybe some 
>>>> pointers how to reconstruct page 0 would be a lot of help I think.
>>>
>>> Well, anything's possible, since it's open source software and so you
>>> can change it and make it do what you need. However, this doesn't sound
>>> like a very easy thing to do. You've definitely exhausted all possible
>>> sources of backups?
>>>
>>> Here's some fairly high-level information about page formats:
>>> http://db.apache.org/derby/papers/pageformats.html
>>>
>>> My first thought would be that if you could make page 0 appear to look
>>> as though ALL other pages in the conglomerate were "in-use", so that it
>>> seemed to have no pages marked "available", then you could try opening
>>> the database and reading all the data out.
>>>
>>> So you'd like the FreePages bitmap to be empty, for this recovery
>>> scenario.
>>>
>>> thanks,
>>>
>>> bryan