couchdb-dev mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: data recovery tool progress
Date Tue, 10 Aug 2010 06:26:51 GMT
With Randall's help we hooked the new node scanner up to the lost+found
DB generator. It seems to work well enough for small DBs; for large DBs
with lots of missing nodes the O(N^2) complexity of the problem catches
up to the code and generating the lost+found DB takes quite some time.
Mikeal is running tests tonight. The algo appears pretty CPU-limited,
so a little parallelization may be warranted.
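
The parallelization I have in mind is roughly this (just a sketch, not
code in the branch; scan_slice stands in for the chunk scanner, and in
real code the slices would overlap by a few bytes so a match can't fall
through a slice boundary):

    scan_in_parallel(Fd, FileSize, NumWorkers) ->
        ChunkSize = FileSize div NumWorkers,
        Offsets = [N * ChunkSize || N <- lists:seq(0, NumWorkers - 1)],
        Parent = self(),
        Pids = [spawn_link(fun() ->
                    %% each worker scans its own slice; the leftover
                    %% bytes of the last slice are glossed over here
                    Parent ! {self(), scan_slice(Fd, Off, ChunkSize)}
                end) || Off <- Offsets],
        lists:append([receive {Pid, Nodes} -> Nodes end || Pid <- Pids]).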

http://github.com/kocolosk/couchdb/tree/db_repair

Adam

(I sent this previous update to myself instead of the list, so I'll forward it here ...)

On Aug 10, 2010, at 12:01 AM, Adam Kocoloski wrote:

> On Aug 9, 2010, at 10:10 PM, Adam Kocoloski wrote:
> 
>> Right, make_lost_and_found still relies on code which reads through
>> couch_file one byte at a time; that's the cause of the slowness. The
>> newer scanner will improve that pretty dramatically, and we can tune
>> it further by increasing the length of the pattern that we match when
>> looking for kp/kv_node terms in the files, at the expense of some
>> extra complexity dealing with the block prefixes (currently it does a
>> 1-byte match, which as I understand it cannot be split across blocks).
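>>
>> To make that concrete: a longer pattern would key on the serialized
>> atom tag of the node tuples. A sketch of the idea (not the branch
>> code; in the external term format, 100,0,7 is ATOM_EXT with a 7-byte
>> name following):
>>
>>     %% walk a chunk byte by byte, recording offsets where a kv_node
>>     %% or kp_node atom tag begins; these are candidate bt nodes
>>     scan(Chunk) -> scan(Chunk, 0, []).
>>
>>     scan(<<100,0,7,"kv_node",_/binary>> = B, Off, Acc) ->
>>         <<_, Rest/binary>> = B,
>>         scan(Rest, Off + 1, [{kv_node, Off} | Acc]);
>>     scan(<<100,0,7,"kp_node",_/binary>> = B, Off, Acc) ->
>>         <<_, Rest/binary>> = B,
>>         scan(Rest, Off + 1, [{kp_node, Off} | Acc]);
>>     scan(<<_, Rest/binary>>, Off, Acc) ->
>>         scan(Rest, Off + 1, Acc);
>>     scan(<<>>, _Off, Acc) ->
>>         lists:reverse(Acc).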
> 
> The scanner now looks for a 7-byte match, unless it is within 6 bytes
> of a block boundary, in which case it looks for the longest possible
> match at that position. The more specific match condition greatly
> reduces the # of calls to couch_file, and thus boosts the throughput.
> On my laptop it can scan the testwritesdb.couch from Mikeal's
> couchtest repo (52 MB) in 18 seconds.
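>
> In pseudo-Erlang the boundary rule is something like this (a sketch
> only; 4KB blocks assumed):
>
>     match_length(Pos) ->
>         BlockSize = 4096,
>         BytesToBoundary = BlockSize - (Pos rem BlockSize),
>         %% full 7-byte pattern unless we're within 6 bytes of the
>         %% next block prefix, then the longest match that fits
>         erlang:min(7, BytesToBoundary).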
> 
>> Regarding the file_corruption error on the larger file, I think this
>> is something we will just naturally trigger when we take a guess that
>> random positions in a file are actually the beginning of a term. I
>> think our best recourse here is to return {error, file_corruption}
>> from couch_file but leave the gen_server up and running instead of
>> terminating it. That way the repair code can ignore the error and
>> keep moving without having to reopen the file.
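>>
>> In couch_file terms the change is small. A sketch (the clause and
>> helper names here are illustrative, not the module's actual ones):
>>
>>     handle_call({pread_term, Pos}, _From, Fd) ->
>>         case (catch read_term(Fd, Pos)) of
>>             {ok, Term} ->
>>                 {reply, {ok, Term}, Fd};
>>             _Garbage ->
>>                 %% reply with the error and keep the gen_server
>>                 %% alive, rather than {stop, file_corruption, Fd}
>>                 {reply, {error, file_corruption}, Fd}
>>         end.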
> 
> I committed this change (to my db_repair branch) after consulting
> with Chris. The longer match condition makes these spurious
> file_corruption triggers much less likely, but I think it's still a
> good thing not to crash the server when they happen.
> 
>> Next steps as I understand them: Randall is working on integrating
>> the in-memory scanner into Volker's code that finds all the dangling
>> by_id nodes. I'm working on making sure that the scanner identifies
>> bt node candidates which span block prefixes, and on improving its
>> pattern-matching.
> 
> Latest from my end
> http://github.com/kocolosk/couchdb/tree/db_repair
> 
>> 
>> Adam
>> 
>> On Aug 9, 2010, at 9:50 PM, Mikeal Rogers wrote:
>> 
>>> I pulled down the latest code from Adam's branch @
>>> 7080ff72baa329cf6c4be2a79e71a41f744ed93b.
>>> 
>>> Running timer:tc(couch_db_repair, make_lost_and_found, ["multi_conflict"]).
>>> on a database with 200 lost updates spanning 200 restarts (
>>> http://github.com/mikeal/couchtest/blob/master/multi_conflict.couch ) took
>>> about 101 seconds.
>>> 
>>> I tried running against a larger database (
>>> http://github.com/mikeal/couchtest/blob/master/testwritesdb.couch ) and I
>>> got this exception:
>>> 
>>> http://gist.github.com/516491
>>> 
>>> -Mikeal
>>> 
>>> 
>>> 
>>> On Mon, Aug 9, 2010 at 6:09 PM, Randall Leeds <randall.leeds@gmail.com>wrote:
>>> 
>>>> Summing up what went on in IRC for those who were absent.
>>>> 
>>>> The latest progress is on Adam's branch at
>>>> http://github.com/kocolosk/couchdb/tree/db_repair
>>>> 
>>>> couch_db_repair:make_lost_and_found/1 attempts to create a new
>>>> lost+found/DbName database into which it merges all nodes not
>>>> accessible from anywhere (i.e. from any other node found in a full
>>>> file scan, or from any header pointers).
>>>> 
>>>> Currently, make_lost_and_found uses Volker's repair (from
>>>> couch_db_repair_b module, also in Adam's branch).
>>>> Adam found that the bottleneck was the couch_file calls and that
>>>> the repair process was taking a very long time, so he added
>>>> couch_db_repair:find_nodes_quickly/1, which reads 1MB chunks as
>>>> binary and processes them to find nodes instead of scanning back
>>>> one byte at a time. It is currently not hooked up to the repair
>>>> mechanism.
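>>>>
>>>> The chunked scan is shaped roughly like this (a sketch of the
>>>> approach, not the branch code; find_nodes_in_chunk is a
>>>> placeholder, and overlap between adjacent chunks still needs
>>>> handling):
>>>>
>>>>     scan_file(Fd, Pos, Acc) ->
>>>>         case file:pread(Fd, Pos, 1024 * 1024) of
>>>>             {ok, Chunk} ->
>>>>                 Acc1 = find_nodes_in_chunk(Chunk, Pos, Acc),
>>>>                 scan_file(Fd, Pos + byte_size(Chunk), Acc1);
>>>>             eof ->
>>>>                 Acc
>>>>         end.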
>>>> 
>>>> Making progress. Go team.
>>>> 
>>>> On Mon, Aug 9, 2010 at 13:52, Mikeal Rogers <mikeal.rogers@gmail.com>
>>>> wrote:
>>>>> jchris suggested on IRC that I try a normal doc update and see if
>>>>> that fixes it.
>>>>> 
>>>>> It does. After a new doc was created the dbinfo doc count was back to
>>>>> normal.
>>>>> 
>>>>> -Mikeal
>>>>> 
>>>>> On Mon, Aug 9, 2010 at 1:39 PM, Mikeal Rogers
>>>>> <mikeal.rogers@gmail.com> wrote:
>>>>> 
>>>>>> Ok, I pulled down this code and tested against a database with a
>>>>>> ton of missing writes right before a single restart.
>>>>>> 
>>>>>> Before restart this was the database:
>>>>>> 
>>>>>> {
>>>>>> db_name: "testwritesdb"
>>>>>> doc_count: 124969
>>>>>> doc_del_count: 0
>>>>>> update_seq: 124969
>>>>>> purge_seq: 0
>>>>>> compact_running: false
>>>>>> disk_size: 54857478
>>>>>> instance_start_time: "1281384140058211"
>>>>>> disk_format_version: 5
>>>>>> }
>>>>>> 
>>>>>> After restart it was this:
>>>>>> 
>>>>>> {
>>>>>> db_name: "testwritesdb"
>>>>>> doc_count: 1
>>>>>> doc_del_count: 0
>>>>>> update_seq: 1
>>>>>> purge_seq: 0
>>>>>> compact_running: false
>>>>>> disk_size: 54857478
>>>>>> instance_start_time: "1281384593876026"
>>>>>> disk_format_version: 5
>>>>>> }
>>>>>> 
>>>>>> After repair, it's this:
>>>>>> 
>>>>>> {
>>>>>> db_name: "testwritesdb"
>>>>>> doc_count: 1
>>>>>> doc_del_count: 0
>>>>>> update_seq: 124969
>>>>>> purge_seq: 0
>>>>>> compact_running: false
>>>>>> disk_size: 54857820
>>>>>> instance_start_time: "1281385990193289"
>>>>>> disk_format_version: 5
>>>>>> committed_update_seq: 124969
>>>>>> }
>>>>>> 
>>>>>> All the sequences are there and hitting _all_docs shows all the
>>>>>> documents, so why is the doc_count only 1 in the dbinfo?
>>>>>> 
>>>>>> -Mikeal
>>>>>> 
>>>>>> On Mon, Aug 9, 2010 at 11:53 AM, Filipe David Manana
>>>>>> <fdmanana@apache.org> wrote:
>>>>>> 
>>>>>>> For the record (and people not on IRC), the code at:
>>>>>>> 
>>>>>>> http://github.com/fdmanana/couchdb/commits/db_repair
>>>>>>> 
>>>>>>> is working for at least simple cases. Use
>>>>>>> couch_db_repair:repair(DbNameAsString).
>>>>>>> There's one TODO: update the reduce values for the by_seq and
>>>>>>> by_id BTrees.
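>>>>>>>
>>>>>>> One simple (if slow) way to do that, assuming couch_btree's
>>>>>>> usual foldl/add API: collect every kv and re-add them into a
>>>>>>> fresh btree, letting couch_btree compute the reductions itself:
>>>>>>>
>>>>>>>     rebuild_reductions(Bt, EmptyBt) ->
>>>>>>>         %% fold out all kvs, then add them to an empty btree,
>>>>>>>         %% which recomputes the inner-node reduce values
>>>>>>>         {ok, _, KVs} = couch_btree:foldl(Bt,
>>>>>>>             fun(KV, _Reds, Acc) -> {ok, [KV | Acc]} end, []),
>>>>>>>         couch_btree:add(EmptyBt, lists:reverse(KVs)).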
>>>>>>> 
>>>>>>> If anyone wants to give some help on this, you're welcome.
>>>>>>> 
>>>>>>> On Mon, Aug 9, 2010 at 6:12 PM, Mikeal Rogers
>>>>>>> <mikeal.rogers@gmail.com> wrote:
>>>>>>> 
>>>>>>>> I'm starting to create a bunch of test db files that expose
>>>>>>>> this bug under different conditions like multiple restarts,
>>>>>>>> across compaction, variances in updates that might cause
>>>>>>>> conflicts, etc.
>>>>>>>> 
>>>>>>>> http://github.com/mikeal/couchtest
>>>>>>>> 
>>>>>>>> The README outlines what was done to the db's and what needs
>>>>>>>> to be recovered.
>>>>>>>> 
>>>>>>>> -Mikeal
>>>>>>>> 
>>>>>>>> On Mon, Aug 9, 2010 at 9:33 AM, Filipe David Manana
>>>>>>>> <fdmanana@apache.org> wrote:
>>>>>>>> 
>>>>>>>>> On Mon, Aug 9, 2010 at 5:22 PM, Robert Newson
>>>>>>>>> <robert.newson@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Doesn't this bit:
>>>>>>>>>> 
>>>>>>>>>> -        Db#db{waiting_delayed_commit=nil};
>>>>>>>>>> +        Db;
>>>>>>>>>> +        % Db#db{waiting_delayed_commit=nil};
>>>>>>>>>> 
>>>>>>>>>> revert the bug fix?
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> That's intentional, for my local testing.
>>>>>>>>> That patch obviously isn't anything close to final; it's still
>>>>>>>>> too experimental.
>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> B.
>>>>>>>>>> 
>>>>>>>>>> On Mon, Aug 9, 2010 at 5:09 PM, Jan Lehnardt <jan@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi All,
>>>>>>>>>>> 
>>>>>>>>>>> Filipe jumped in to start working on the recovery tool, but
>>>>>>>>>>> he isn't done yet.
>>>>>>>>>>> 
>>>>>>>>>>> Here's the current patch:
>>>>>>>>>>> 
>>>>>>>>>>> http://www.friendpaste.com/4uMngrym4r7Zz4R0ThSHbz
>>>>>>>>>>> 
>>>>>>>>>>> it is not done and very early, but any help on this is
>>>>>>>>>>> greatly appreciated.
>>>>>>>>>>> 
>>>>>>>>>>> The current state is (in Filipe's words):
>>>>>>>>>>> - i can detect that a file needs repair
>>>>>>>>>>> - and get the last btree roots from it
>>>>>>>>>>> - "only" missing: get last db seq num
>>>>>>>>>>> - write new header (see the sketch below)
>>>>>>>>>>> - and deal with the local docs btree (if it exists)
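>>>>>>>>>>>
>>>>>>>>>>> For the header step, something along these lines should work
>>>>>>>>>>> (a sketch, assuming the #db_header{} record from
>>>>>>>>>>> couch_db.hrl and roots/seq recovered by the scan):
>>>>>>>>>>>
>>>>>>>>>>>     write_repaired_header(Fd, IdRoot, SeqRoot, LocalRoot, Seq) ->
>>>>>>>>>>>         Header = #db_header{
>>>>>>>>>>>             update_seq = Seq,
>>>>>>>>>>>             fulldocinfo_by_id_btree_state = IdRoot,
>>>>>>>>>>>             docinfo_by_seq_btree_state = SeqRoot,
>>>>>>>>>>>             local_docs_btree_state = LocalRoot},
>>>>>>>>>>>         couch_file:write_header(Fd, Header).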
>>>>>>>>>>> 
>>>>>>>>>>> Thanks!
>>>>>>>>>>> Jan
>>>>>>>>>>> --
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Filipe David Manana,
>>>>>>>>> fdmanana@apache.org
>>>>>>>>> 
>>>>>>>>> "Reasonable men adapt themselves to the world.
>>>>>>>>> Unreasonable men adapt the world to themselves.
>>>>>>>>> That's why all progress depends on unreasonable men."
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Filipe David Manana,
>>>>>>> fdmanana@apache.org
>>>>>>> 
>>>>>>> "Reasonable men adapt themselves to the world.
>>>>>>> Unreasonable men adapt the world to themselves.
>>>>>>> That's why all progress depends on unreasonable men."
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>> 
> 

