incubator-couchdb-dev mailing list archives

From Robert Newson <robert.new...@gmail.com>
Subject Re: data recovery tool progress
Date Tue, 10 Aug 2010 08:55:35 GMT
I ran the db_repair code on a healthy database produced with
delayed_commits=true.

The source db had 3218 docs. db_repair recovered 3120 and then returned with ok.

I'm redoing that test, but this indicates we're not finding all roots.

I note that the output file was 36 times the size of the input file, which is a
consequence of folding all possible roots (documents reachable from more than
one candidate root presumably get written more than once). I think that needs
to be in the release notes for the repair tool if that behavior remains when it
ships.

B.

On Tue, Aug 10, 2010 at 9:09 AM, Mikeal Rogers <mikeal.rogers@gmail.com> wrote:
> I think I found a bug in the current lost+found repair.
>
> I've been running it against the testwritesdb and it's in a state where it
> never finishes.
>
> It's still spitting out these lines:
>
> [info] [<0.32.0>] writing 1001 updates to lost+found/testwritesdb
>
> Most are 1001, but there are also other random variations: 452, 866, etc.
>
> But the file size and dbinfo haven't budged in over 30 minutes. The size is
> stuck at 34300002, with the original db file being 54857478.
>
> This database only has one document in it that isn't "lost" so if it's
> finding *any* new docs it should be writing them.
>
> I also started another job to recover a production db that is quite large
> (500 MB), with the missing data from a week or so back. This has been running
> for 2 hours and still has not output anything or created the lost+found db,
> so I can only assume that it is in the same state.
>
> Both machines are still churning 100% CPU.
>
> -Mikeal
>
>
> On Mon, Aug 9, 2010 at 11:26 PM, Adam Kocoloski <kocolosk@apache.org> wrote:
>
>> With Randall's help we hooked the new node scanner up to the lost+found DB
>> generator.  It seems to work well enough for small DBs; for large DBs with
>> lots of missing nodes the O(N^2) complexity of the problem catches up to the
>> code and generating the lost+found DB takes quite some time.  Mikeal is
>> running tests tonight.  The algo appears pretty CPU-limited, so a little
>> parallelization may be warranted.
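
(As an aside on the parallelization idea: a minimal sketch of one way to split
the scan across processes, purely illustrative; scan_range/3 is a made-up
stand-in for whatever per-chunk node finding the real scanner does.)

    pscan(Fd, FileSize, ChunkSize) ->
        Parent = self(),
        Pids = [spawn_link(fun() ->
                    Len = lists:min([ChunkSize, FileSize - Start]),
                    Parent ! {self(), scan_range(Fd, Start, Len)}
                end) || Start <- lists:seq(0, FileSize - 1, ChunkSize)],
        lists:append([receive {Pid, Nodes} -> Nodes end || Pid <- Pids]).

    scan_range(_Fd, _Start, _Len) ->
        %% stand-in: the real code would pread this range and collect
        %% candidate kp/kv node offsets
        [].
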
>>
>> http://github.com/kocolosk/couchdb/tree/db_repair
>>
>> Adam
>>
>> (I sent this previous update to myself instead of the list, so I'll forward
>> it here ...)
>>
>> On Aug 10, 2010, at 12:01 AM, Adam Kocoloski wrote:
>>
>> > On Aug 9, 2010, at 10:10 PM, Adam Kocoloski wrote:
>> >
>> >> Right, make_lost_and_found still relies on code which reads through
>> couch_file one byte at a time; that's the cause of the slowness.  The newer
>> scanner will improve that pretty dramatically, and we can tune it further by
>> increasing the length of the pattern that we match when looking for
>> kp/kv_node terms in the files, at the expense of some extra complexity
>> dealing with the block prefixes (currently it does a 1-byte match, which as
>> I understand it cannot be split across blocks).
>> >
>> > The scanner now looks for a 7 byte match, unless it is within 6 bytes of
>> a block boundary, in which case it looks for the longest possible match at
>> that position.  The more specific match condition greatly reduces the # of
>> calls to couch_file, and thus boosts the throughput.  On my laptop it can
>> scan the testwritesdb.couch from Mikeal's couchtest repo (52 MB) in 18
>> seconds.
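
(Rough sketch of the boundary rule just described, illustrative only and
assuming couch_file's usual 4096-byte blocks: the full 7-byte pattern is
required unless the position is within 6 bytes of the next block boundary,
in which case only what fits before the boundary can be matched.)

    match_len(Pos) ->
        BlockSize = 4096,
        ToBoundary = BlockSize - (Pos rem BlockSize),
        case ToBoundary < 7 of
            true  -> ToBoundary;  %% near a boundary: match what fits
            false -> 7            %% otherwise require the full 7-byte match
        end.
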
>> >
>> >> Regarding the file_corruption error on the larger file, I think this is
>> something we will just naturally trigger when we take a guess that random
>> positions in a file are actually the beginning of a term.  I think our best
>> recourse here is to return {error, file_corruption} from couch_file but
>> leave the gen_server up and running instead of terminating it.  That way the
>> repair code can ignore the error and keep moving without having to reopen
>> the file.
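
(For illustration, the general shape of that change in gen_server terms; the
names here are made up, not the actual couch_file internals:)

    handle_call({pread_term, Pos}, _From, State) ->
        case try_read_term(State, Pos) of  %% try_read_term/2 is hypothetical
            {ok, Term} ->
                {reply, {ok, Term}, State};
            {error, file_corruption} ->
                %% previously this path would {stop, file_corruption, State},
                %% taking the whole server down; replying keeps it alive so
                %% repair can ignore the error and move on
                {reply, {error, file_corruption}, State}
        end;
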
>> >
>> > I committed this change (to my db_repair branch) after consulting with
>> Chris.  The longer match condition makes these spurious file_corruption
>> triggers much less likely, but I think it's still a good thing not to crash
>> the server when they happen.
>> >
>> >> Next steps as I understand them - Randall is working on integrating the
>> in-memory scanner into Volker's code that finds all the dangling by_id
>> nodes.  I'm working on making sure that the scanner identifies bt node
>> candidates which span block prefixes, and on improving its pattern-matching.
>> >
>> > Latest from my end
>> > http://github.com/kocolosk/couchdb/tree/db_repair
>> >
>> >>
>> >> Adam
>> >>
>> >> On Aug 9, 2010, at 9:50 PM, Mikeal Rogers wrote:
>> >>
>> >>> I pulled down the latest code from Adam's branch @
>> >>> 7080ff72baa329cf6c4be2a79e71a41f744ed93b.
>> >>>
>> >>> Running timer:tc(couch_db_repair, make_lost_and_found, ["multi_conflict"]).
>> >>> on a database with 200 lost updates spanning 200 restarts (
>> >>> http://github.com/mikeal/couchtest/blob/master/multi_conflict.couch )
>> >>> took about 101 seconds.
>> >>>
>> >>> I tried running against a larger database (
>> >>> http://github.com/mikeal/couchtest/blob/master/testwritesdb.couch )
>> >>> and I got this exception:
>> >>>
>> >>> http://gist.github.com/516491
>> >>>
>> >>> -Mikeal
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Aug 9, 2010 at 6:09 PM, Randall Leeds <randall.leeds@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> Summing up what went on in IRC for those who were absent.
>> >>>>
>> >>>> The latest progress is on Adam's branch at
>> >>>> http://github.com/kocolosk/couchdb/tree/db_repair
>> >>>>
>> >>>> couch_db_repair:make_lost_and_found/1 attempts to create a new
>> >>>> lost+found/DbName database to which it merges all nodes not accessible
>> >>>> from anywhere (any other node found in a full file scan or any header
>> >>>> pointers).
>> >>>>
>> >>>> Currently, make_lost_and_found uses Volker's repair (from
>> >>>> couch_db_repair_b module, also in Adam's branch).
>> >>>> Adam found that the bottleneck was couch_file calls and that the
>> >>>> repair process was taking a very long time, so he added
>> >>>> couch_db_repair:find_nodes_quickly/1, which reads 1MB chunks as binary
>> >>>> and processes them to find nodes instead of scanning back one
>> >>>> byte at a time. It is currently not hooked up to the repair mechanism.
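
(Illustrative only, not the actual find_nodes_quickly/1 code: the general
shape of such a chunked scan, reading 1MB at a time with file:pread;
find_candidates/2 is a made-up stand-in for the pattern matching over each
chunk.)

    scan_file(Fd, FileSize) ->
        scan_file(Fd, FileSize, 0, []).

    scan_file(_Fd, FileSize, Pos, Acc) when Pos >= FileSize ->
        lists:append(lists:reverse(Acc));
    scan_file(Fd, FileSize, Pos, Acc) ->
        {ok, Bin} = file:pread(Fd, Pos, 1024 * 1024),
        Candidates = find_candidates(Bin, Pos),
        scan_file(Fd, FileSize, Pos + byte_size(Bin), [Candidates | Acc]).
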
>> >>>>
>> >>>> Making progress. Go team.
>> >>>>
>> >>>> On Mon, Aug 9, 2010 at 13:52, Mikeal Rogers <mikeal.rogers@gmail.com>
>> >>>> wrote:
>> >>>>> jchris suggested on IRC that I try a normal doc update and see if
>> >>>>> that fixes it.
>> >>>>>
>> >>>>> It does. After a new doc was created the dbinfo doc count was back
>> >>>>> to normal.
>> >>>>>
>> >>>>> -Mikeal
>> >>>>>
>> >>>>> On Mon, Aug 9, 2010 at 1:39 PM, Mikeal Rogers <mikeal.rogers@gmail.com>
>> >>>>> wrote:
>> >>>>>
>> >>>>>> Ok, I pulled down this code and tested against a database with a
>> >>>>>> ton of missing writes right before a single restart.
>> >>>>>>
>> >>>>>> Before restart this was the database:
>> >>>>>>
>> >>>>>> {
>> >>>>>> db_name: "testwritesdb"
>> >>>>>> doc_count: 124969
>> >>>>>> doc_del_count: 0
>> >>>>>> update_seq: 124969
>> >>>>>> purge_seq: 0
>> >>>>>> compact_running: false
>> >>>>>> disk_size: 54857478
>> >>>>>> instance_start_time: "1281384140058211"
>> >>>>>> disk_format_version: 5
>> >>>>>> }
>> >>>>>>
>> >>>>>> After restart it was this:
>> >>>>>>
>> >>>>>> {
>> >>>>>> db_name: "testwritesdb"
>> >>>>>> doc_count: 1
>> >>>>>> doc_del_count: 0
>> >>>>>> update_seq: 1
>> >>>>>> purge_seq: 0
>> >>>>>> compact_running: false
>> >>>>>> disk_size: 54857478
>> >>>>>> instance_start_time: "1281384593876026"
>> >>>>>> disk_format_version: 5
>> >>>>>> }
>> >>>>>>
>> >>>>>> After repair, it's this:
>> >>>>>>
>> >>>>>> {
>> >>>>>> db_name: "testwritesdb"
>> >>>>>> doc_count: 1
>> >>>>>> doc_del_count: 0
>> >>>>>> update_seq: 124969
>> >>>>>> purge_seq: 0
>> >>>>>> compact_running: false
>> >>>>>> disk_size: 54857820
>> >>>>>> instance_start_time: "1281385990193289"
>> >>>>>> disk_format_version: 5
>> >>>>>> committed_update_seq: 124969
>> >>>>>> }
>> >>>>>>
>> >>>>>> All the sequences are there and hitting _all_docs shows all the
>> >>>>>> documents, so why is the doc_count only 1 in the dbinfo?
>> >>>>>>
>> >>>>>> -Mikeal
>> >>>>>>
>> >>>>>> On Mon, Aug 9, 2010 at 11:53 AM, Filipe David Manana
>> >>>>>> <fdmanana@apache.org> wrote:
>> >>>>>>
>> >>>>>>> For the record (and people not on IRC), the code at:
>> >>>>>>>
>> >>>>>>> http://github.com/fdmanana/couchdb/commits/db_repair
>> >>>>>>>
>> >>>>>>> is working for at least simple cases. Use
>> >>>>>>> couch_db_repair:repair(DbNameAsString).
>> >>>>>>> There's one TODO: update the reduce values for the by_seq and
>> >>>>>>> by_id BTrees.
>> >>>>>>>
>> >>>>>>> If anyone wants to give some help on this, you're welcome.
>> >>>>>>>
>> >>>>>>> On Mon, Aug 9, 2010 at 6:12 PM, Mikeal Rogers
>> >>>>>>> <mikeal.rogers@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>>> I'm starting to create a bunch of test db files that expose this
>> >>>>>>>> bug under different conditions like multiple restarts, across
>> >>>>>>>> compaction, variances in updates that might cause conflicts, etc.
>> >>>>>>>>
>> >>>>>>>> http://github.com/mikeal/couchtest
>> >>>>>>>>
>> >>>>>>>> The README outlines what was done to the db's and what needs to
>> >>>>>>>> be recovered.
>> >>>>>>>>
>> >>>>>>>> -Mikeal
>> >>>>>>>>
>> >>>>>>>> On Mon, Aug 9, 2010 at 9:33 AM, Filipe David Manana
>> >>>>>>>> <fdmanana@apache.org> wrote:
>> >>>>>>>>
>> >>>>>>>>> On Mon, Aug 9, 2010 at 5:22 PM, Robert Newson
>> >>>>>>>>> <robert.newson@gmail.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Doesn't this bit:
>> >>>>>>>>>>
>> >>>>>>>>>> -        Db#db{waiting_delayed_commit=nil};
>> >>>>>>>>>> +        Db;
>> >>>>>>>>>> +        % Db#db{waiting_delayed_commit=nil};
>> >>>>>>>>>>
>> >>>>>>>>>> revert the bug fix?
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> That's intentional, for my local testing.
>> >>>>>>>>> That patch obviously isn't anything close to final; it's still
>> >>>>>>>>> too experimental.
>> >>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> B.
>> >>>>>>>>>>
>> >>>>>>>>>> On Mon, Aug 9, 2010 at 5:09 PM, Jan Lehnardt <jan@apache.org>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>> Hi All,
>> >>>>>>>>>>>
>> >>>>>>>>>>> Filipe jumped in to start working on the recovery tool, but he
>> >>>>>>>>>>> isn't done yet.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Here's the current patch:
>> >>>>>>>>>>>
>> >>>>>>>>>>> http://www.friendpaste.com/4uMngrym4r7Zz4R0ThSHbz
>> >>>>>>>>>>>
>> >>>>>>>>>>> it is not done and very early, but any help on this is greatly
>> >>>>>>>>>>> appreciated.
>> >>>>>>>>>>>
>> >>>>>>>>>>> The current state is (in Filipe's words):
>> >>>>>>>>>>> - i can detect that a file needs repair
>> >>>>>>>>>>> - and get the last btree roots from it
>> >>>>>>>>>>> - "only" missing: get last db seq num
>> >>>>>>>>>>> - write new header
>> >>>>>>>>>>> - and deal with the local docs btree (if exists)
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thanks!
>> >>>>>>>>>>> Jan
>> >>>>>>>>>>> --
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> Filipe David Manana,
>> >>>>>>>>> fdmanana@apache.org
>> >>>>>>>>>
>> >>>>>>>>> "Reasonable men adapt themselves to the world.
>> >>>>>>>>> Unreasonable men adapt the world to themselves.
>> >>>>>>>>> That's why all progress depends on unreasonable men."
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Filipe David Manana,
>> >>>>>>> fdmanana@apache.org
>> >>>>>>>
>> >>>>>>> "Reasonable men adapt themselves to the world.
>> >>>>>>> Unreasonable men adapt the world to themselves.
>> >>>>>>> That's why all progress depends on unreasonable men."
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>
>> >
>>
>>
>
