incubator-couchdb-dev mailing list archives

From Mikeal Rogers <mikeal.rog...@gmail.com>
Subject Re: data recovery tool progress
Date Tue, 10 Aug 2010 08:09:40 GMT
I think I found a bug in the current lost+found repair.

I've been running it against the testwritesdb and it's stuck in a state that
never finishes.

It's still spitting out these lines:

[info] [<0.32.0>] writing 1001 updates to lost+found/testwritesdb

Most are 1001, but there are also other random values: 452, 866, etc.

But the file size and dbinfo haven't budged in over 30 minutes. The size is
stuck at 34300002, while the original db file is 54857478.

This database only has one document in it that isn't "lost" so if it's
finding *any* new docs it should be writing them.

I also started another job to recover a production db that is quite large
(500 MB), with the missing data from a week or so back. This has been running
for 2 hours and still hasn't output anything or created the lost+found db,
so I can only assume it is in the same state.

Both machines are still churning at 100% CPU.

-Mikeal


On Mon, Aug 9, 2010 at 11:26 PM, Adam Kocoloski <kocolosk@apache.org> wrote:

> With Randall's help we hooked the new node scanner up to the lost+found DB
> generator.  It seems to work well enough for small DBs; for large DBs with
> lots of missing nodes the O(N^2) complexity of the problem catches up to the
> code and generating the lost+found DB takes quite some time.  Mikeal is
> running tests tonight.  The algo appears pretty CPU-limited, so a little
> parallelization may be warranted.
>
> http://github.com/kocolosk/couchdb/tree/db_repair
>
> Adam
>
> (I sent this previous update to myself instead of the list, so I'll forward
> it here ...)
>
> On Aug 10, 2010, at 12:01 AM, Adam Kocoloski wrote:
>
> > On Aug 9, 2010, at 10:10 PM, Adam Kocoloski wrote:
> >
> >> Right, make_lost_and_found still relies on code which reads through
> >> couch_file one byte at a time; that's the cause of the slowness.  The newer
> >> scanner will improve that pretty dramatically, and we can tune it further by
> >> increasing the length of the pattern that we match when looking for
> >> kp/kv_node terms in the files, at the expense of some extra complexity
> >> dealing with the block prefixes (currently it does a 1-byte match, which as
> >> I understand it cannot be split across blocks).
> >
> > The scanner now looks for a 7 byte match, unless it is within 6 bytes of
> > a block boundary, in which case it looks for the longest possible match at
> > that position.  The more specific match condition greatly reduces the # of
> > calls to couch_file, and thus boosts the throughput.  On my laptop it can
> > scan the testwritesdb.couch from Mikeal's couchtest repo (52 MB) in 18
> > seconds.
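> >
> > To illustrate the shape of it only (this is not the actual branch code;
> > candidate_offsets/3 is a made-up name and Pattern stands in for the
> > kp/kv_node byte prefix), the scan boils down to matching a multi-byte
> > pattern inside a chunk we already have in memory:
> >
> >     %% Offsets in the file where a candidate node term may start.
> >     candidate_offsets(Chunk, ChunkOffset, Pattern) ->
> >         [ChunkOffset + Pos || {Pos, _Len} <- binary:matches(Chunk, Pattern)].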
> >
> >> Regarding the file_corruption error on the larger file, I think this is
> >> something we will just naturally trigger when we take a guess that random
> >> positions in a file are actually the beginning of a term.  I think our best
> >> recourse here is to return {error, file_corruption} from couch_file but
> >> leave the gen_server up and running instead of terminating it.  That way the
> >> repair code can ignore the error and keep moving without having to reopen
> >> the file.
> >
> > I committed this change (to my db_repair branch) after consulting with
> > Chris.  The longer match condition makes these spurious file_corruption
> > triggers much less likely, but I think it's still a good thing not to crash
> > the server when they happen.
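> >
> > With that change the repair code can treat a bad guess as a miss rather
> > than a crash, along these lines (a sketch only, not the branch code;
> > try_candidate/2 is a made-up name):
> >
> >     %% Probe a candidate offset; skip it if the term there isn't a btree
> >     %% node or if couch_file reports {error, file_corruption}.
> >     try_candidate(Fd, Pos) ->
> >         case couch_file:pread_term(Fd, Pos) of
> >             {ok, {kv_node, _} = Node} -> {ok, Node};
> >             {ok, {kp_node, _} = Node} -> {ok, Node};
> >             {error, file_corruption}  -> skip;
> >             {ok, _NotANode}           -> skip
> >         end.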
> >
> >> Next steps as I understand them - Randall is working on integrating the
> >> in-memory scanner into Volker's code that finds all the dangling by_id
> >> nodes.  I'm working on making sure that the scanner identifies bt node
> >> candidates which span block prefixes, and on improving its pattern-matching.
> >
> > Latest from my end
> > http://github.com/kocolosk/couchdb/tree/db_repair
> >
> >>
> >> Adam
> >>
> >> On Aug 9, 2010, at 9:50 PM, Mikeal Rogers wrote:
> >>
> >>> I pulled down the latest code from Adam's branch @
> >>> 7080ff72baa329cf6c4be2a79e71a41f744ed93b.
> >>>
> >>> Running timer:tc(couch_db_repair, make_lost_and_found, ["multi_conflict"]).
> >>> on a database with 200 lost updates spanning 200 restarts (
> >>> http://github.com/mikeal/couchtest/blob/master/multi_conflict.couch )
> >>> took about 101 seconds.
> >>>
> >>> I tried running against a larger database (
> >>> http://github.com/mikeal/couchtest/blob/master/testwritesdb.couch )
> >>> and I got this exception:
> >>>
> >>> http://gist.github.com/516491
> >>>
> >>> -Mikeal
> >>>
> >>>
> >>>
> >>> On Mon, Aug 9, 2010 at 6:09 PM, Randall Leeds <randall.leeds@gmail.com> wrote:
> >>>
> >>>> Summing up what went on in IRC for those who were absent.
> >>>>
> >>>> The latest progress is on Adam's branch at
> >>>> http://github.com/kocolosk/couchdb/tree/db_repair
> >>>>
> >>>> couch_db_repair:make_lost_and_found/1 attempts to create a new
> >>>> lost+found/DbName database into which it merges all nodes that are not
> >>>> accessible from anywhere (i.e., from any other node found in a full file
> >>>> scan, or from any header pointer).
> >>>>
> >>>> Currently, make_lost_and_found uses Volker's repair (from
> >>>> couch_db_repair_b module, also in Adam's branch).
> >>>> Adam found that the bottleneck was couch_file calls and that the
> >>>> repair process was taking a very long time, so he added
> >>>> couch_db_repair:find_nodes_quickly/1, which reads 1MB chunks as binaries
> >>>> and processes them to find nodes instead of scanning back one byte at a
> >>>> time. It is currently not hooked up to the repair mechanism.
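> >>>>
> >>>> Roughly, the chunked scan amounts to the sketch below (illustrative
> >>>> only, not the code in the branch; pattern/0 and candidate_offsets/3 are
> >>>> made-up helpers, and matches that straddle a chunk boundary are ignored
> >>>> here, which the real code has to deal with):
> >>>>
> >>>>     %% Walk the file in 1MB slabs and scan each slab in memory for
> >>>>     %% candidate btree nodes, instead of one read per byte.
> >>>>     scan_file(Path) ->
> >>>>         {ok, Fd} = file:open(Path, [read, binary, raw]),
> >>>>         Size = filelib:file_size(Path),
> >>>>         try scan_file(Fd, 0, Size, []) after file:close(Fd) end.
> >>>>
> >>>>     scan_file(_Fd, Pos, Size, Acc) when Pos >= Size ->
> >>>>         lists:append(lists:reverse(Acc));
> >>>>     scan_file(Fd, Pos, Size, Acc) ->
> >>>>         Len = erlang:min(1048576, Size - Pos),
> >>>>         {ok, Chunk} = file:pread(Fd, Pos, Len),
> >>>>         Offsets = candidate_offsets(Chunk, Pos, pattern()),
> >>>>         scan_file(Fd, Pos + Len, Size, [Offsets | Acc]).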
> >>>>
> >>>> Making progress. Go team.
> >>>>
> >>>> On Mon, Aug 9, 2010 at 13:52, Mikeal Rogers <mikeal.rogers@gmail.com>
> >>>> wrote:
> >>>>> jchris suggested on IRC that I try a normal doc update and see if
> >>>>> that fixes it.
> >>>>>
> >>>>> It does. After a new doc was created the dbinfo doc count was back
> >>>>> to normal.
> >>>>>
> >>>>> -Mikeal
> >>>>>
> >>>>> On Mon, Aug 9, 2010 at 1:39 PM, Mikeal Rogers <mikeal.rogers@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Ok, I pulled down this code and tested against a database with a ton
> >>>>>> of missing writes right before a single restart.
> >>>>>>
> >>>>>> Before restart this was the database:
> >>>>>>
> >>>>>> {
> >>>>>> db_name: "testwritesdb"
> >>>>>> doc_count: 124969
> >>>>>> doc_del_count: 0
> >>>>>> update_seq: 124969
> >>>>>> purge_seq: 0
> >>>>>> compact_running: false
> >>>>>> disk_size: 54857478
> >>>>>> instance_start_time: "1281384140058211"
> >>>>>> disk_format_version: 5
> >>>>>> }
> >>>>>>
> >>>>>> After restart it was this:
> >>>>>>
> >>>>>> {
> >>>>>> db_name: "testwritesdb"
> >>>>>> doc_count: 1
> >>>>>> doc_del_count: 0
> >>>>>> update_seq: 1
> >>>>>> purge_seq: 0
> >>>>>> compact_running: false
> >>>>>> disk_size: 54857478
> >>>>>> instance_start_time: "1281384593876026"
> >>>>>> disk_format_version: 5
> >>>>>> }
> >>>>>>
> >>>>>> After repair, it's this:
> >>>>>>
> >>>>>> {
> >>>>>> db_name: "testwritesdb"
> >>>>>> doc_count: 1
> >>>>>> doc_del_count: 0
> >>>>>> update_seq: 124969
> >>>>>> purge_seq: 0
> >>>>>> compact_running: false
> >>>>>> disk_size: 54857820
> >>>>>> instance_start_time: "1281385990193289"
> >>>>>> disk_format_version: 5
> >>>>>> committed_update_seq: 124969
> >>>>>> }
> >>>>>>
> >>>>>> All the sequences are there and hitting _all_docs shows all the
> >>>>>> documents, so why is the doc_count only 1 in the dbinfo?
> >>>>>>
> >>>>>> -Mikeal
> >>>>>>
> >>>>>> On Mon, Aug 9, 2010 at 11:53 AM, Filipe David Manana <fdmanana@apache.org> wrote:
> >>>>>>
> >>>>>>> For the record (and people not on IRC), the code at:
> >>>>>>>
> >>>>>>> http://github.com/fdmanana/couchdb/commits/db_repair
> >>>>>>>
> >>>>>>> is working for at least simple cases. Use
> >>>>>>> couch_db_repair:repair(DbNameAsString).
> >>>>>>> There's one TODO: update the reduce values for the by_seq and by_id
> >>>>>>> BTrees.
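> >>>>>>>
> >>>>>>> e.g. from a shell attached to the node (the db name here is just an
> >>>>>>> example):
> >>>>>>>
> >>>>>>>     1> couch_db_repair:repair("testwritesdb").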
> >>>>>>>
> >>>>>>> If anyone wants to give some help on this, you're welcome.
> >>>>>>>
> >>>>>>> On Mon, Aug 9, 2010 at 6:12 PM, Mikeal Rogers <mikeal.rogers@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> I'm starting to create a bunch of test db files that expose this
> >>>>>>>> bug under different conditions like multiple restarts, across
> >>>>>>>> compaction, variances in updates that might cause conflicts, etc.
> >>>>>>>>
> >>>>>>>> http://github.com/mikeal/couchtest
> >>>>>>>>
> >>>>>>>> The README outlines what was done to the db's and what needs to be
> >>>>>>>> recovered.
> >>>>>>>>
> >>>>>>>> -Mikeal
> >>>>>>>>
> >>>>>>>> On Mon, Aug 9, 2010 at 9:33 AM, Filipe David Manana <fdmanana@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>> On Mon, Aug 9, 2010 at 5:22 PM, Robert Newson <robert.newson@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Doesn't this bit:
> >>>>>>>>>>
> >>>>>>>>>> -        Db#db{waiting_delayed_commit=nil};
> >>>>>>>>>> +        Db;
> >>>>>>>>>> +        % Db#db{waiting_delayed_commit=nil};
> >>>>>>>>>>
> >>>>>>>>>> revert the bug fix?
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> That's intentional, for my local testing.
> >>>>>>>>> That patch obviously isn't anything close to final; it's still too
> >>>>>>>>> experimental.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> B.
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Aug 9, 2010 at 5:09 PM, Jan Lehnardt <jan@apache.org> wrote:
> >>>>>>>>>>> Hi All,
> >>>>>>>>>>>
> >>>>>>>>>>> Filipe jumped in to start working on the recovery tool, but he
> >>>>>>>>>>> isn't done yet.
> >>>>>>>>>>>
> >>>>>>>>>>> Here's the current patch:
> >>>>>>>>>>>
> >>>>>>>>>>> http://www.friendpaste.com/4uMngrym4r7Zz4R0ThSHbz
> >>>>>>>>>>>
> >>>>>>>>>>> it is not done and very early, but any help on this is greatly
> >>>>>>>>>>> appreciated.
> >>>>>>>>>>>
> >>>>>>>>>>> The current state is (in Filipe's words):
> >>>>>>>>>>> - i can detect that a file needs repair
> >>>>>>>>>>> - and get the last btree roots from it
> >>>>>>>>>>> - "only" missing: get last db seq num
> >>>>>>>>>>> - write new header
> >>>>>>>>>>> - and deal with the local docs btree (if it exists)
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks!
> >>>>>>>>>>> Jan
> >>>>>>>>>>> --
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Filipe David Manana,
> >>>>>>>>> fdmanana@apache.org
> >>>>>>>>>
> >>>>>>>>> "Reasonable men adapt themselves to the world.
> >>>>>>>>> Unreasonable men adapt the world to themselves.
> >>>>>>>>> That's why all progress depends on unreasonable men."
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Filipe David Manana,
> >>>>>>> fdmanana@apache.org
> >>>>>>>
> >>>>>>> "Reasonable men adapt themselves to the world.
> >>>>>>> Unreasonable men adapt the world to themselves.
> >>>>>>> That's why all progress depends on unreasonable men."
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> >
>
>
