couchdb-dev mailing list archives

From "Paul Joseph Davis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-968) Duplicated IDs in _all_docs
Date Tue, 30 Nov 2010 06:23:11 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965114#action_12965114
] 

Paul Joseph Davis commented on COUCHDB-968:
-------------------------------------------

Sorry for the delay. I had a flight cancelled, got rerouted, and didn't make it home until
just now.

I'm not sure I quite follow what you mean by uncompacted here. Post-compaction, when we see
the issue in _all_docs, I would expect the duplicates to all have the same update_seq.
Pre-compaction, in _changes, I would expect the same _revision, because it's just iterating
the by_seqid_btree and then displaying the update_seq from the actual #full_doc_info. I'm
guessing on both counts, though.

As Bob Dionne noted in #couchdb, it's not entirely clear where the actual bug is. Right now
it's a combination of three things. First, couch_key_tree:stem kinda sorta fails when merging
two revision lists that exceed the rev_limit setting. Once that fails, we hit another issue
that results in two entries in the by_seqid_btree. Finally, compaction copies multiple
docs to the actual by_docid_btree.

After musing on it during the copious amounts of queueing I managed to accomplish today, I
think that we should treat them as three bugs right now. My proposed fixes are as follows:

1. Fix couch_key_tree:stem so that it takes into account when the input write has a suffix
that is a prefix of an existing edit path. This would avoid the rewrite that fixes everything.
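
The condition in item 1 can be sketched as follows. This is a minimal illustration in Python,
not CouchDB's Erlang code: it assumes an edit path is a plain list of revision ids ordered
oldest to newest, and the function name is hypothetical.

```python
def suffix_prefix_overlap(incoming, existing):
    """Return the largest k such that the last k revisions of `incoming`
    equal the first k revisions of `existing` (0 if there is no overlap).

    Both arguments are lists of revision ids, oldest first. This list
    representation is an assumption made for illustration only.
    """
    max_k = min(len(incoming), len(existing))
    # Try the longest possible overlap first, then shrink.
    for k in range(max_k, 0, -1):
        if incoming[-k:] == existing[:k]:
            return k
    return 0


if __name__ == "__main__":
    # The incoming write ends with the two revisions the existing path starts with.
    print(suffix_prefix_overlap(["1-a", "2-b", "3-c"], ["2-b", "3-c", "4-d"]))
```

A non-zero result would mean the incoming path can be joined onto the existing one rather than
being treated as a separate branch.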

2. We need to figure out a way to fix the breakage of the update_seq. It's a bit nebulous
whether this is an actual bug, as the solution to #1 would fix all known occurrences of it.
I think the proper fix would be to revisit couch_db_updater:merge_rev_trees and figure out a
better way of picking the new update_seq (which would basically need to detect whether an
edit leaf was changed and, only if so, update the update_seq).
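
The rule proposed in item 2 might look roughly like this. It is a Python sketch under loud
assumptions, not what couch_db_updater:merge_rev_trees does: the leaves of a revision tree are
modelled as a set of revision ids, and the function and parameter names are hypothetical.

```python
def pick_update_seq(old_leaves, new_leaves, old_seq, next_seq):
    """Keep the document's old update_seq unless the merge changed a leaf.

    old_leaves / new_leaves: iterables of leaf revision ids before and
    after the merge (an assumed representation, for illustration only).
    old_seq: the update_seq the document currently has.
    next_seq: the update_seq it would get if it counts as updated.
    """
    if set(old_leaves) == set(new_leaves):
        # No edit leaf was added or removed, so the document is unchanged.
        return old_seq
    return next_seq
```

Under this rule a merge that leaves the leaves untouched would not produce a second sequence
entry for the same document.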

3. Our btree implementation should probably check harder for the possibility of adding duplicate
keys. The basic bug is that it's a possibility in a single call to query_modify. A simple solution
that I've implemented (that would impact all calls to query_modify) would be to check the
input list of actions for duplicates. I.e., just iterate over the Actions list and find duplicate
{Action, Key, _Value} tuples (i.e., ignore differing values). Alternatively, a check deep down
in modify_kvnode could discard Action/Key pairs that are greater than the last entry in ResultNode,
thereby selecting one of the actions semi-randomly (or alternatively, throw an error when
not). I think technically, both are O(N), with N the size of the list of Actions that were
requested.
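
The first option in item 3 could be sketched like this. It is a Python illustration rather than
the Erlang change described above: actions are assumed to be (action, key, value) tuples with
hashable keys, and the function name is hypothetical.

```python
def find_duplicate_actions(actions):
    """Return the (action, key) pairs that appear more than once.

    `actions` is a list of (action, key, value) tuples. Values are ignored
    when comparing, so two inserts of the same key with different values
    still count as duplicates. One pass with a set gives expected O(N) time.
    """
    seen = set()
    duplicates = []
    for action, key, _value in actions:
        if (action, key) in seen:
            duplicates.append((action, key))
        else:
            seen.add((action, key))
    return duplicates


if __name__ == "__main__":
    actions = [
        ("insert", "9999", {"rev": "a"}),
        ("insert", "9999", {"rev": "b"}),  # same action and key, different value
        ("remove", "9997", None),
    ]
    print(find_duplicate_actions(actions))
```

A caller could reject the whole batch when the returned list is non-empty, or drop the repeated
entries before the tree is modified. Which of the two is right is left open here.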

That is all. I'll look more tomorrow. Right now it's time for beer and a bit of zoning out
in front of the telly before I pass out.

> Duplicated IDs in _all_docs
> ---------------------------
>
>                 Key: COUCHDB-968
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-968
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.10.1, 0.10.2, 0.11.1, 0.11.2, 1.0, 1.0.1, 1.0.2
>         Environment: Ubuntu 10.04.
>            Reporter: Sebastian Cohnen
>            Priority: Blocker
>
> We have a database, which is causing serious trouble with compaction and replication
> (huge memory and cpu usage, often causing couchdb to crash b/c all system memory is exhausted).
> Yesterday we discovered that db/_all_docs is reporting duplicated IDs (see [1]). Until a few
> minutes ago we thought that there are only few duplicates but today I took a closer look and
> I found 10 IDs which sum up to a total of 922 duplicates. Some of them have only 1 duplicate,
> others have hundreds.
> Some facts about the database in question:
> * ~13k documents, with 3-5k revs each
> * all duplicated documents are in conflict (with 1 up to 14 conflicts)
> * compaction is run on a daily basis
> * several thousand updates per hour
> * multi-master setup with pull replication from each other
> * delayed_commits=false on all nodes
> * used couchdb versions 1.0.0 and 1.0.x (*)
> Unfortunately the database's contents are confidential and I'm not allowed to publish
> them.
> [1]: Part of http://localhost:5984/DBNAME/_all_docs
> ...
> {"id":"9997","key":"9997","value":{"rev":"6096-603c68c1fa90ac3f56cf53771337ac9f"}},
> {"id":"9999","key":"9999","value":{"rev":"6097-3c873ccf6875ff3c4e2c6fa264c6a180"}},
> {"id":"9999","key":"9999","value":{"rev":"6097-3c873ccf6875ff3c4e2c6fa264c6a180"}},
> ...
> [*]
> There were two (old) servers (1.0.0) in production (already having the replication and
> compaction issues). Then two servers (1.0.x) were added and replication was set up to bring
> them in sync with the old production servers since the two new servers were meant to replace
> the old ones (to update node.js application code among other things).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

