couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: view response with duplicate id's
Date Thu, 07 Oct 2010 20:33:32 GMT
If you're OK posting it somewhere and it's not too big, that might be
the easiest way to debug this.
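
Until then, one way to narrow this down might be to check a saved view
response for repeated rows. The following is only a minimal sketch: the
function name find_duplicate_rows is made up for illustration, and it
assumes the map function emits at most one row per document (as the views
quoted below do), so any (id, key) pair that shows up more than once is
suspicious.

```python
import json
from collections import Counter


def find_duplicate_rows(view_response):
    """Return {(doc_id, key_as_json): count} for pairs seen more than once.

    view_response is the parsed JSON body of a view request, i.e. a dict
    with a "rows" list whose items carry "id" and "key".
    """
    counts = Counter(
        # Keys can be arrays or objects, which are not hashable in Python,
        # so each key is serialized to a canonical JSON string first.
        (row["id"], json.dumps(row["key"], sort_keys=True))
        for row in view_response["rows"]
    )
    return {pair: n for pair, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical sample data, shaped like a view response.
    sample = {
        "total_rows": 3,
        "offset": 0,
        "rows": [
            {"id": "q_a", "key": 10, "value": 1},
            {"id": "q_a", "key": 10, "value": 2},
            {"id": "q_b", "key": 2, "value": 3},
        ],
    }
    for (doc_id, key), n in find_duplicate_rows(sample).items():
        print(doc_id, key, n)
```

Feeding it the parsed JSON of a saved view response (for example via
json.load) would list the affected ids and how often each one repeats.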

On Thu, Oct 7, 2010 at 3:01 PM, Alexey Loshkarev <elf2001@gmail.com> wrote:
> Can't reproduce it with this script.
> If I send you a copy of my buggy database, will it help to improve CouchDB?
>
> 2010/10/7 Paul Davis <paul.joseph.davis@gmail.com>:
>> Alexey,
>>
>> I tried writing a script that'd hammer a single document and make
>> random view requests to see if I could reproduce the issue. Currently
>> the test doc is on revision 6,000 or so and I've not reproduced your
>> issue. I've included the script below, can you try it or see anything
>> that I should add to try and get closer to your situation?
>>
>> Paul
>>
>> #! /usr/bin/env python
>>
>> import random
>> import couchdbkit
>>
>> def main():
>>    server = couchdbkit.Server("http://127.0.0.1:5984/")
>>    db = server.get_or_create_db("foo")
>>
>>    # Create the design doc with a single map view if it doesn't exist yet.
>>    ddocid = "_design/baz"
>>    if ddocid not in db:
>>        db[ddocid] = {"views": {"foo": {"map": """
>>            function(doc) {emit(doc._id, doc.value);}
>>        """}}}
>>
>>    # The view should hold at most one row (for the single test doc).
>>    assert 0 <= len(db.view("baz/foo")) <= 1
>>
>>    docid = "foo"
>>
>>    # Load the test doc, or create it on the first run.
>>    if docid in db:
>>        doc = db[docid]
>>    else:
>>        doc = {"_id": "foo", "value": 1}
>>        db[docid] = doc
>>
>>    # Hammer the doc with updates; after roughly a third of them,
>>    # query the view and check that it still returns exactly one row.
>>    for i in range(1200):
>>        db[docid] = doc
>>        doc = db[docid]
>>        if random.random() < 0.33:
>>            assert len(db.view("baz/foo")) == 1
>>
>> if __name__ == '__main__':
>>    main()
>>
>>
>> On Thu, Oct 7, 2010 at 1:46 PM, Alexey Loshkarev <elf2001@gmail.com> wrote:
>>> If it helps...
>>>
>>> These q_* documents are a kind of state data. They change very
>>> frequently.
>>> I have 12 q_* documents and they may change 10-30 times per minute.
>>> Maybe there is a race condition in CouchDB's view creation?
>>>
>>>
>>> 2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
>>>> I just tried moving the view function to a separate design doc, with no
>>>> success: there are still duplicates (with the same revision) in the view response.
>>>>
>>>>
>>>> 2010/10/7 Paul Davis <paul.joseph.davis@gmail.com>:
>>>>> Alexey,
>>>>>
>>>>> Can you show the other views you have in your design doc? Or
>>>>> alternatively, try moving this view to its own design doc?
>>>>>
>>>>> Paul
>>>>>
>>>>> On Thu, Oct 7, 2010 at 1:07 PM, Alexey Loshkarev <elf2001@gmail.com> wrote:
>>>>>> The same problem appeared again.
>>>>>> What was done up to yesterday:
>>>>>> 1. Created a new database on node2
>>>>>> 2. Replicated from node1 to node2
>>>>>> 3. Checked: _all_docs returns only unique rows, and queue/all returns only
>>>>>> unique rows
>>>>>>
>>>>>> After a few hours of stable work, CouchDB produced duplicates again.
>>>>>> This time there are no duplicate documents (_all_docs has only unique rows),
>>>>>> but the view response has duplicates.
>>>>>> Removing the view index (between CouchDB restarts) doesn't help. CouchDB
>>>>>> consistently produces duplicates in the view.
>>>>>>
>>>>>> View function:
>>>>>> function(doc) {
>>>>>>  if (doc.type == "queue") {
>>>>>>    log("BUG TEST id:" + doc._id + ", rev:" + doc._rev);
>>>>>>    emit(doc.ordering, doc);
>>>>>>  }
>>>>>> }
>>>>>>
>>>>>> Response:
>>>>>> $ curl http://localhost:5984/exhaust/_design/queues/_view/all
>>>>>> {"total_rows":15,"offset":0,"rows":[
>>>>>> ....
>>>>>> {"id":"q_nikolaevka","key":10,"value":{"_id":"q_nikolaevka","_rev":"16181-ae5e5cca96b0491f266bc97c37a88f47","name":"\u041d\u0418\u041a\u041e\u041b\u0410\u0415\u0412\u041a\u0410","default":false,"cars":[],"drivers":[],"ordering":10,"type":"queue"}},
>>>>>> {"id":"q_nikolaevka","key":10,"value":{"_id":"q_nikolaevka","_rev":"16176-3a7bbd128bfb257fd746dfd80769b6fc","name":"\u041d\u0418\u041a\u041e\u041b\u0410\u0415\u0412\u041a\u0410","default":false,"cars":[],"ordering":10,"type":"queue","drivers":[]}},
>>>>>> ...
>>>>>> ]}
>>>>>>
>>>>>>
>>>>>> See that? Two rows for the same document, with different revisions!
>>>>>>
>>>>>> Also, couch.log contains 3 (!) calls of this function for one document:
>>>>>> [Thu, 07 Oct 2010 16:53:51 GMT] [info] [<0.180.0>] OS Process
>>>>>> #Port<0.2132> Log :: BUG TEST id:q_nikolaevka,
>>>>>> rev:16175-11cedeb529991cf60193d436d1a567e9
>>>>>> [Thu, 07 Oct 2010 16:53:51 GMT] [info] [<0.180.0>] OS Process
>>>>>> #Port<0.2132> Log :: BUG TEST id:q_nikolaevka,
>>>>>> rev:16176-3a7bbd128bfb257fd746dfd80769b6fc
>>>>>> [Thu, 07 Oct 2010 16:53:51 GMT] [info] [<0.180.0>] OS Process
>>>>>> #Port<0.2132> Log :: BUG TEST id:q_nikolaevka,
>>>>>> rev:16181-ae5e5cca96b0491f266bc97c37a88f47
>>>>>>
>>>>>>
>>>>>> Then I ran compaction to eliminate old revisions.
>>>>>> And now I have 3 duplicate rows for q_nikolaevka with the same revision!
>>>>>>
>>>>>> I think I found the problem. This document has 1000 revisions in the database,
>>>>>> and the wiki (http://wiki.apache.org/couchdb/HTTP_database_API)
>>>>>> describes a default maximum of 1000 revisions per document.
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
>>>>>>> Haha!
>>>>>>> A fresh replication (into a new database) eliminated the duplicates, and I can
>>>>>>> sleep peacefully.
>>>>>>>
>>>>>>>
>>>>>>> 2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
>>>>>>>> P.S. dmesg doesn't show any hardware problems (bad blocks, segfaults
>>>>>>>> and so on).
>>>>>>>> P.P.S. I think I migrated 0.10.1 -> 1.0.1 without database
>>>>>>>> replication, so it may be my fault.
>>>>>>>>
>>>>>>>> 2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
>>>>>>>>> I think this is database file corruption. Querying _all_docs returns
>>>>>>>>> a lot of duplicates (about 3,000 duplicates in a database of ~350,000
>>>>>>>>> documents).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [12:17:48 root@node2 (~)]# curl http://localhost:5984/exhaust/_all_docs > all_docs
>>>>>>>>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>>>>>>>>                                  Dload  Upload   Total   Spent    Left  Speed
>>>>>>>>> 100 37.7M    0 37.7M    0     0  1210k      0 --:--:--  0:00:31 --:--:--  943k
>>>>>>>>> [12:18:23 root@node2 (~)]# wc -l all_docs
>>>>>>>>> 325102 all_docs
>>>>>>>>> [12:18:27 root@node2 (~)]# uniq all_docs |wc -l
>>>>>>>>> 322924
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Node1 has duplicates too, but far fewer:
>>>>>>>>> [12:18:48 root@node1 (~)]# curl http://localhost:5984/exhaust/_all_docs > all_docs
>>>>>>>>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>>>>>>>>                                  Dload  Upload   Total   Spent    Left  Speed
>>>>>>>>> 100 38.6M    0 38.6M    0     0   693k      0 --:--:--  0:00:57 --:--:-- 55809
>>>>>>>>> [12:19:57 root@node1 (~)]# wc -l all_docs
>>>>>>>>> 332714 all_docs
>>>>>>>>> [12:20:54 root@node1 (~)]# uniq all_docs |wc -l
>>>>>>>>> 332523
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
>>>>>>>>>> I can't say what specifically it may be, so let's dive into the history of
>>>>>>>>>> these database(s).
>>>>>>>>>>
>>>>>>>>>> At first (5-6 weeks ago) there was the node2 server with CouchDB 0.10.1.
>>>>>>>>>> It held a testing database. There were a lot of structural
>>>>>>>>>> changes, view updates and so on.
>>>>>>>>>> Then it became production and worked OK.
>>>>>>>>>> Then we realized we needed a backup, ideally an online backup (since we
>>>>>>>>>> have CouchDB, we can do this).
>>>>>>>>>> So the node1 server appeared, with CouchDB 1.0.1. I replicated node2
>>>>>>>>>> to node1, then started continuous replication node1 -> node2 and
>>>>>>>>>> node2 -> node1. All clients worked with node2 only. Everything worked
>>>>>>>>>> fine for about a month.
>>>>>>>>>> A few days ago we were at peak load, so I wanted to use node1 and
>>>>>>>>>> node2 simultaneously. This was done with round-robin DNS (host db
>>>>>>>>>> returns 2 different IPs: node1's IP and node2's IP). Everything worked
>>>>>>>>>> fine for about 5 minutes, then I got the first conflict (view queues/all
>>>>>>>>>> returned two identical documents: one the current version, the other a
>>>>>>>>>> conflicted revision, a document with the field _conflict=".....").
>>>>>>>>>> The document ID was q_tsentr.
>>>>>>>>>> As I don't have a conflict resolver yet, I resolved the conflict manually
>>>>>>>>>> by deleting the conflicted revision. I also disabled round-robin and moved
>>>>>>>>>> all load to node2 to avoid conflicts for a while, until I write a conflict
>>>>>>>>>> resolver.
>>>>>>>>>>
>>>>>>>>>> It worked OK (node1 and node2 in mutual replication, active load on
>>>>>>>>>> node2) until yesterday.
>>>>>>>>>> Yesterday an operator called to tell me he had duplicate data in the
>>>>>>>>>> program. At that point queues/all returned 1 duplicated document, the same
>>>>>>>>>> one as a few days before (id = q_tsentr). One row held the current document
>>>>>>>>>> version, the other row held an old revision with the field
>>>>>>>>>> _conflicted_revision="some old revision".
>>>>>>>>>>
>>>>>>>>>> I tried to delete this revision, without success. GET for
>>>>>>>>>> q_tsentr?rev="some old revision" returned a valid document. DELETE
>>>>>>>>>> q_tsentr?rev="some old revision" gave me a 409 error.
>>>>>>>>>> Here are log files (node2):
>>>>>>>>>>
>>>>>>>>>> [Wed, 06 Oct 2010 12:17:19 GMT] [info] [<0.7239.1462>] 10.0.0.41 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:17:30 GMT] [info] [<0.7245.1462>] 10.0.0.41 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:17:35 GMT] [info] [<0.7287.1462>] 10.0.0.41 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:17:43 GMT] [info] [<0.7345.1462>] 10.0.0.41 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:18:02 GMT] [info] [<0.7864.1462>] 10.0.0.41 - -
>>>>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 409
>>>>>>>>>> [Wed, 06 Oct 2010 12:18:29 GMT] [info] [<0.8331.1462>] 10.0.0.41 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:18:39 GMT] [info] [<0.8363.1462>] 10.0.0.41 - -
>>>>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 409
>>>>>>>>>> [Wed, 06 Oct 2010 12:38:19 GMT] [info] [<0.16765.1462>] 10.0.0.41 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:40:40 GMT] [info] [<0.17337.1462>] 10.0.0.41 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:40:45 GMT] [info] [<0.17344.1462>] 10.0.0.41 - -
>>>>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 404
>>>>>>>>>>
>>>>>>>>>> Logs at node1:
>>>>>>>>>>
>>>>>>>>>> [Wed, 06 Oct 2010 12:17:46 GMT] [info] [<0.25979.462>] 10.20.20.13 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:17:56 GMT] [info] [<0.26002.462>] 10.20.20.13 - -
>>>>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:21:25 GMT] [info] [<0.27133.462>] 10.20.20.13 - -
>>>>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=all 404
>>>>>>>>>> [Wed, 06 Oct 2010 12:21:49 GMT] [info] [<0.27179.462>] 10.20.20.13 - -
>>>>>>>>>> 'DELETE' /exhaust/q_tsentr?revs=true 404
>>>>>>>>>> [Wed, 06 Oct 2010 12:24:41 GMT] [info] [<0.28959.462>] 10.20.20.13 - -
>>>>>>>>>> 'DELETE' /exhaust/q_tsentr?revs=true 404
>>>>>>>>>> [Wed, 06 Oct 2010 12:38:07 GMT] [info] [<0.10362.463>] 10.20.20.13 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?revs=all 404
>>>>>>>>>> [Wed, 06 Oct 2010 12:38:23 GMT] [info] [<0.10534.463>] 10.20.20.13 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:40:25 GMT] [info] [<0.12014.463>] 10.20.20.13 - -
>>>>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>>>>> [Wed, 06 Oct 2010 12:40:33 GMT] [info] [<0.12109.463>] 10.20.20.13 - -
>>>>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 404
>>>>>>>>>>
>>>>>>>>>> So I deleted this document and created a new one (id: q_tsentr2).
>>>>>>>>>> It worked fine for about an hour.
>>>>>>>>>>
>>>>>>>>>> Node2 had an undeletable duplicate, so I moved all clients to node1. There
>>>>>>>>>> was no such problem there; the view response was correct.
>>>>>>>>>>
>>>>>>>>>> Then I tried to recover the database on node2. I stopped CouchDB, deleted
>>>>>>>>>> the view index files and started CouchDB again. Then I queried every view
>>>>>>>>>> to rebuild the indexes. At the end of this procedure, I saw duplicates of
>>>>>>>>>> identical rows (see the first message in this thread). Node1 has no such
>>>>>>>>>> problems, so I stopped replication, left the load on node1 and came to cry
>>>>>>>>>> on this mailing list.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2010/10/6 Paul Davis <paul.joseph.davis@gmail.com>:
>>>>>>>>>>> It was noted on IRC that I should give a bit more explanation.
>>>>>>>>>>>
>>>>>>>>>>> With the information that you've provided there are two possible
>>>>>>>>>>> explanations. Either your client code is not doing what you expect or
>>>>>>>>>>> you've triggered a really crazy bug in the view indexer that caused it
>>>>>>>>>>> to reindex a database without invalidating a view and not removing
>>>>>>>>>>> keys for docs when it reindexed.
>>>>>>>>>>>
>>>>>>>>>>> Given that no one has reported anything remotely like this and I can't
>>>>>>>>>>> immediately see a code path that would violate so many behaviours in
>>>>>>>>>>> the view updater, I'm leaning towards this being an issue in the
>>>>>>>>>>> client code.
>>>>>>>>>>>
>>>>>>>>>>> If there was something specific that changed since the view worked,
>>>>>>>>>>> that might illuminate what could cause this sort of behaviour if it is
>>>>>>>>>>> indeed a bug in CouchDB.
>>>>>>>>>>>
>>>>>>>>>>> HTH,
>>>>>>>>>>> Paul Davis
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Oct 6, 2010 at 12:24 PM, Alexey Loshkarev <elf2001@gmail.com> wrote:
>>>>>>>>>>>> I have this view function (map only, no reduce):
>>>>>>>>>>>>
>>>>>>>>>>>> function(doc) {
>>>>>>>>>>>>  if (doc.type == "queue") {
>>>>>>>>>>>>    emit(doc.ordering, doc.drivers);
>>>>>>>>>>>>  }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> It worked perfectly until yesterday, but today it started returning duplicates.
>>>>>>>>>>>> Example:
>>>>>>>>>>>> $ curl http://node2:5984/exhaust/_design/queues/_view/all
>>>>>>>>>>>>
>>>>>>>>>>>> {"total_rows":46,"offset":0,"rows":[
>>>>>>>>>>>> {"id":"q_mashinyi-v-gorode","key":0,"value":["d_mironets_ivan","d_smertin_ivan","d_kasyanenko_sergej","d_chabotar_aleksandr","d_martyinenko_yurij","d_krikunenko_aleksandr"]},
>>>>>>>>>>>> {"id":"q_mashinyi-v-gorode","key":0,"value":["d_mironets_ivan","d_smertin_ivan","d_kasyanenko_sergej","d_chabotar_aleksandr","d_martyinenko_yurij","d_krikunenko_aleksandr"]},
>>>>>>>>>>>> {"id":"q_mashinyi-v-gorode","key":0,"value":["d_mironets_ivan","d_smertin_ivan","d_kasyanenko_sergej","d_chabotar_aleksandr","d_martyinenko_yurij","d_krikunenko_aleksandr"]},
>>>>>>>>>>>> ......
>>>>>>>>>>>> {"id":"q_oblasnaya","key":2,"value":["d_kramarenko_viktor","d_skorodzievskij_eduard"]},
>>>>>>>>>>>> {"id":"q_oblasnaya","key":2,"value":["d_kramarenko_viktor","d_skorodzievskij_eduard"]},
>>>>>>>>>>>> {"id":"q_oblasnaya","key":2,"value":["d_kramarenko_viktor","d_skorodzievskij_eduard"]},
>>>>>>>>>>>> ........
>>>>>>>>>>>> {"id":"q_otstoj","key":11,"value":["d_gavrilenko_aleksandr","d_klishnev_sergej"]}
>>>>>>>>>>>> ]}
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I tried restarting the server, recreating the view (removing the view
>>>>>>>>>>>> index file), and compacting the view and the database. None of this
>>>>>>>>>>>> helped; it still returns duplicates.
>>>>>>>>>>>> What is happening? How can I avoid it in the future?
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> ----------------
>>>>>>>>>>>> Best regards
>>>>>>>>>>>> Alexey Loshkarev
>>>>>>>>>>>> mailto:elf2001@gmail.com
>>>>>>>>>>>>
>
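
A side note on the "uniq all_docs | wc -l" check quoted above: uniq only
collapses adjacent identical lines, so it relies on the output being
sorted. Below is a minimal sketch of an order-independent check. It
assumes the saved file has one row per line, as the wc -l counts above
suggest, and the helper name duplicate_lines is hypothetical.

```python
from collections import Counter


def duplicate_lines(lines):
    """Map each line that occurs more than once to its number of occurrences.

    Surrounding whitespace and a trailing comma are stripped first, so a row
    that happens to be the last one in the JSON array still compares equal.
    Blank lines are ignored.
    """
    counts = Counter(
        line.strip().rstrip(",")
        for line in lines
        if line.strip()
    )
    return {line: n for line, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical sample lines standing in for a saved _all_docs response.
    sample = ['{"id":"a"},\n', '{"id":"b"},\n', '{"id":"a"}\n']
    for line, n in duplicate_lines(sample).items():
        print(n, line)
```

With the file from the curl command above, this could be called as
duplicate_lines(open("all_docs")) to see which rows repeat.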
