incubator-couchdb-user mailing list archives

From: Alexey Loshkarev <elf2...@gmail.com>
Subject: Re: view response with duplicate id's
Date: Thu, 07 Oct 2010 17:46:05 GMT
If it helps...

These q_* documents hold a kind of state data. They change very
frequently: I have 12 q_* documents, and they may be updated 10-30
times per minute.
Could there be a race condition in CouchDB's view creation?
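
(In case it's useful for reproducing this: a quick way to see whether
those rapid updates are leaving unresolved conflicts behind is a
throwaway temporary view over doc._conflicts - a minimal sketch, using
the same database name and port as in the examples below:)

$ curl -X POST http://localhost:5984/exhaust/_temp_view \
       -H 'Content-Type: application/json' \
       -d '{"map": "function(doc) { if (doc._conflicts) { emit(doc._id, doc._conflicts); } }"}'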


2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
> I just tried moving the view function to a separate design doc, with no
> success - duplicates (with the same revision) in the view response.
>
>
> 2010/10/7 Paul Davis <paul.joseph.davis@gmail.com>:
>> Alexey,
>>
>> Can you show the other views you have in your design doc? Or
>> alternatively, try moving this view to its own design doc?
>>
>> Paul
>>
>> On Thu, Oct 7, 2010 at 1:07 PM, Alexey Loshkarev <elf2001@gmail.com> wrote:
>>> The same problem has appeared again.
>>> What was done as of yesterday:
>>> 1. Created a new database on node2
>>> 2. Replicated from node1 to node2
>>> 3. Checked: _all_docs returned only unique rows; queues/all returned
>>> only unique rows
>>>
>>> After a few hours of stable work, CouchDB produced duplicates again.
>>> This time there are no duplicate documents (_all_docs has only unique
>>> rows), but the view response has duplicates.
>>> Removing the view index (between CouchDB restarts) doesn't help;
>>> CouchDB consistently produces duplicates in the view.
>>>
>>> View function:
>>> function(doc) {
>>>  if (doc.type == "queue") {
>>>    log("BUG TEST id:" + doc._id + ", rev:" + doc._rev);
>>>    emit(doc.ordering, doc);
>>>  }
>>> }
>>>
>>> Response:
>>> $ curl http://localhost:5984/exhaust/_design/queues/_view/all
>>> {"total_rows":15,"offset":0,"rows":[
>>> ....
>>> {"id":"q_nikolaevka","key":10,"value":{"_id":"q_nikolaevka","_rev":"16181-ae5e5cca96b0491f266bc97c37a88f47","name":"\u041d\u0418\u041a\u041e\u041b\u0410\u0415\u0412\u041a\u0410","default":false,"cars":[],"drivers":[],"ordering":10,"type":"queue"}},
>>> {"id":"q_nikolaevka","key":10,"value":{"_id":"q_nikolaevka","_rev":"16176-3a7bbd128bfb257fd746dfd80769b6fc","name":"\u041d\u0418\u041a\u041e\u041b\u0410\u0415\u0412\u041a\u0410","default":false,"cars":[],"ordering":10,"type":"queue","drivers":[]}},
>>> ...
>>> ]}
>>>
>>>
>>> See that? Two rows for the same document, with different revisions!
>>>
>>> Also, couch.log shows 3 (!) calls of this function for one document:
>>> [Thu, 07 Oct 2010 16:53:51 GMT] [info] [<0.180.0>] OS Process #Port<0.2132> Log :: BUG TEST id:q_nikolaevka, rev:16175-11cedeb529991cf60193d436d1a567e9
>>> [Thu, 07 Oct 2010 16:53:51 GMT] [info] [<0.180.0>] OS Process #Port<0.2132> Log :: BUG TEST id:q_nikolaevka, rev:16176-3a7bbd128bfb257fd746dfd80769b6fc
>>> [Thu, 07 Oct 2010 16:53:51 GMT] [info] [<0.180.0>] OS Process #Port<0.2132> Log :: BUG TEST id:q_nikolaevka, rev:16181-ae5e5cca96b0491f266bc97c37a88f47
>>>
>>>
>>> Then I ran compaction to eliminate the old revisions.
>>> And now I have 3 duplicates of q_nikolaevka, all with the same revision!
>>>
>>> I think I found the problem. This document has 1000 revisions in the
>>> database, and here (http://wiki.apache.org/couchdb/HTTP_database_API)
>>> a default maximum of 1000 stored revisions per document is described.
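>>>
>>> (For reference, a minimal sketch: the per-database revision limit can
>>> be read and lowered via the _revs_limit endpoint, e.g.:)
>>>
>>> $ curl http://localhost:5984/exhaust/_revs_limit
>>> 1000
>>> $ curl -X PUT http://localhost:5984/exhaust/_revs_limit -d '500'
>>> {"ok":true}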
>>>
>>>
>>>
>>> 2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
>>>> Haha!
>>>> A fresh replication (into a new database) eliminated the duplicates,
>>>> so I can sleep soundly.
>>>>
>>>>
>>>> 2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
>>>>> P.S. dmesg doesn't show any hardware problems (bad blocks, segfaults
>>>>> and so on).
>>>>> P.P.S. I think I migrated 0.10.1 -> 1.0.1 without replicating the
>>>>> database, so it may be my fault.
>>>>>
>>>>> 2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
>>>>>> I think this is database file corruption. Querying _all_docs returns
>>>>>> a lot of duplicates (about 3,000 duplicates in a ~350,000-document
>>>>>> database).
>>>>>>
>>>>>>
>>>>>> [12:17:48 root@node2 (~)]# curl http://localhost:5984/exhaust/_all_docs > all_docs
>>>>>>  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>>>>>                                 Dload  Upload   Total   Spent    Left  Speed
>>>>>> 100 37.7M    0 37.7M    0     0  1210k      0 --:--:--  0:00:31 --:--:--  943k
>>>>>> [12:18:23 root@node2 (~)]# wc -l all_docs
>>>>>> 325102 all_docs
>>>>>> [12:18:27 root@node2 (~)]# uniq all_docs |wc -l
>>>>>> 322924
>>>>>>
>>>>>>
>>>>>> Node1 has duplicates too, but in a much smaller amount:
>>>>>> [12:18:48 root@node1 (~)]# curl http://localhost:5984/exhaust/_all_docs > all_docs
>>>>>>  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>>>>>                                 Dload  Upload   Total   Spent    Left  Speed
>>>>>> 100 38.6M    0 38.6M    0     0   693k      0 --:--:--  0:00:57 --:--:-- 55809
>>>>>> [12:19:57 root@node1 (~)]# wc -l all_docs
>>>>>> 332714 all_docs
>>>>>> [12:20:54 root@node1 (~)]# uniq all_docs |wc -l
>>>>>> 332523
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2010/10/7 Alexey Loshkarev <elf2001@gmail.com>:
>>>>>>> I can't say what specifically it may be, so let's dive into the
>>>>>>> history of this database (these databases).
>>>>>>>
>>>>>>> First (5-6 weeks ago) there was the node2 server with CouchDB 0.10.1.
>>>>>>> There was a testing database on it. There were a lot of structural
>>>>>>> changes, view updates and so on.
>>>>>>> Then it became production and worked OK.
>>>>>>> Then we realized we needed a backup - ideally an online backup (as
>>>>>>> we have CouchDB, we can do this).
>>>>>>> So the node1 server appeared, with CouchDB 1.0.1. I replicated node2
>>>>>>> to node1, then set up continuous replication node1 -> node2 and
>>>>>>> node2 -> node1. All clients worked with node2 only. Everything worked
>>>>>>> fine for about a month.
>>>>>>> A few days ago we were at peak load, so I wanted to use node1 and
>>>>>>> node2 simultaneously. This was done by DNS round-robin (the hostname
>>>>>>> db resolves to 2 different IPs - node1's IP and node2's IP).
>>>>>>> Everything worked fine for about 5 minutes, then I got the first
>>>>>>> conflict (the view queues/all returned two rows for the same
>>>>>>> document: one the actual version, the other a conflicted revision -
>>>>>>> a document with the field _conflicts="....."). The document ID was
>>>>>>> q_tsentr.
>>>>>>> As I didn't have a conflict resolver yet, I resolved the conflict
>>>>>>> manually by deleting the conflicted revision. I also disabled the
>>>>>>> round-robin and moved all load to node2 to avoid conflicts while I
>>>>>>> wrote a conflict resolver.
>>>>>>>
>>>>>>> It worked OK (node1 and node2 in mutual replication, active load
>>>>>>> on node2) until yesterday.
>>>>>>> Yesterday an operator called me to say he had duplicate data in the
>>>>>>> program. This time queues/all returned 1 duplicated document - the
>>>>>>> same one as a few days before (id = q_tsentr). One row held the
>>>>>>> actual document version, the other row held an old revision with the
>>>>>>> field _conflicts="some old revision".
>>>>>>>
>>>>>>> I tried to delete this revision, but without success. A GET for
>>>>>>> q_tsentr?rev="some old revision" returns a valid document; a DELETE
>>>>>>> of q_tsentr?rev="some old revision" gives me a 409 error.
>>>>>>> Here are the log files (node2):
>>>>>>>
>>>>>>> [Wed, 06 Oct 2010 12:17:19 GMT] [info] [<0.7239.1462>] 10.0.0.41 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:17:30 GMT] [info] [<0.7245.1462>] 10.0.0.41 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:17:35 GMT] [info] [<0.7287.1462>] 10.0.0.41 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:17:43 GMT] [info] [<0.7345.1462>] 10.0.0.41 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:18:02 GMT] [info] [<0.7864.1462>] 10.0.0.41 - - 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 409
>>>>>>> [Wed, 06 Oct 2010 12:18:29 GMT] [info] [<0.8331.1462>] 10.0.0.41 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:18:39 GMT] [info] [<0.8363.1462>] 10.0.0.41 - - 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 409
>>>>>>> [Wed, 06 Oct 2010 12:38:19 GMT] [info] [<0.16765.1462>] 10.0.0.41 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:40:40 GMT] [info] [<0.17337.1462>] 10.0.0.41 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:40:45 GMT] [info] [<0.17344.1462>] 10.0.0.41 - - 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 404
>>>>>>>
>>>>>>> Logs at node1:
>>>>>>>
>>>>>>> [Wed, 06 Oct 2010 12:17:46 GMT] [info] [<0.25979.462>] 10.20.20.13 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:17:56 GMT] [info] [<0.26002.462>] 10.20.20.13 - - 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:21:25 GMT] [info] [<0.27133.462>] 10.20.20.13 - - 'DELETE' /exhaust/q_tsentr?rev=all 404
>>>>>>> [Wed, 06 Oct 2010 12:21:49 GMT] [info] [<0.27179.462>] 10.20.20.13 - - 'DELETE' /exhaust/q_tsentr?revs=true 404
>>>>>>> [Wed, 06 Oct 2010 12:24:41 GMT] [info] [<0.28959.462>] 10.20.20.13 - - 'DELETE' /exhaust/q_tsentr?revs=true 404
>>>>>>> [Wed, 06 Oct 2010 12:38:07 GMT] [info] [<0.10362.463>] 10.20.20.13 - - 'GET' /exhaust/q_tsentr?revs=all 404
>>>>>>> [Wed, 06 Oct 2010 12:38:23 GMT] [info] [<0.10534.463>] 10.20.20.13 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:40:25 GMT] [info] [<0.12014.463>] 10.20.20.13 - - 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:40:33 GMT] [info] [<0.12109.463>] 10.20.20.13 - - 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 404
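>>>>>>>
>>>>>>> (Aside - a minimal sketch of the documented parameters for
>>>>>>> inspecting a document's revisions, in case it helps:)
>>>>>>>
>>>>>>> $ curl 'http://localhost:5984/exhaust/q_tsentr?revs_info=true'
>>>>>>> $ curl -H 'Accept: application/json' 'http://localhost:5984/exhaust/q_tsentr?open_revs=all'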
>>>>>>>
>>>>>>> So I deleted this document and created a new one (id = q_tsentr2).
>>>>>>> It worked fine for about an hour.
>>>>>>>
>>>>>>> Node2 still had the undeletable duplicate, so I moved all clients to
>>>>>>> node1. There was no such problem there; the view response was correct.
>>>>>>>
>>>>>>> Then I tried to recover the database at node2. I stopped CouchDB,
>>>>>>> deleted the view index files and started CouchDB again. Then I hit
>>>>>>> every view to recreate the indexes. At the end of this procedure I
>>>>>>> saw duplicates of identical rows (see the first letter in this
>>>>>>> thread). Node1 has no such problems, so I stopped replication, left
>>>>>>> the load on node1, and came to cry on this mailing list.
>>>>>>>
>>>>>>>
>>>>>>> 2010/10/6 Paul Davis <paul.joseph.davis@gmail.com>:
>>>>>>>> It was noted on IRC that I should give a bit more explanation.
>>>>>>>>
>>>>>>>> With the information that you've provided there are two possible
>>>>>>>> explanations. Either your client code is not doing what you expect, or
>>>>>>>> you've triggered a really crazy bug in the view indexer that caused it
>>>>>>>> to reindex a database without invalidating a view and not removing
>>>>>>>> keys for docs when it reindexed.
>>>>>>>>
>>>>>>>> Given that no one has reported anything remotely like this and I can't
>>>>>>>> immediately see a code path that would violate so many behaviours in
>>>>>>>> the view updater, I'm leaning towards this being an issue in the
>>>>>>>> client code.
>>>>>>>>
>>>>>>>> If there was something specific that changed since the view worked,
>>>>>>>> that might illuminate what could cause this sort of behaviour if it is
>>>>>>>> indeed a bug in CouchDB.
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>> Paul Davis
>>>>>>>>
>>>>>>>> On Wed, Oct 6, 2010 at 12:24 PM, Alexey Loshkarev <elf2001@gmail.com> wrote:
>>>>>>>>> I have the following view function (map only, no reduce):
>>>>>>>>>
>>>>>>>>> function(doc) {
>>>>>>>>>  if (doc.type == "queue") {
>>>>>>>>>    emit(doc.ordering, doc.drivers);
>>>>>>>>>  }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> It worked perfectly until yesterday, but today it started
>>>>>>>>> returning duplicates.
>>>>>>>>> Example:
>>>>>>>>> $ curl http://node2:5984/exhaust/_design/queues/_view/all
>>>>>>>>>
>>>>>>>>> {"total_rows":46,"offset":0,"rows":[
>>>>>>>>> {"id":"q_mashinyi-v-gorode","key":0,"value":["d_mironets_ivan","d_smertin_ivan","d_kasyanenko_sergej","d_chabotar_aleksandr","d_martyinenko_yurij","d_krikunenko_aleksandr"]},
>>>>>>>>> {"id":"q_mashinyi-v-gorode","key":0,"value":["d_mironets_ivan","d_smertin_ivan","d_kasyanenko_sergej","d_chabotar_aleksandr","d_martyinenko_yurij","d_krikunenko_aleksandr"]},
>>>>>>>>> {"id":"q_mashinyi-v-gorode","key":0,"value":["d_mironets_ivan","d_smertin_ivan","d_kasyanenko_sergej","d_chabotar_aleksandr","d_martyinenko_yurij","d_krikunenko_aleksandr"]},
>>>>>>>>> ......
>>>>>>>>> {"id":"q_oblasnaya","key":2,"value":["d_kramarenko_viktor","d_skorodzievskij_eduard"]},
>>>>>>>>> {"id":"q_oblasnaya","key":2,"value":["d_kramarenko_viktor","d_skorodzievskij_eduard"]},
>>>>>>>>> {"id":"q_oblasnaya","key":2,"value":["d_kramarenko_viktor","d_skorodzievskij_eduard"]},
>>>>>>>>> ........
>>>>>>>>> {"id":"q_otstoj","key":11,"value":["d_gavrilenko_aleksandr","d_klishnev_sergej"]}
>>>>>>>>> ]}
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I tried restarting the server, recreating the view (removing the
>>>>>>>>> view index file), and compacting the view and the database; none
>>>>>>>>> of this helps - it still returns duplicates.
>>>>>>>>> What happened? How can I avoid this in the future?
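>>>>>>>>>
>>>>>>>>> (For reference, the standard compaction and cleanup calls - a
>>>>>>>>> minimal sketch; the design document here is _design/queues, and
>>>>>>>>> _view_cleanup is an extra step not mentioned above:)
>>>>>>>>>
>>>>>>>>> $ curl -X POST -H 'Content-Type: application/json' http://node2:5984/exhaust/_compact
>>>>>>>>> $ curl -X POST -H 'Content-Type: application/json' http://node2:5984/exhaust/_compact/queues
>>>>>>>>> $ curl -X POST -H 'Content-Type: application/json' http://node2:5984/exhaust/_view_cleanup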
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ----------------
>>>>>>>>> Best regards
>>>>>>>>> Alexey Loshkarev
>>>>>>>>> mailto:elf2001@gmail.com
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> ----------------
>>>>>>> Best regards
>>>>>>> Alexey Loshkarev
>>>>>>> mailto:elf2001@gmail.com
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> ----------------
>>>>>> Best regards
>>>>>> Alexey Loshkarev
>>>>>> mailto:elf2001@gmail.com
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> ----------------
>>>>> Best regards
>>>>> Alexey Loshkarev
>>>>> mailto:elf2001@gmail.com
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> ----------------
>>>> Best regards
>>>> Alexey Loshkarev
>>>> mailto:elf2001@gmail.com
>>>>
>>>
>>>
>>>
>>> --
>>> ----------------
>>> Best regards
>>> Alexey Loshkarev
>>> mailto:elf2001@gmail.com
>>>
>>
>
>
>
> --
> ----------------
> Best regards
> Alexey Loshkarev
> mailto:elf2001@gmail.com
>



-- 
----------------
Best regards
Alexey Loshkarev
mailto:elf2001@gmail.com
