From: Alexey Loshkarev <elf2001@gmail.com>
To: user@couchdb.apache.org
Date: Thu, 7 Oct 2010 19:46:05 +0200
Subject: Re: view response with duplicate id's

In case it helps: these q_* documents hold a kind of state data and
change very frequently. I have 12 q_* documents, and each may change
10-30 times per minute.
Could there be a race condition in CouchDB's view creation?

2010/10/7 Alexey Loshkarev:
> I just tried moving the view function to a separate design doc, with
> no success - there are still duplicates (with the same revision) in
> the view response.
>
>
> 2010/10/7 Paul Davis:
>> Alexey,
>>
>> Can you show the other views you have in your design doc? Or,
>> alternatively, try moving this view to its own design doc?
>>
>> Paul
>>
>> On Thu, Oct 7, 2010 at 1:07 PM, Alexey Loshkarev wrote:
>>> The same problem has appeared again.
>>> What was done up to yesterday:
>>> 1. Created a new database on node2.
>>> 2. Replicated from node1 to node2.
>>> 3. Checked: _all_docs returned only unique rows, and queue/all
>>>    returned only unique rows.
>>>
>>> After a few hours of stable work, CouchDB started producing
>>> duplicates again. This time there are no duplicate documents
>>> (_all_docs contains only unique rows), but the view response has
>>> duplicates.
>>> Removing the view index (between CouchDB restarts) doesn't help;
>>> CouchDB consistently produces duplicates in the view.
>>>
>>> View function:
>>> function(doc) {
>>>   if (doc.type == "queue") {
>>>     log("BUG TEST id:" + doc._id + ", rev:" + doc._rev);
>>>     emit(doc.ordering, doc);
>>>   }
>>> }
>>>
>>> Response:
>>> $ curl http://localhost:5984/exhaust/_design/queues/_view/all
>>> {"total_rows":15,"offset":0,"rows":[
>>> ....
>>> {"id":"q_nikolaevka","key":10,"value":{"_id":"q_nikolaevka","_rev":"16181-ae5e5cca96b0491f266bc97c37a88f47","name":"\u041d\u0418\u041a\u041e\u041b\u0410\u0415\u0412\u041a\u0410","default":false,"cars":[],"drivers":[],"ordering":10,"type":"queue"}},
>>> {"id":"q_nikolaevka","key":10,"value":{"_id":"q_nikolaevka","_rev":"16176-3a7bbd128bfb257fd746dfd80769b6fc","name":"\u041d\u0418\u041a\u041e\u041b\u0410\u0415\u0412\u041a\u0410","default":false,"cars":[],"ordering":10,"type":"queue","drivers":[]}},
>>> ...
>>> ]}
>>>
>>>
>>> See that? Two rows for the same document, with different revisions!
>>>
>>> Also, couch.log shows 3 (!) calls of this function for one document:
>>> [Thu, 07 Oct 2010 16:53:51 GMT] [info] [<0.180.0>] OS Process
>>> #Port<0.2132> Log :: BUG TEST id:q_nikolaevka,
>>> rev:16175-11cedeb529991cf60193d436d1a567e9
>>> [Thu, 07 Oct 2010 16:53:51 GMT] [info] [<0.180.0>] OS Process
>>> #Port<0.2132> Log :: BUG TEST id:q_nikolaevka,
>>> rev:16176-3a7bbd128bfb257fd746dfd80769b6fc
>>> [Thu, 07 Oct 2010 16:53:51 GMT] [info] [<0.180.0>] OS Process
>>> #Port<0.2132> Log :: BUG TEST id:q_nikolaevka,
>>> rev:16181-ae5e5cca96b0491f266bc97c37a88f47
>>>
>>>
>>> Then I ran compaction to eliminate the old revisions.
>>> And now I have 3 duplicates of q_nikolaevka with the same revision!
>>>
>>> I think I've found the problem. This document has over 1000
>>> revisions in the database, and the HTTP database API page
>>> (http://wiki.apache.org/couchdb/HTTP_database_API) describes a
>>> default maximum of 1000 stored revisions per document.
>>>
>>>
>>>
>>> 2010/10/7 Alexey Loshkarev:
>>>> Haha!
>>>> A fresh replication (into a new database) eliminates the
>>>> duplicates, so I can sleep quietly.
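A view response like the one quoted above can be checked for duplicate rows mechanically. A minimal sketch in Python; the trimmed-down sample below, including the shortened _rev values, is a stand-in for the real response, not actual data from the thread:

```python
import json
from collections import Counter

def duplicate_rows(view_response):
    """Return the (id, key) pairs that occur more than once in a view
    response, with their occurrence counts."""
    counts = Counter((row["id"], row["key"]) for row in view_response["rows"])
    return {pair: n for pair, n in counts.items() if n > 1}

# Trimmed stand-in for the response quoted above (not the real data).
sample = json.loads("""
{"total_rows": 3, "offset": 0, "rows": [
  {"id": "q_nikolaevka", "key": 10, "value": {"_rev": "16181-ae5e"}},
  {"id": "q_nikolaevka", "key": 10, "value": {"_rev": "16176-3a7b"}},
  {"id": "q_tsentr", "key": 3, "value": {"_rev": "27144-f516"}}
]}
""")

print(duplicate_rows(sample))  # {('q_nikolaevka', 10): 2}
```

In a healthy map-only view, each document contributes at most one row per emitted key, so any repeated (id, key) pair is a sign of the indexing problem described in this thread.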
>>>>
>>>>
>>>> 2010/10/7 Alexey Loshkarev:
>>>>> P.S. dmesg doesn't show any hardware problems (bad blocks,
>>>>> segfaults, and so on).
>>>>> P.P.S. I think I migrated 0.10.1 -> 1.0.1 without replicating the
>>>>> database, so it may be my fault.
>>>>>
>>>>> 2010/10/7 Alexey Loshkarev:
>>>>>> I think this is database file corruption. Querying _all_docs
>>>>>> returns a lot of duplicates (about 3,000 duplicates in a
>>>>>> ~350,000-document database).
>>>>>>
>>>>>>
>>>>>> [12:17:48 root@node2 (~)]# curl
>>>>>> http://localhost:5984/exhaust/_all_docs > all_docs
>>>>>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>>>>>                                  Dload  Upload   Total   Spent    Left  Speed
>>>>>> 100 37.7M    0 37.7M    0     0  1210k      0 --:--:--  0:00:31 --:--:--  943k
>>>>>> [12:18:23 root@node2 (~)]# wc -l all_docs
>>>>>> 325102 all_docs
>>>>>> [12:18:27 root@node2 (~)]# uniq all_docs | wc -l
>>>>>> 322924
>>>>>>
>>>>>>
>>>>>> Node1 has duplicates too, but far fewer:
>>>>>> [12:18:48 root@node1 (~)]# curl
>>>>>> http://localhost:5984/exhaust/_all_docs > all_docs
>>>>>>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>>>>>>                                  Dload  Upload   Total   Spent    Left  Speed
>>>>>> 100 38.6M    0 38.6M    0     0   693k      0 --:--:--  0:00:57 --:--:-- 55809
>>>>>> [12:19:57 root@node1 (~)]# wc -l all_docs
>>>>>> 332714 all_docs
>>>>>> [12:20:54 root@node1 (~)]# uniq all_docs | wc -l
>>>>>> 332523
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2010/10/7 Alexey Loshkarev:
>>>>>>> I can't say what specifically it may be, so let me dive into the
>>>>>>> history of these databases.
>>>>>>>
>>>>>>> First (5-6 weeks ago) there was the node2 server, running
>>>>>>> CouchDB 0.10.1, with a testing database on it.
>>>>>>> There was a lot of
>>>>>>> structural change, view updating, and so on.
>>>>>>> Then it became production and worked fine.
>>>>>>> Then we realized we needed a backup - ideally an online backup
>>>>>>> (and since we have CouchDB, we can do this).
>>>>>>> So the node1 server appeared, running CouchDB 1.0.1. I
>>>>>>> replicated node2 to node1, then started continuous replications
>>>>>>> node1 -> node2 and node2 -> node1. All clients worked with node2
>>>>>>> only. Everything worked fine for about a month.
>>>>>>> A few days ago we hit peak load, so I wanted to use node1 and
>>>>>>> node2 simultaneously. This was done with round-robin DNS ("host
>>>>>>> db" returned 2 different IPs - node1's IP and node2's IP). All
>>>>>>> worked fine for about 5 minutes, then I got the first conflict:
>>>>>>> the view queues/all returned two identical documents, one the
>>>>>>> current version, the other a conflicted revision (a document
>>>>>>> with the field _conflict="....."). The document ID was q_tsentr.
>>>>>>> As I don't have a conflict resolver yet, I resolved the conflict
>>>>>>> manually by deleting the conflicted revision. I also disabled
>>>>>>> the round-robin and moved all load back to node2 to avoid
>>>>>>> conflicts until I could write a conflict resolver.
>>>>>>>
>>>>>>> It worked fine (node1 and node2 in mutual replication, active
>>>>>>> load on node2) until yesterday.
>>>>>>> Yesterday an operator called to say he saw duplicate data in the
>>>>>>> program. This time queues/all returned 1 duplicated document -
>>>>>>> the same one as a few days before (id = q_tsentr). One row held
>>>>>>> the current document version; the other held an old revision
>>>>>>> with the field _conflicted_revision="some old revision".
>>>>>>>
>>>>>>> I tried to delete that revision, but without success. A GET for
>>>>>>> q_tsentr?rev="some old revision" returned a valid document, but
>>>>>>> a DELETE of q_tsentr?rev="some old revision" gave me a 409
>>>>>>> error.
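For reference, the usual manual cleanup for a conflicted document is to fetch its `_conflicts` list (`GET /db/docid?conflicts=true`) and then DELETE each losing revision by its rev. A minimal sketch that only builds the (method, path) request pairs rather than talking to a server; the helper name is made up for illustration:

```python
def conflict_cleanup_requests(db, doc_id, conflicting_revs):
    """Build the (method, path) pairs for a manual conflict cleanup:
    one GET to list conflicts, then one DELETE per losing revision.
    Purely illustrative - no HTTP calls are performed."""
    fetch = ("GET", "/%s/%s?conflicts=true" % (db, doc_id))
    deletes = [("DELETE", "/%s/%s?rev=%s" % (db, doc_id, rev))
               for rev in conflicting_revs]
    return [fetch] + deletes

# Using the revision that appears in the thread's logs:
reqs = conflict_cleanup_requests(
    "exhaust", "q_tsentr", ["27144-f516ac68e697874eef9c7562f3e2e229"])
for method, path in reqs:
    print(method, path)
```

A 409 on such a DELETE generally means the rev being deleted is no longer a current leaf revision on that node (for example, it was already superseded or removed via replication), which is consistent with the log excerpts that follow.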
>>>>>>> Here are the log files (node2):
>>>>>>>
>>>>>>> [Wed, 06 Oct 2010 12:17:19 GMT] [info] [<0.7239.1462>] 10.0.0.41 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:17:30 GMT] [info] [<0.7245.1462>] 10.0.0.41 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:17:35 GMT] [info] [<0.7287.1462>] 10.0.0.41 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:17:43 GMT] [info] [<0.7345.1462>] 10.0.0.41 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:18:02 GMT] [info] [<0.7864.1462>] 10.0.0.41 - -
>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229
>>>>>>> 409
>>>>>>> [Wed, 06 Oct 2010 12:18:29 GMT] [info] [<0.8331.1462>] 10.0.0.41 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:18:39 GMT] [info] [<0.8363.1462>] 10.0.0.41 - -
>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229
>>>>>>> 409
>>>>>>> [Wed, 06 Oct 2010 12:38:19 GMT] [info] [<0.16765.1462>] 10.0.0.41 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:40:40 GMT] [info] [<0.17337.1462>] 10.0.0.41 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:40:45 GMT] [info] [<0.17344.1462>] 10.0.0.41 - -
>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229
>>>>>>> 404
>>>>>>>
>>>>>>> Logs at node1:
>>>>>>>
>>>>>>> [Wed, 06 Oct 2010 12:17:46 GMT] [info] [<0.25979.462>] 10.20.20.13 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:17:56 GMT] [info] [<0.26002.462>] 10.20.20.13 - -
>>>>>>> 'DELETE'
>>>>>>> /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229
>>>>>>> 200
>>>>>>> [Wed, 06 Oct 2010 12:21:25 GMT] [info] [<0.27133.462>] 10.20.20.13 - -
>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=all 404
>>>>>>> [Wed, 06 Oct 2010 12:21:49 GMT] [info] [<0.27179.462>] 10.20.20.13 - -
>>>>>>> 'DELETE' /exhaust/q_tsentr?revs=true 404
>>>>>>> [Wed, 06 Oct 2010 12:24:41 GMT] [info] [<0.28959.462>] 10.20.20.13 - -
>>>>>>> 'DELETE' /exhaust/q_tsentr?revs=true 404
>>>>>>> [Wed, 06 Oct 2010 12:38:07 GMT] [info] [<0.10362.463>] 10.20.20.13 - -
>>>>>>> 'GET' /exhaust/q_tsentr?revs=all 404
>>>>>>> [Wed, 06 Oct 2010 12:38:23 GMT] [info] [<0.10534.463>] 10.20.20.13 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:40:25 GMT] [info] [<0.12014.463>] 10.20.20.13 - -
>>>>>>> 'GET' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229 200
>>>>>>> [Wed, 06 Oct 2010 12:40:33 GMT] [info] [<0.12109.463>] 10.20.20.13 - -
>>>>>>> 'DELETE' /exhaust/q_tsentr?rev=27144-f516ac68e697874eef9c7562f3e2e229
>>>>>>> 404
>>>>>>>
>>>>>>> So I deleted that document and created a new one (id: q_tsentr2).
>>>>>>> It worked fine for about an hour.
>>>>>>>
>>>>>>> Node2 still had the undeletable duplicate, so I moved all
>>>>>>> clients to node1. There was no such problem there; the view
>>>>>>> response was correct.
>>>>>>>
>>>>>>> Then I tried to recover the database on node2. I stopped
>>>>>>> CouchDB, deleted the view index files, and started CouchDB
>>>>>>> again. Then I hit every view to recreate the indexes. At the end
>>>>>>> of this procedure I saw duplicates of identical rows (see the
>>>>>>> first message in this thread). Node1 had no such problems, so I
>>>>>>> stopped replication, left the load on node1, and came to cry on
>>>>>>> this mailing list.
>>>>>>>
>>>>>>>
>>>>>>> 2010/10/6 Paul Davis:
>>>>>>>> It was noted on IRC that I should give a bit more explanation.
>>>>>>>>
>>>>>>>> With the information that you've provided there are two
>>>>>>>> possible explanations. Either your client code is not doing
>>>>>>>> what you expect, or you've triggered a really crazy bug in the
>>>>>>>> view indexer that caused it to reindex a database without
>>>>>>>> invalidating the view and without removing keys for docs when
>>>>>>>> it reindexed.
>>>>>>>>
>>>>>>>> Given that no one has reported anything remotely like this and
>>>>>>>> I can't immediately see a code path that would violate so many
>>>>>>>> behaviours in the view updater, I'm leaning towards this being
>>>>>>>> an issue in the client code.
>>>>>>>>
>>>>>>>> If there was something specific that changed since the view
>>>>>>>> worked, that might illuminate what could cause this sort of
>>>>>>>> behaviour, if it is indeed a bug in CouchDB.
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>> Paul Davis
>>>>>>>>
>>>>>>>> On Wed, Oct 6, 2010 at 12:24 PM, Alexey Loshkarev wrote:
>>>>>>>>> I have this view function (map only, no reduce):
>>>>>>>>>
>>>>>>>>> function(doc) {
>>>>>>>>>   if (doc.type == "queue") {
>>>>>>>>>     emit(doc.ordering, doc.drivers);
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> It worked perfectly until yesterday, but today it started
>>>>>>>>> returning duplicates. Example:
>>>>>>>>> $ curl http://node2:5984/exhaust/_design/queues/_view/all
>>>>>>>>>
>>>>>>>>> {"total_rows":46,"offset":0,"rows":[
>>>>>>>>> {"id":"q_mashinyi-v-gorode","key":0,"value":["d_mironets_ivan","d_smertin_ivan","d_kasyanenko_sergej","d_chabotar_aleksandr","d_martyinenko_yurij","d_krikunenko_aleksandr"]},
>>>>>>>>> {"id":"q_mashinyi-v-gorode","key":0,"value":["d_mironets_ivan","d_smertin_ivan","d_kasyanenko_sergej","d_chabotar_aleksandr","d_martyinenko_yurij","d_krikunenko_aleksandr"]},
>>>>>>>>> {"id":"q_mashinyi-v-gorode","key":0,"value":["d_mironets_ivan","d_smertin_ivan","d_kasyanenko_sergej","d_chabotar_aleksandr","d_martyinenko_yurij","d_krikunenko_aleksandr"]},
>>>>>>>>> ......
>>>>>>>>> {"id":"q_oblasnaya","key":2,"value":["d_kramarenko_viktor","d_skorodzievskij_eduard"]},
>>>>>>>>> {"id":"q_oblasnaya","key":2,"value":["d_kramarenko_viktor","d_skorodzievskij_eduard"]},
>>>>>>>>> {"id":"q_oblasnaya","key":2,"value":["d_kramarenko_viktor","d_skorodzievskij_eduard"]},
>>>>>>>>> ........
>>>>>>>>> {"id":"q_otstoj","key":11,"value":["d_gavrilenko_aleksandr","d_klishnev_sergej"]}
>>>>>>>>> ]}
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I tried restarting the server, recreating the view (removing
>>>>>>>>> the view index file), and compacting both the view and the
>>>>>>>>> database; none of this helped - it still returns duplicates.
>>>>>>>>> What happened? How can I avoid this in the future?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ----------------
>>>>>>>>> Best regards
>>>>>>>>> Alexey Loshkarev
>>>>>>>>> mailto:elf2001@gmail.com

--
----------------
Best regards
Alexey Loshkarev
mailto:elf2001@gmail.com
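As background to the conflicts discussed in this thread: CouchDB picks a deterministic winner among conflicting leaf revisions - roughly, the leaf with the higher revision number wins, and ties are broken by comparing the revision hash strings, so every replica converges on the same winner. A sketch of that rule; only the first revision string below comes from the thread, the second is invented for illustration:

```python
def pick_winner(leaf_revs):
    """Pick the leaf revision that sorts highest by
    (revision number, revision hash) - an approximation of CouchDB's
    deterministic conflict-winner rule."""
    def sort_key(rev):
        pos, _, rev_hash = rev.partition("-")
        return (int(pos), rev_hash)
    return max(leaf_revs, key=sort_key)

revs = [
    "27144-f516ac68e697874eef9c7562f3e2e229",  # from the thread's logs
    "27143-0000000000000000000000000000dead",  # invented for illustration
]
print(pick_winner(revs))  # 27144-f516ac68e697874eef9c7562f3e2e229
```

This is why both nodes kept reporting the same rev (27144-f516...) as current: the losing leaves are not deleted by replication, only demoted, which is what makes an explicit conflict-resolution pass necessary.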