Subject: Re: Compact not completing
From: Adam Kocoloski
Date: Sat, 1 Jan 2011 19:55:11 -0500
To: user@couchdb.apache.org

Ok, so this is the same error both times. As far as I can tell it
indicates that the seq_tree and the id_tree indexes are out of sync; the
seq_tree contains some record that isn't present in the id_tree. That's
never supposed to happen, so the compactor crashes instead of trying to
deal with the 'not_found' result when it does a lookup on the missing
entry in the id_tree.

I suspect that the _purge code is to blame, since deletions don't
actually remove entries from these indexes. One thing you might try:

1) Query _changes starting from 96281148 (1000 less than the last status
   update) and grab the next 1000 rows.

2) Figure out which of those entries are missing from the id_tree, e.g.
   look up the document and see if the response is
   {"error":"not_found","reason":"missing"}. You could also try using
   include_docs=true on the _changes feed to accomplish the same.

3) Once you've identified the problematic IDs, try creating them again.
   You might end up introducing duplicates in the _changes feed, but if
   you do there's a procedure to fix that.

That's the simplest solution I can think of. Purging them again won't
work because the first thing _purge does is look up the IDs in the
id_tree.

Regards, Adam

On Jan 1, 2011, at 9:47 AM, mike@loop.com.br wrote:

> I did the same with the tagged 1.0.1. Attached is
> the error produced. My responses are below:
>
> Quoting Robert Newson:
>
>> Some more info would help here.
>>
>> 1) How far did compaction get?
> It gets to seq 96282148 of 109105202, i.e. 88%
>
>> 2) Do you have enough spare disk space?
> Yes I have lots of free space :-)
>
>> 3) What commit of 1.0.x were you running before you moved to 08d71849?
> I was using Dec 13 852fa047. Before that something at least a month old.
>
>> B.
>>
>> On Fri, Dec 31, 2010 at 3:55 PM, Robert Newson wrote:
>>> Can you try this with a tagged release like 1.0.1?
>>>
>>> On Fri, Dec 31, 2010 at 3:38 PM, wrote:
>>>> Hello,
>>>>
>>>> Hoping for some guidance. I have a rather large (295 GB) database that
>>>> was created running 1.0.x and I am pretty certain that there is no
>>>> corruption. It has always been on a clean ZFS volume.
>>>>
>>>> I upgraded to 1.0.x (08d71849464a8e1cc869b385591fa00b3ad0f843 git) in
>>>> the hope that it may resolve the issue.
>>>>
>>>> I have previously '_purge'd many documents from this database, so
>>>> that may be relevant.
>>>>
>>>> I am attaching the error from couchdb.log
>>>>
>>>> Thanks,
>>>>
>>>> Mike
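
For anyone following along, steps 1 and 2 of the suggestion at the top of this message could be scripted roughly as below. This is only a sketch under assumptions, not something from the thread: the server URL, the database name, and all function names are hypothetical, and it relies only on the standard _changes parameters (since, limit) and on a plain GET of each document. Rows already flagged as deleted are skipped, since a normal deletion answers with reason "deleted" rather than "missing".

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def changes_url(base_url, db, since, limit=1000):
    """Step 1: URL for the next `limit` rows of _changes after `since`."""
    query = urllib.parse.urlencode({"since": since, "limit": limit})
    return f"{base_url}/{urllib.parse.quote(db, safe='')}/_changes?{query}"


def doc_url(base_url, db, doc_id):
    """URL of a single document; the slash in design doc IDs is kept."""
    prefix = "_design/"
    if doc_id.startswith(prefix):
        path = prefix + urllib.parse.quote(doc_id[len(prefix):], safe="")
    else:
        path = urllib.parse.quote(doc_id, safe="")
    return f"{base_url}/{urllib.parse.quote(db, safe='')}/{path}"


def live_ids(feed):
    """IDs from a parsed _changes response, skipping rows marked deleted."""
    return [row["id"] for row in feed["results"] if not row.get("deleted")]


def doc_is_missing(base_url, db, doc_id):
    """Step 2: True if the GET fails with 404 and reason "missing"."""
    try:
        with urllib.request.urlopen(doc_url(base_url, db, doc_id)):
            return False
    except urllib.error.HTTPError as err:
        if err.code != 404:
            raise
        body = json.loads(err.read().decode("utf-8"))
        return body.get("reason") == "missing"


def find_missing(base_url, db, since, limit=1000):
    """Run steps 1 and 2 against a live server; returns the suspect IDs."""
    with urllib.request.urlopen(changes_url(base_url, db, since, limit)) as resp:
        feed = json.load(resp)
    return [i for i in live_ids(feed) if doc_is_missing(base_url, db, i)]
```

Calling find_missing("http://localhost:5984", "mydb", 96281148) would then give the list of IDs to try recreating in step 3; the URL and database name there are placeholders for the real ones.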