Ah, Mike, I didn't get the instructions right in step 1. Sorry about that. What you really want are the last 1000 IDs in the seq_tree prior to the compactor crash. So maybe something like

GET /iris/_changes?descending=true&limit=1000&since=96282148

Regards, Adam

On Jan 2, 2011, at 12:43 AM, mike@loop.com.br wrote:

> Adam,
>
> Thanks for an excellent explanation. It was easy to find the culprit:
>
> curl -s '172.17.17.3:5984/iris/_changes?since=96281148&limit=1000&include_docs=true' | grep -v time
> {"results":[
> {"seq":96281622,"id":"1292252400F7005","changes":[{"rev":"2-d94be4c93931a35524b3f34b9de41a11"}],"deleted":true,"doc":{"_id":"1292252400F7005","_rev":"2-d94be4c93931a35524b3f34b9de41a11","_deleted":true}},
> ],
> "last_seq":96282306}
>
> The problem I have is that the document exists with a different rev and is
> not deleted:
>
> curl -s '172.17.17.3:5984/iris/1292252400F7005'
> {"_id":"1292252400F7005","_rev":"1-74a74942107db308d42864e50c1517aa", ....
>
> I deleted the document and inserted it again but the changes feed remains
> the same as above - I presume the compact will still fail as before.
>
> Anything else I can do? (I guess I could hack copy_docs so that not_found
> is not 'fatal').
>
> I am compacting regardless, maybe it'll pass.....
>
> Regards,
>
> Mike
>
> Quoting Adam Kocoloski:
>
>> Ok, so this is the same error both times. As far as I can tell it indicates that the seq_tree and the id_tree indexes are out of sync; the seq_tree contains some record that isn't present in the id_tree. That's never supposed to happen, so the compactor crashes instead of trying to deal with the 'not_found' result when it does a lookup on the missing entry in the id_tree.
>>
>> I suspect that the _purge code is to blame, since deletions don't actually remove entries from these indexes.
>> One thing you might try:
>>
>> 1) Query _changes starting from 96281148 (1000 less than the last status update) and grab the next 1000 rows
>>
>> 2) Figure out which of those entries are missing from the id_tree, e.g. look up the document and see if the response is {"not_found":"missing"}. You could also try using include_docs=true on the _changes feed to accomplish the same.
>>
>> 3) Once you've identified the problematic IDs, try creating them again. You might end up introducing duplicates in the _changes feed, but if you do there's a procedure to fix that.
>>
>> That's the simplest solution I can think of. Purging them again won't work because the first thing _purge does is look up the IDs in the id_tree. Regards,
>>
>> Adam
>>
>> On Jan 1, 2011, at 9:47 AM, mike@loop.com.br wrote:
>>
>>> I did the same with the tagged 1.0.1. Attached is
>>> the error produced. My responses are below:
>>>
>>> Quoting Robert Newson:
>>>
>>>> Some more info would help here.
>>>>
>>>> 1) How far did compaction get?
>>> It gets to seq 96282148 of 109105202, i.e. 88%
>>>
>>>> 2) Do you have enough spare disk space?
>>> Yes I have lots of free space :-)
>>>
>>>> 3) What commit of 1.0.x were you running before you moved to 08d71849?
>>> I was using Dec 13 852fa047. Before that something at least a month old.
>>>
>>>> B.
>>>>
>>>> On Fri, Dec 31, 2010 at 3:55 PM, Robert Newson wrote:
>>>>> Can you try this with a tagged release like 1.0.1?
>>>>>
>>>>> On Fri, Dec 31, 2010 at 3:38 PM, wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Hoping for some guidance. I have a rather large (295Gb) database that was
>>>>>> created running 1.0.x and I am pretty certain that there is no corruption -
>>>>>> it has always been on a clean ZFS volume.
>>>>>>
>>>>>> I upgraded to 1.0.x (08d71849464a8e1cc869b385591fa00b3ad0f843 git) in the
>>>>>> hope that it may resolve the issue.
>>>>>>
>>>>>> I have previously '_purge'd many documents from this database, so
>>>>>> that may be relevant.
>>>>>>
>>>>>> I am attaching the error from couchdb.log
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Mike
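
For anyone reading this thread later: step 2 of Adam's procedure (finding the IDs where the _changes feed and a direct document lookup disagree) could be scripted roughly as below. This is only a sketch under assumptions, not something posted in the thread: the names BASE, http_get_doc and find_suspect_ids are made up for illustration, the host and database come from Mike's curl commands, and only Python's standard library is used. A row is flagged either when the feed says the document is live but the lookup returns 404, or when the feed and the lookup disagree about deletion or revision (Mike's case).

```python
import json
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: host and database name as used in the curl commands in this thread.
BASE = "http://172.17.17.3:5984/iris"


def http_get_doc(doc_id):
    """Fetch one document; return its JSON body, or None if CouchDB answers 404."""
    try:
        with urlopen(f"{BASE}/{quote(doc_id, safe='')}") as resp:
            return json.load(resp)
    except HTTPError as err:
        if err.code == 404:
            return None
        raise


def find_suspect_ids(changes, get_doc):
    """Return (seq, id, reason) for rows where the feed and the lookup disagree.

    `changes` is the parsed JSON of a _changes response; `get_doc` is any
    callable that maps a document ID to its body or None (hypothetical hook,
    so the check can be run without a live server).
    """
    suspects = []
    for row in changes["results"]:
        doc = get_doc(row["id"])
        feed_revs = {change["rev"] for change in row["changes"]}
        if doc is None:
            # A 404 is expected for a deleted row, suspicious for a live one.
            if not row.get("deleted"):
                suspects.append((row["seq"], row["id"], "missing"))
        elif row.get("deleted") or doc["_rev"] not in feed_revs:
            suspects.append((row["seq"], row["id"], "feed/doc mismatch"))
    return suspects
```

To try it, load the JSON returned by the corrected _changes query at the top of the thread and call find_suspect_ids(changes, http_get_doc). Documents updated between the two requests can show up as false positives, so each hit is worth checking by hand.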