incubator-couchdb-user mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: Compact not completing
Date Sun, 02 Jan 2011 22:06:36 GMT
Ah, Mike, I didn't get the instructions right in step 1. Sorry about that. What you really want are the last 1000 Ids in the seq_tree prior to the compactor crash. So maybe something like

GET /iris/_changes?descending=true&limit=1000&since=96282148

Regards, Adam
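
A minimal shell sketch of that query, reusing the host and database from Mike's curl commands further down and piping through sed to keep only the ids (the sed pattern and the suspect_ids.txt file name are only illustrative, not anything from the thread):

  curl -s '172.17.17.3:5984/iris/_changes?descending=true&limit=1000&since=96282148' \
    | sed -n 's/.*"id":"\([^"]*\)".*/\1/p' > suspect_ids.txt

Each row of a non-continuous _changes response is printed on its own line, as in Mike's output below, so a line-oriented sed is sufficient here.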

On Jan 2, 2011, at 12:43 AM, mike@loop.com.br wrote:

> Adam,
> 
> Thanks for an excellent explanation. It was easy to find the culprit:
> 
> curl -s '172.17.17.3:5984/iris/_changes?since=96281148&limit=1000&include_docs=true' | grep -v time
> {"results":[
> {"seq":96281622,"id":"1292252400F7005","changes":[{"rev":"2-d94be4c93931a35524b3f34b9de41a11"}],"deleted":true,"doc":{"_id":"1292252400F7005","_rev":"2-d94be4c93931a35524b3f34b9de41a11","_deleted":true}},
> ],
> "last_seq":96282306}
> 
> The problem I have is that the document exists with a different rev and is not deleted:
> 
> curl -s '172.17.17.3:5984/iris/1292252400F7005'
> {"_id":"1292252400F7005","_rev":"1-74a74942107db308d42864e50c1517aa", ....
> 
> I deleted the document and inserted it again but the changes feed remains
> the same as above - I presume the compact will still fail as before.
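
A quick way to confirm that the orphaned row is still sitting at the same sequence number (a sketch, reusing the seq reported above; since=N returns changes with seq greater than N) would be to ask for the single change immediately after 96281621:

  curl -s '172.17.17.3:5984/iris/_changes?since=96281621&limit=1'

If that still returns the deleted 2-d94be4c9... row rather than a change for the re-created document, the stale seq_tree entry was not removed by the new write.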
> 
> Anything else I can do? (I guess I could hack copy_docs so that not_found is not 'fatal').
> 
> I am compacting regardless, maybe it'll pass.....
> 
> Regards,
> 
> Mike
> 
> Quoting Adam Kocoloski <kocolosk@apache.org>:
> 
>> Ok, so this is the same error both times. As far as I can tell it indicates that the
>> seq_tree and the id_tree indexes are out of sync; the seq_tree contains some record
>> that isn't present in the id_tree. That's never supposed to happen, so the compactor
>> crashes instead of trying to deal with the 'not_found' result when it does a lookup
>> on the missing entry in the id_tree.
>> 
>> I suspect that the _purge code is to blame, since deletions don't actually remove
>> entries from these indexes. One thing you might try:
>> 
>> 1) Query _changes starting from 96281148 (1000 less than the last status update)
>> and grab the next 1000 rows
>> 
>> 2) Figure out which of those entries are missing from the id tree, e.g. lookup the
>> document and see if the response is {"not_found":"missing"}. You could also try using
>> include_docs=true on the _changes feed to accomplish the same (see the sketch after
>> step 3 below).
>> 
>> 3) Once you've identified the problematic IDs, try creating them again. You might
>> end up introducing duplicates in the _changes feed, but if you do there's a procedure
>> to fix that.
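
A rough shell sketch of the step 2 check, assuming a file of ids such as the suspect_ids.txt built from the query at the top of this page, and the same host and database as Mike's curl commands (the loop and file name are illustrative, not part of Adam's procedure):

  # Flag every id whose direct lookup comes back as missing from the id tree.
  # A deleted-but-still-indexed document returns "reason":"deleted" and is not flagged.
  while read -r id; do
    if curl -s "172.17.17.3:5984/iris/$id" | grep -q '"reason":"missing"'; then
      echo "missing from id_tree: $id"
    fi
  done < suspect_ids.txt

This leans on CouchDB answering a GET for an unknown id with {"error":"not_found","reason":"missing"}.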
>> 
>> That's the simplest solution I can think of. Purging them again won't work because
>> the first thing _purge does is lookup the Ids in the id_tree.
>> 
>> Regards,
>> 
>> Adam
>> 
>> On Jan 1, 2011, at 9:47 AM, mike@loop.com.br wrote:
>> 
>>> I did the same with the tagged 1.0.1. Attached is
>>> the error produced. My responses are below:
>>> 
>>> Quoting Robert Newson <robert.newson@gmail.com>:
>>> 
>>>> Some more info would help here.
>>>> 
>>>> 1) How far did compaction get?
>>> It gets to seq 96282148 of 109105202, i.e. 88%
>>> 
>>>> 2) Do you have enough spare disk space?
>>> Yes I have lots of free space :-)
>>> 
>>>> 3) What commit of 1.0.x were you running before you moved to 08d71849?
>>> I was using Dec 13 852fa047. Before that something at least a month old.
>>> 
>>>> B.
>>>> 
>>>>> On Fri, Dec 31, 2010 at 3:55 PM, Robert Newson <robert.newson@gmail.com> wrote:
>>>>> Can you try this with a tagged release like 1.0.1?
>>>>> 
>>>>> On Fri, Dec 31, 2010 at 3:38 PM,  <mike@loop.com.br> wrote:
>>>>>> Hello,
>>>>>> 
>>>>>> Hoping for some guidance. I have a rather large (295Gb) database that was
>>>>>> created running 1.0.x and I am pretty certain that there is no corruption;
>>>>>> it has always been on a clean ZFS volume.
>>>>>> 
>>>>>> I upgraded to 1.0.x (08d71849464a8e1cc869b385591fa00b3ad0f843 git) in the hope
>>>>>> that it may resolve the issue.
>>>>>> 
>>>>>> I have previously '_purge'd many documents from this database, so that may be
>>>>>> relevant.
>>>>>> 
>>>>>> I am attaching the error from couchdb.log
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Mike
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> <error2.log>
>> 
>> 
> 
> 
> 

