incubator-couchdb-user mailing list archives

From Nathan Vander Wilt <nate-li...@calftrail.com>
Subject Re: couchdb crashes silently
Date Fri, 01 Nov 2013 16:17:26 GMT

On Nov 1, 2013, at 12:10 AM, Dave Cottlehuber <dch@jsonified.com> wrote:

>> On Oct 31, 2013, at 5:13 PM, Nathan Vander Wilt wrote:
>> 
>> Aaaand my Couch committed suicide again today. Unless this is  
>> something different, I may have finally gotten lucky and had  
>> CouchDB leave a note [eerily unfinished!] in the logs this time:  
>> https://gist.github.com/natevw/fd509978516499ba128b  
>> 
>> ```
>> ** Reason == {badarg,
>> [{io,put_chars,
>> [<0.93.0>,unicode,
>> <<"[Thu, 31 Oct 2013 19:48:48 GMT] [info] [<0.31789.2>] 66.249.66.216 - - GET /public/_design/glob/_list/posts/by_path?key=%5B%222012%22%2C%2203%22%2C%22metakaolin_geojson_editor%22%5D&include_docs=true&path1=2012&path2=03&path3=metakaolin_geojson_editor 200\n">>],
>> []},
>> ```
>> 
>> So…now what? I have a rebuilt version of CouchDB I'm going to try  
>> [once I figure out why *it* isn't starting] but this is still really  
>> upsetting — I'm aware I could add my own cronjob or something to  
>> check and restart if needed every minute, but a) the shell script  
>> is SUPPOSED to be keeping CouchDB running and b) it's NOT and c) this is  
>> embarrassing and aggravating.
>> 
>> thanks,
>> -natevw
> 
> So there are two things here:
> 
> - why the couch doesn’t get restarted?
> 
> Sounds very much like the aforementioned pid race condition. Wendall, do you know any more about this? I thought you had some ideas about it, IIRC.
> 


I think I figured out the answer to this one, at least in the latest crash. The Erlang process
the shell script watches was still running, just not accepting connections. I didn't notice
this the previous times, though… I only realized it this time because when I went to restart,
the shell script acted like it was already running. So maybe there are actually two crashes:
a silent heartbeat one and this unicode one?
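
A watchdog that only checks whether the process exists would miss this state, since the process was alive but not serving. A minimal sketch of probing the listener instead, assuming CouchDB is bound to 127.0.0.1 on its default port 5984 (both are assumptions about the setup, and `is_accepting_connections` is a made-up name for illustration):

```python
import socket


def is_accepting_connections(host="127.0.0.1", port=5984, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: 5984 is CouchDB's default port; change it if yours differs.
    This only shows that something accepts connections on that port, not
    that CouchDB answers requests correctly.
    """
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection in time.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A cron job or the wrapper script could call this and restart only when it returns False, rather than trusting the pid alone. A stricter check would issue an actual HTTP GET and inspect the response.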



> - why io:put_chars/2 has trouble writing to a boring log file, which obviously works most of the time.
> 
> <0.93.0>,unicode, <<"[Thu, 31 Oct 2013 19:48:48 GMT...">>
> 
> io:put_chars(Fd, unicode, <<Binary>>) doesn’t look right; there’s no io:put_chars/3.
> 
> This unicode looks weird and from a quick look I can’t see where it should come from.
> 
> Can you get more of the logfile (like hundreds of lines) and stick it somewhere? Email is fine.
> 
> I’d like to see what happens to <0.93.0> (the process wrapping the log fd), and also if the unicode atom turns up anywhere else prior.
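
The search described above could be sketched roughly as follows. This is only an illustration: `find_mentions` is a hypothetical helper, and it does plain substring matching, so "unicode" will also match inside longer words.

```python
def find_mentions(lines, needles=("<0.93.0>", "unicode")):
    """Yield (line_number, needle, line) for each log line containing a needle.

    `lines` is any iterable of strings, e.g. an open log file.
    Line numbers start at 1; a line matching several needles is yielded
    once per needle.
    """
    for number, line in enumerate(lines, start=1):
        for needle in needles:
            if needle in line:
                yield number, needle, line.rstrip("\n")
```

Usage would be along the lines of `for hit in find_mentions(open(path, errors="replace")): print(hit)`, where `path` points at wherever the CouchDB log lives on that machine.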


You want more of the log *up to* the crash? Because I have nothing *beyond* what is in that
gist, that's the thing! The end of the log was cut off, I did not snip it. The log as it sits
now has these exact lines in it:

```
                             {line,173}]},
                           {gen_event,ser
Apache CouchDB 1.4.0 (LogLevel=info) is starting.
```

(The subsequent "starting" is due to my intervention.)

-nvw



