couchdb-user mailing list archives

From Nathan Vander Wilt <>
Subject Re: couchdb crashes silently
Date Fri, 01 Nov 2013 16:17:26 GMT

On Nov 1, 2013, at 12:10 AM, Dave Cottlehuber <> wrote:

>> On Oct 31, 2013, at 5:13 PM, Nathan Vander Wilt wrote:
>> Aaaand my Couch committed suicide again today. Unless this is
>> something different, I may have finally gotten lucky and had  
>> CouchDB leave a note [eerily unfinished!] in the logs this time:  
>> ```
>> ** Reason == {badarg,
>> [{io,put_chars,
>> [<0.93.0>,unicode,
>> <<"[Thu, 31 Oct 2013 19:48:48 GMT] [info] [<0.31789.2>]
>> - - GET /public/_design/glob/_list/posts/by_path?key=%5B%222012%22%2C%2203%22%2C%22metakaolin_geojson_editor%22%5D&include_docs=true&path1=2012&path2=03&path3=metakaolin_geojson_editor
>> 200\n">>],
>> []},
>> ```
>> So…now what? I have a rebuilt version of CouchDB I'm going to try  
>> [once I figure out why *it* isn't starting] but this is still really  
>> upsetting — I'm aware I could add my own cronjob or something to  
>> check and restart if needed every minute, but a) the shell script  
>> is SUPPOSED to be keeping CouchDB running and b) it's NOT and c) this is
>> embarrassing and aggravating.
>> thanks,
>> -natevw
> So there are 2 things here:
> - why the couch doesn’t get restarted?
> Sounds very much like the aforementioned pid race condition. Wendall, do you know any
> more about this? I thought you had some ideas about it IIRC.

I think I figured out the answer to this one, at least in the latest crash. The Erlang process
the shell script watches was still running, just not accepting connections. I didn't notice
this the previous times, though…I only realized it this time because when I went to restart,
the shell script acted like it was already running. So maybe there are actually two crashes:
a silent heartbeat one and this unicode one?
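
That failure mode (process alive, port dead) is invisible to anything that only watches the pid. A minimal sketch of a probe that asks the HTTP port instead, assuming CouchDB listens on its default http://127.0.0.1:5984/; the name `is_accepting` and the timeout value are illustrative and not part of CouchDB or its startup script:

```python
import http.client
import urllib.error
import urllib.request

# Assumption: CouchDB is listening on its default address and port.
COUCH_URL = "http://127.0.0.1:5984/"


def is_accepting(url=COUCH_URL, timeout=5.0):
    """Return True only if the server answers an HTTP GET with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, reset, malformed reply, or an
        # HTTP error status: in every case the server is not usable.
        return False
```

A cron job or wrapper script could call `is_accepting()` every minute and restart CouchDB whenever it returns False, regardless of whether the Erlang process still exists.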

> - why io:put_chars/2 has trouble writing to a boring log file, which obviously works most
> of the time.
> <0.93.0>,unicode, <<"[Thu, 31 Oct 2013 19:48:48 GMT...">>
> io:put_chars(Fd, unicode, <<Binary>>) doesn’t look right: there’s
> no io:put_chars/3.
> This unicode looks weird and from a quick look I can’t see where it should come from.
> Can you get more of the logfile (like hundreds of lines) and stick it somewhere? Email
> is fine.
> I’d like to see what happens to <0.93.0> (the process wrapping the log fd), and
> also if the unicode atom turns up anywhere else prior.

You want more of the log *up to* the crash? Because I have nothing *beyond* what is in that
gist, that's the thing! The end of the log was cut off, I did not snip it. The log as it sits
now has these exact lines in it:

Apache CouchDB 1.4.0 (LogLevel=info) is starting.

(The subsequent "starting" is due to my intervention.)

