couchdb-dev mailing list archives

From "Adam Kocoloski (JIRA)" <>
Subject [jira] Commented: (COUCHDB-761) Timeouts in couch_log are masked, crashes callers
Date Mon, 17 May 2010 11:02:42 GMT


Adam Kocoloski commented on COUCHDB-761:

I haven't tested it, but the patch looks correct.

Since we're moving away from the *_report functions we could also simplify the messages a
bit.  For example, instead of

gen_event:sync_notify(error_logger, {info_report, group_leader(), {self(), couch_info, {Format,

we could do

gen_event:sync_notify(error_logger, {couch, info, self(), Format, Args});

or, since we're doing sync logging, we might even partially format the message in the sending
process and make it a little simpler to copy:

gen_event:sync_notify(error_logger, {couch, info, self(), io_lib:format(Format, Args)});

Randall, what do you think?

> Timeouts in couch_log are masked, crashes callers
> -------------------------------------------------
>                 Key: COUCHDB-761
>                 URL:
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.10.1, 0.10.2, 0.11
>            Reporter: Randall Leeds
>            Priority: Blocker
>             Fix For: 0.10.3, 0.11.1, 1.0
>         Attachments: improved-sync-logging.patch
> Several users have reported seeing crash reports stemming from a function_clause match
> on handle_info in various gen_servers. The offending message looks like {#Ref<>, <integer>}.
> After months of banter and sleuthing, I determined that the likely cause was a late reply
> to a gen_server:call that timed out, with the #Ref being the tag on the response. After it
> came up again today in IRC, kocolosk quickly discovered that the problem appears to be in
> The logging macros (?LOG_*) call couch_log/*_on which calls get_level_integer/0. When
> this call times out the timeout is eaten and a late reply arrives to the calling process later,
> triggering the crash.
> Suggestions on how to fix this welcome. Ideas so far are async logging or infinite timeout.
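
The late-reply mechanism described in the report can be illustrated with a minimal sketch. It is not CouchDB code: the module name late_reply_demo, the get_level request, and the timings are all hypothetical, and it uses plain message passing in place of gen_server:call so that the tagged reply is visible directly. The caller gives up after a short timeout, swallows it, and later finds a {Ref, Integer} message in its mailbox, which is the shape of message that a gen_server without a matching handle_info clause would crash on.

```erlang
-module(late_reply_demo).
-export([run/0]).

%% Hypothetical demo, not taken from couch_log.
%% Emulates a call with a timeout using raw messages.
run() ->
    Caller = self(),
    %% A slow "server" that answers one request after 200 ms.
    Server = spawn(fun() ->
        receive
            {call, From, Tag, get_level} ->
                timer:sleep(200),
                From ! {Tag, 2}          % reply is tagged with the caller's ref
        end
    end),
    Ref = make_ref(),
    Server ! {call, Caller, Ref, get_level},
    %% Wait only 50 ms for the reply, then give up and mask the timeout.
    Result = receive
        {Ref, Level} -> {ok, Level}
    after 50 ->
        timeout
    end,
    %% The reply still arrives later and sits in the caller's mailbox.
    Stray = receive
        Msg -> Msg
    after 500 ->
        none
    end,
    {Result, Stray}.
```

Calling late_reply_demo:run() from the shell should return a tuple whose first element is timeout and whose second element is the stray {Ref, 2} reply. Either of the ideas mentioned in the report, an infinite timeout or async logging, avoids leaving such a message behind.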

