couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: 160-* etap test failure from time to time
Date Wed, 18 Aug 2010 22:15:39 GMT
Robert, thanks for digging all this out.
I'll have a closer look in the morning!

Cheers
Jan
--
On 18 Aug 2010, at 21:48, Robert Dionne wrote:

> actually as things currently stand, it no longer involves 160-* at all.
> 
> the issue as I see it is here:
> 
> http://github.com/bdionne/couchdb/blob/master/src/couchdb/couch_config.erl#L124
> 
> every call to couch_config:set will cause the registered notify_funs to
> be invoked, *but* each one is spawned in its own pid. This implies that
> by the time they execute, the original couch_config:set may have already
> returned and another set may have been processed. So there's no guarantee
> that the registered notify_funs see the config state that triggered
> their invocation.
> 
> In this case it meant that couch_httpd was not being restarted for each
> config change to vhosts. The first change would trigger a restart, but by
> the time it gets to registering itself again the state was already
> changed. The mochiweb upgrade likely slowed things enough to expose it.
> 
> Pulling the vhost stuff into its own gen_server made the issue go away.
> 
> A similar issue happened once before with the hash_password function in
> couch_server, and that's the only other place where this problem exists
> that I can see.
> 
> Perhaps a distinction needs to be made between config changes that
> require a restart and those that don't.
> 
> 
> 
> On Aug 18, 2010, at 3:35 PM, Benoit Chesneau wrote:
> 
>> On Wed, Aug 18, 2010 at 9:29 PM, Jan Lehnardt <jan@apache.org> wrote:
>>> 
>>> On 18 Aug 2010, at 20:17, Robert Dionne wrote:
>>> 
>>>> The vhosts refactoring made this issue go away. The underlying problem
>>>> still exists in couch_config. It's a race condition.
>>> 
>>> The refactoring also added a whole lot of things that are separate from
>>> this issue.
>>> 
>>> I reckon the test could start couch_http and only issue requests once it
>>> is fully up*.
>>> 
>>> *I haven't looked at any code here, just handwaving.
>>> 
>>> Cheers
>>> Jan
>>> --
>>> 
>> 
>> On *slow* machines there are other tests failing for the same reason:
>> 140-* and 160-*, for example.
>> 
>> Even if the ini was set just after couch_server_sup started, the problem
>> happened. More likely this is because couch_httpd is loaded at the same
>> time, and the Loop is created at that point with the value currently in
>> the ini.
>> 
>> imo, the configuration needs some love, or maybe just the way we
>> reload couch_httpd after a change in the conf.
>> 
>> - benoit
> 
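
The race Robert describes can be sketched outside of Erlang. The following is a toy Python model, not CouchDB's couch_config: the `Config` class, its `gate` event, and the `pass_value` switch are all hypothetical names invented for illustration. Each `set` starts one thread per registered callback (standing in for a spawned pid), and the gate holds the callbacks back until both `set` calls have returned, which makes the late scheduling deterministic. A callback that re-reads the store then sees only the latest value. Handing each callback a snapshot of the value that triggered it is shown as one possible mitigation; that is an assumption of this sketch, not the fix adopted in the thread (which moved the vhost handling into its own gen_server).

```python
import threading


class Config:
    """Toy stand-in for a config server; not CouchDB's couch_config."""

    def __init__(self):
        self._values = {}
        self._notify_funs = []
        self._threads = []
        # Gate used only to make the race deterministic in this demo.
        self.gate = threading.Event()

    def register(self, fun):
        self._notify_funs.append(fun)

    def get(self, key):
        return self._values.get(key)

    def set(self, key, value, pass_value=False):
        self._values[key] = value
        for fun in self._notify_funs:
            # Each callback runs in its own thread, like a spawned pid.
            arg = value if pass_value else None
            t = threading.Thread(target=self._run, args=(fun, key, arg))
            t.start()
            self._threads.append(t)

    def _run(self, fun, key, value):
        self.gate.wait()  # simulate the callback being scheduled late
        fun(key, value)

    def join(self):
        """Release the held callbacks and wait for them to finish."""
        self.gate.set()
        for t in self._threads:
            t.join()
        self._threads.clear()


def run(pass_value):
    config = Config()
    seen = []
    lock = threading.Lock()

    def on_change(key, value):
        # Racy variant re-reads the store; the other uses the snapshot.
        observed = value if pass_value else config.get(key)
        with lock:
            seen.append(observed)

    config.register(on_change)
    config.set("vhosts", "a", pass_value)
    config.set("vhosts", "b", pass_value)
    config.join()
    return sorted(seen)


racy = run(pass_value=False)  # callbacks read whatever is in the store now
safe = run(pass_value=True)   # callbacks use the value that triggered them
print(racy, safe)
```

In the racy run neither callback ever observes "a", which mirrors the symptom in the thread: the first vhosts change is effectively lost to the handler.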


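Jan's suggestion, having the test issue requests only once the HTTP server is fully up, could look roughly like the polling helper below. This is a sketch in Python rather than etap/Erlang, and `wait_until_listening` and `start_server_later` are hypothetical names; the slow-starting server is simulated with a socket that is bound first and starts listening only after a delay. Note that accepting TCP connections does not prove a config-triggered restart has finished, so a real test would probably also need to check the setting it depends on.

```python
import socket
import threading
import time


def wait_until_listening(host, port, timeout=5.0, interval=0.05):
    """Poll until a TCP connect to (host, port) succeeds or timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not accepting connections yet; back off briefly and retry.
            time.sleep(interval)
    return False


def start_server_later(sock, delay):
    """Hypothetical slow-starting server: begins listening after `delay`."""
    def target():
        time.sleep(delay)
        sock.listen(1)
    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    return thread


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
port = server.getsockname()[1]
start_server_later(server, delay=0.2)

# The test would block here instead of sending requests immediately.
ready = wait_until_listening("127.0.0.1", port)
server.close()
```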