couchdb-dev mailing list archives

From eric casteleijn <>
Subject Big problems with writes to _config
Date Tue, 27 Oct 2009 23:49:44 GMT
We have the following setup:

Two near-identical public-facing Django servers communicate with one 
CouchDB server. The CouchDB server is OAuth authenticated, and people can 
access it directly (well, through an Apache proxy) if they have the 
tokens to do so. New users sign up through the Django servers, which 
then add the user and their tokens to CouchDB (the user through a POST 
to _users and the tokens through PUTs to _config).
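
For concreteness, a token write of this kind could be built roughly as 
in the sketch below. It assumes the /_config/<section>/<key> URL layout 
with a JSON-encoded string as the request body. The base URL, the token 
name, and the build_config_put helper are hypothetical, and both the 
OAuth signing and the actual send are left out.

```python
import json
import urllib.request
from urllib.parse import quote


def build_config_put(base_url, section, key, value):
    """Build (but do not send) a PUT that sets one key in _config.

    Hypothetical helper: the caller still has to sign and send the request.
    """
    url = "{}/_config/{}/{}".format(
        base_url.rstrip("/"),
        quote(section, safe=""),
        quote(key, safe=""),
    )
    # The body is a single JSON string, e.g. "40693" including the quotes.
    body = json.dumps(value).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder values for illustration only.
req = build_config_put(
    "http://localhost:5984", "oauth_token_users", "example-token", "40693"
)
print(req.get_method(), req.full_url)
```

Each new token needs two such writes, one to [oauth_token_users] and one 
to [oauth_token_secrets], which matches the pairs of errors shown in the 
logs.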

We see this failing a lot, to the point where we now think it fails all 
the time. (All of those systems keep separate logs, not all of which we 
have access to, so this is not trivial to piece together.)

The errors the API servers get back all look like these (see the lines 
starting with '(500'):

'2009-10-27 22:35:15,357 ERROR    UbuntuOne.couch: failed to add ***** = 
40693 to section [oauth_token_users] of local.ini:

(500, (u'timeout', u'{gen_server,call,\n            [couch_config,\n 

'2009-10-27 22:35:20,399 ERROR    UbuntuOne.couch: failed to add ***** = 
***** to section [oauth_token_secrets] of local.ini:

(500, (u'timeout', u'{gen_server,call,\n            [couch_config,\n 
"*****",\n                  true}]}'))'

Corresponding errors in the couchdb.log look like:

My theory was that these writes to _config fail because the local.ini is 
somehow corrupted, but I can't access that file directly (since it has 
users' secrets) or copy it to my machine to test this theory, and 
helping someone who is allowed to see it look for anything weird is like 
searching for the proverbial needle in the haystack: we have lots of 
users, and users can have multiple tokens. Add to that the fact that you 
cannot ever delete a line from the .ini file (DELETEs against keys in 
_config just empty the value and leave a line like 'foo = \n')!
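
One way to narrow the search without exposing any secrets would be a 
small scan that reports only line numbers and the kind of problem, never 
the keys or values. The sketch below is a rough idea of that. It assumes 
that ';' starts a comment and that every other non-blank line should be 
either a [section] header or a 'key = value' pair; scan_ini is a 
hypothetical helper, not part of CouchDB.

```python
import re

# "key = value" where the value may be empty; the key must not be empty.
KEY_VALUE = re.compile(r"^\s*([^=\s][^=]*?)\s*=\s*(.*)$")


def scan_ini(lines):
    """Return (line_number, problem) pairs without echoing keys or values."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        stripped = line.strip()
        if not stripped or stripped.startswith(";"):
            continue  # blank line or comment (assumed ';' comment syntax)
        if stripped.startswith("["):
            if not stripped.endswith("]"):
                problems.append((number, "unterminated section header"))
            continue
        match = KEY_VALUE.match(line)
        if match is None:
            problems.append((number, "not a key = value line"))
        elif match.group(2).strip() == "":
            problems.append((number, "empty value"))
    return problems


# Made-up sample; a real run would pass the open local.ini file instead.
sample = [
    "[oauth_token_users]\n",
    "foo = \n",
    "bar = 1\n",
    "garbage\n",
    "; a comment\n",
]
for number, problem in scan_ini(sample):
    print(number, problem)
```

Someone with access to the file could run this against local.ini and 
pass along only the output. The "empty value" hits would also count the 
lines left behind by DELETEs.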

After speaking to Jan on the channel he proposed that it may be that the 
gen_server message inbox overflows and the gen_server times out.
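
To make that hypothesis concrete, here is a toy model of a single 
process serving calls strictly one at a time. It is only a sketch: the 
arrival times and the per-write service time are made-up numbers, and 
simulate is a hypothetical helper. The 5 second limit is the default 
timeout of gen_server:call/2, which would also fit the 5 second gap 
between the two errors above.

```python
CALL_TIMEOUT = 5.0  # seconds; default timeout of Erlang's gen_server:call/2


def simulate(arrivals, service_time, timeout=CALL_TIMEOUT):
    """Return one bool per call: True if the reply came within the timeout.

    The server keeps working through queued calls even after their callers
    have given up, so a backlog delays every later call as well.
    """
    free_at = 0.0  # time at which the single server process is next idle
    results = []
    for arrival in arrivals:
        start = max(arrival, free_at)
        finish = start + service_time
        free_at = finish
        results.append(finish - arrival <= timeout)
    return results


# Made-up load: one write per second, each taking 2 seconds to handle.
outcome = simulate(arrivals=[float(i) for i in range(8)], service_time=2.0)
print(outcome)
```

In this model, once writes arrive faster than they are handled, every 
call after the first few times out, even though the server itself never 
stops working. That would look a lot like "it fails all the time".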

Could that happen under high load, and if so, how can we solve it? Can 
we increase the size of this inbox, or can we possibly have multiple 
processes handling the access? Whether it's high load or corruption or 
something else again, right now it looks like NO new tokens can be 
added, and hence no new users can use our system. In short: HALP!

- eric casteleijn
