couchdb-dev mailing list archives

From "Isaac Z. Schlueter (JIRA)" <>
Subject [jira] [Commented] (COUCHDB-2236) Weird _users doc conflict when replicating from 1.5.0 -> 1.6.0
Date Tue, 13 May 2014 03:40:15 GMT


Isaac Z. Schlueter commented on COUCHDB-2236:

The only thing that I can think of is that we had a bug in the script that deleted the conflicted
revs, and I guess it deleted the wrong side of the conflict, causing the doc to become conflicted again.

Like I said, the problematic host is out of rotation, so this isn't a pressing issue for us.
 We'll keep an eye on it if we roll out any other 1.6 hosts.  Feel free to close this for
now, I guess.
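
For illustration, a minimal sketch of a conflict cleanup that cannot delete the wrong side. This is not the script referred to above; the function name, doc id, and revision ids are all hypothetical. It assumes the replica's doc was fetched with `?conflicts=true` (so it carries `_conflicts`) and that the upstream's current `_rev` is known. It only builds the `_bulk_docs` payload and sends nothing.

```python
def conflict_deletions(doc, upstream_rev):
    """Build _bulk_docs entries deleting every leaf revision except upstream_rev.

    `doc` is assumed to be the replica's copy fetched with ?conflicts=true,
    so it has `_id`, `_rev` (the current winner) and, when conflicted,
    `_conflicts` (the losing leaf revisions).
    """
    leaves = [doc["_rev"]] + list(doc.get("_conflicts", []))
    if upstream_rev not in leaves:
        # Refuse to guess: deleting here could remove the only good copy.
        raise ValueError(
            "upstream revision %r is not a leaf on this replica" % upstream_rev
        )
    return [
        {"_id": doc["_id"], "_rev": rev, "_deleted": True}
        for rev in leaves
        if rev != upstream_rev
    ]


# Hypothetical example data; the doc id and revision ids are made up.
replica_doc = {
    "_id": "org.couchdb.user:example",
    "_rev": "3-bbb",
    "_conflicts": ["3-aaa"],
}

# The upstream holds 3-aaa, so the replica's current winner 3-bbb is the
# revision to delete, even though it is the one a plain GET would return.
payload = {"docs": conflict_deletions(replica_doc, upstream_rev="3-aaa")}
```

The point of the design is that the revision to keep is chosen by comparing against the upstream, not by trusting whichever revision the replica currently reports as the winner.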

> Weird _users doc conflict when replicating from 1.5.0 -> 1.6.0
> --------------------------------------------------------------
>                 Key: COUCHDB-2236
>                 URL:
>             Project: CouchDB
>          Issue Type: Bug
>      Security Level: public(Regular issues) 
>            Reporter: Isaac Z. Schlueter
> The upstream write-master for npm is a CouchDB 1.5.0 instance.  (Since it is locked down at the
IP level, we're not at risk from the DoS fixed in 1.5.1.)
> All PUT/POST/DELETE requests are routed to this master box, as well as any request with
`?write=true` on the URL.  (Used for cases where we still do the PUT/409/GET/PUT dance, rather
than using a custom _update function.)
> This master box replicates to a replication hub.  The read slaves all replicate from
the replication hub.  Both the /registry and /_users databases replicate continuously using
a doc in the /_replicator database.
> As I understand it, since replication only goes in one direction, and all writes go to
the upstream master, conflicts should be impossible.
> We brought a 1.6.0 read slave online, version 1.6.0+build.fauxton-91-g5a2864b.
> On this 1.6.0 read slave (and only there), we're seeing /_users doc conflicts, and it
looks like the conflicting revisions have different password_sha and salt values.  Here is one such example:
 (actual password_sha and salt mostly redacted, but enough bytes left in so that you can
see they're not matching.)
> A few weeks ago, this issue popped up, affecting about 400 user docs, and we figured
that it had to do with some instability or human error at the time when that box was set up.
 We deleted all of the conflicts, and verified that all docs matched the upstream at that
time.  We removed the /_replicator entries, and re-created them using the same script we use
to create them on all the other read slaves.
> If this was just one or two docs, or happening across more of the read slaves, I'd be
more inclined to think that it has something to do with a particular user, or our particular
setup.  However, the /_replicator docs on the 1.6.0 box are identical to those on the other read
slaves.  This is affecting about 150 users, and only on that one box.
> We've taken the 1.6.0 read slave out of rotation for now, so it's not an urgent issue
for us.  If anyone wants to log in and have a look around, I can grant access, but I hope
that there's enough information here to track it down.  Thanks.
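
As a rough sketch of the replication setup described above, the following builds the kind of documents one might store in `/_replicator` on a read slave. The actual docs from the report are not shown in this message, so the hub URL, the doc ids, and the choice of pull replication (remote source, local target) are assumptions. `source`, `target`, and `continuous` are the standard replication document fields. Credentials, which replicating `/_users` would normally require, are omitted.

```python
import json


def replicator_doc(doc_id, source, target):
    """Return a continuous replication document for the _replicator database."""
    return {
        "_id": doc_id,
        "source": source,
        "target": target,
        "continuous": True,
    }


# Hypothetical hub address; the real host is not given in the report.
HUB = "http://replication-hub.example:5984"

# One continuous pull replication per database, as described in the report.
docs = [
    replicator_doc("pull-" + db, HUB + "/" + db, db)
    for db in ("registry", "_users")
]

print(json.dumps(docs, indent=2))
```

Because replication is one-way and every write goes to the master, docs like these should never produce conflicts on the target by themselves, which is what makes the reported behavior surprising.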
