couchdb-user mailing list archives

From Jim Klo
Subject Re: Scaling with filtered replication
Date Tue, 09 Jul 2013 16:01:40 GMT
Not sure about recent builds, but older builds (pre-1.2) had a weird timeout problem when
too much time passed between documents that made it through the filter, because a long range
of documents in between was being filtered out. I think this was the heartbeat problem.

I.e., we had about 800k docs, and in some ranges filtering would eliminate about 100k docs in a row.
If it took too long to process those 100k docs, replication would stall or time out. FWIW, this
was different from a "MapReduce taking too long" error.
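
To make the failure mode concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not CouchDB's replicator logic: the function names, the per-document cost, and the 30-second timeout are all assumptions chosen for illustration.

```python
from itertools import groupby


def longest_silent_gap_ms(passes_filter, ms_per_doc):
    """Longest stretch, in milliseconds, during which no document is emitted.

    passes_filter is a sequence of booleans in feed order: True means the
    document passes the filter, False means it is filtered out.
    """
    longest_run = max(
        (sum(1 for _ in run) for passed, run in groupby(passes_filter) if not passed),
        default=0,
    )
    return longest_run * ms_per_doc


def would_stall(passes_filter, ms_per_doc, timeout_ms):
    """True if the longest silent stretch exceeds the (assumed) timeout."""
    return longest_silent_gap_ms(passes_filter, ms_per_doc) > timeout_ms


# Hypothetical numbers: 100k consecutive filtered-out docs at 1 ms each,
# against an assumed 30 s timeout.
feed = [True] * 1000 + [False] * 100_000 + [True] * 1000
gap = longest_silent_gap_ms(feed, ms_per_doc=1)
stalls = would_stall(feed, ms_per_doc=1, timeout_ms=30_000)
```

In this model the stall depends on the longest consecutive run of filtered-out docs rather than on the total number filtered, which is consistent with a data design change being enough to minimize the problem.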

I believe this is fixed, though. Others could confirm, since we modified our data design to
minimize this problem.

Jim Klo
Senior Software Engineer
SRI International
t: @nsomnac

On Jul 9, 2013, at 8:37 AM, "Bill Foshay" wrote:

I was reading somewhere recently that filtered replication with couchdb
doesn't scale well and I was wondering if someone could verify whether this
is true and, if so, whether there is a better way for us to architect our backend.
Our company currently has a central couchdb on Iris Couch that houses all of
our clients' data. Each of our clients also has its own couchdb, which
replicates with this central db. The client dbs pull with a persistent
filtered replication (so that they only pull their domain data, and they only
pull the last two weeks' worth of report data). They also have a persistent
push replication set up to the central db. While the central db contains all
domain and historical data, the individual client dbs only contain their
domain data and the last two weeks' worth of report-related data. Each client
generates about 5GB of data per year (roughly 100000 docs and 300000 doc
updates). We only have a few clients at this point, so we haven't really
noticed any problems, but if this design is going to have problems scaling, I'd
rather hold off on sales and make changes now. Is there a flaw in this
approach or a better way to do things? I appreciate any help or advice!
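
For readers following along, the setup described above might look roughly like the Python sketch below. It builds replication documents using the `source`, `target`, `continuous`, `filter`, and `query_params` fields that CouchDB replication accepts. Everything else is an assumption made for illustration: the URLs, the `app/by_client` filter name, and the `domain`, `type`, and `date` document fields. CouchDB filter functions are written in JavaScript inside a design document, so `passes_client_filter` only mirrors the described logic in Python.

```python
from datetime import date, timedelta


def passes_client_filter(doc, domain, today, window_days=14):
    """Python mirror of the described filter: own domain only, and for
    report docs only the last two weeks. Field names are hypothetical."""
    if doc.get("domain") != domain:
        return False
    if doc.get("type") == "report":
        cutoff = today - timedelta(days=window_days)
        return date.fromisoformat(doc["date"]) >= cutoff
    return True


def pull_replication_doc(central_url, client_url, domain):
    """Persistent filtered pull: central db -> client db."""
    return {
        "source": central_url,
        "target": client_url,
        "continuous": True,
        "filter": "app/by_client",  # hypothetical design doc / filter name
        "query_params": {"domain": domain},
    }


def push_replication_doc(client_url, central_url):
    """Persistent unfiltered push: client db -> central db."""
    return {"source": client_url, "target": central_url, "continuous": True}


# Hypothetical URLs and domain name.
central = "https://central.example.com/maindb"
client = "http://localhost:5984/clientdb"
pull_doc = pull_replication_doc(central, client, "acme")
push_doc = push_replication_doc(client, central)
```

Note that in this design the central db has to run the filter against every change for every client's pull, so the filtering work grows with both the number of clients and the total update volume.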

