incubator-couchdb-user mailing list archives

From Patrick Barnes <mrtr...@gmail.com>
Subject Re: Managing synchronisation with an external data source
Date Wed, 25 Nov 2009 21:38:07 GMT
Thanks, will implement something like that.

-Patrick

On 25/11/2009 7:46 PM, "Brian Candler" <B.Candler@pobox.com> wrote:

On Mon, Nov 23, 2009 at 01:10:26PM +1100, Patrick Barnes wrote:
> The external data is delivered as ...
Sounds like you need a merge. Taking users as an example:

- have a CouchDB view which emits users keyed by username
- sort the incoming feed by username, in the same order the view uses
- take the first record from the view and the first record from the feed

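Then repeat the following until both the view and the feed are exhausted
(once one side runs out, every record left on the other side counts as the
smaller username):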
- if they have identical usernames, skip to next in both view and feed
- if the view username < feed username, mark view record as 'inactive'
 and advance to next view record
- if the view username > feed username, create a new user in database
 and advance to next feed record
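
The steps above can be sketched in Python as a merge over two sorted
streams. This is only an illustration: the function name `merge_sync` and
the action labels are made up here, and it assumes both inputs are already
sorted in the order that Python's `<` uses for strings, which may not match
the order a CouchDB view returns for every username.

```python
def merge_sync(view_names, feed_names):
    """Compare two username streams that are sorted in the same order.

    Yields ("deactivate", name) for names found only in the view and
    ("create", name) for names found only in the feed. Names present in
    both streams produce no action.
    """
    view_iter, feed_iter = iter(view_names), iter(feed_names)
    done = object()  # sentinel marking an exhausted stream
    v, f = next(view_iter, done), next(feed_iter, done)
    while v is not done or f is not done:
        if f is done or (v is not done and v < f):
            # In the database but no longer in the feed.
            yield ("deactivate", v)
            v = next(view_iter, done)
        elif v is done or v > f:
            # In the feed but not yet in the database.
            yield ("create", f)
            f = next(feed_iter, done)
        else:
            # Same username on both sides: nothing to do.
            v, f = next(view_iter, done), next(feed_iter, done)


actions = list(merge_sync(["alice", "bob", "dave"],
                          ["bob", "carol", "dave", "erin"]))
```

Only one record from each side is held at a time, so memory use does not
depend on how many users there are.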

This solution uses constant RAM no matter how many users there are. Even
though a CouchDB view returns a single JSON object, you can "stream" it
easily because each row within it sits on its own line.
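
A rough sketch of reading the view one line at a time follows. It relies on
the line-per-row layout described above rather than on any documented
guarantee, and the helper name `iter_view_rows` and the sample lines are
invented for illustration. It expects text lines (str), for example from a
response that has already been decoded.

```python
import json


def iter_view_rows(lines):
    """Yield one parsed row per line of a view response.

    Lines that are not a complete row (the opening line ending in
    '"rows":[' and the closing ']}') are skipped.
    """
    for line in lines:
        text = line.strip().rstrip(",")  # rows are separated by trailing commas
        if not text.startswith("{"):
            continue
        try:
            row = json.loads(text)
        except json.JSONDecodeError:
            continue  # the opening line is not valid JSON on its own
        if isinstance(row, dict) and "key" in row:
            yield row


sample = [
    '{"total_rows":2,"offset":0,"rows":[\n',
    '{"id":"u1","key":"alice","value":null},\n',
    '{"id":"u2","key":"bob","value":null}\n',
    ']}\n',
]
keys = [row["key"] for row in iter_view_rows(sample)]
```

Because this is a generator, its output can be fed straight into a merge
loop without ever holding the whole view in memory.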

OTOH, if your 200K users can be stored in an 'acceptable' amount of memory,
and you don't expect it to grow much larger, you could just read the whole
lot into RAM and process it there. At 1K per user you'd use 200MB of RAM,
which might be acceptable.
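
For comparison, a minimal in-memory version could use set differences. The
function name `diff_in_memory` is hypothetical, and the sketch assumes every
username from both sources fits in RAM at once.

```python
def diff_in_memory(view_names, feed_names):
    """Return (to_deactivate, to_create) by loading every name into RAM."""
    existing = set(view_names)   # usernames currently in the database
    incoming = set(feed_names)   # usernames present in the feed
    to_deactivate = sorted(existing - incoming)
    to_create = sorted(incoming - existing)
    return to_deactivate, to_create


to_deactivate, to_create = diff_in_memory(
    ["alice", "bob", "dave"],
    ["bob", "carol", "dave", "erin"],
)
```

Neither input needs to be sorted here, at the cost of memory that grows with
the number of users.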
