couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Antony Blakey <antony.bla...@gmail.com>
Subject Re: couchdb
Date Sun, 21 Dec 2008 02:12:16 GMT
Posted to couchdb-users because this might be of general interest.

On 21/12/2008, at 11:28 AM, Tim Parkin wrote:

> Hi Anthony,
>
> in short..  I'm interested in your emails about CouchDB relations and
> _externals/update notifiers.. in longer ..
>
> I'm following CouchDB at the moment and an app I'm building works with
> 'related' data. My methodology is basically a form of view cacheing..
>
> i.e. I have a photogrpahic holiday app..
>
> I have a db where I have a few different documents, some marked as
> 'tours', some marked as 'leaders'.
>
> When I get a tour out, I'd like some information abou the tour leader.
>
> I've created a view which gives me that information (e.g. Gets first
> name and last name)
>
> I've then tied in the update process so thatwhen the leaders first  
> name
> or last name changes, I cache this view on the tour.
>
> This allows me to get a bunch of data I need for tours in one go  
> without
> getting the tour leader out (OK this is a fairly trivial example but
> some tours have three assistants and other 'remote' attributes).
>
> I'd really like to have this trigger process happen on the server but
> haven't worked out how to do it.. I've read a lot of your posts and  
> you
> seem to have approached problem(s) like this and I'd be interested in
> your recommmendation.
>
> At the moment I'm looking at triggering an _external from an update
> notification (rather like your post
> http://markmail.org/message/wikw4vggx4uew5kz?q=couchdb+_external&page=1&refer=uwkon7i337iq6yxf)

I don't recommend having the notification trigger anything but setting  
a dirty-flag. I've tried to get a response from the gatekeepers about  
the intended synchronization contract between update notifications and  
_external request, to no avail. IMO without that being synchronous,  
it's not that useful.

I do it all in the _external, which is good because it then has the  
same lazy-update characteristics as a view. A test handler follows.  
This is in ruby just because I'm doing some ruby at the moment. I use  
SQLite as the index database because it's in-process and really easy.

The result I return from the _external call is not structured like a  
view result in this code (it's just test code)

The 'src' field in the 'links' table happens to be the doc_id, so I  
can remove entries when I notice a deletion. Without that, I would  
have to include the doc_id separately in the table so that I could  
handle the deletion of documents. I cheat for this system and use a  
domain-value key (the email address of the user). I do this because I  
use a single-writer peer to get a cheap uniqueness constraint. For my  
app, read-only replication is sufficient USP.

I'm not accepting any parameters in my _external request - I return  
the entire db (it's just test code).

If you look at Paul Davis's recent post about _external you can see  
that the query parameters to the _external request are supplied in the  
per-request json, so dealing with that is trivial.

This code was written to use my own _external erlang code - I think  
the db_name and update_seq fields are nested in some other structure  
in the newly-official _external handler, and you'll need a HEAD  
version of couchdb to make this work in any case, because you need the  
update_seq.

Anyway, here is some code

---------------------------------------------------------------------

require 'rubygems'
require 'json'
require 'couchrest'
require 'cgi'
require 'sqlite3'

# Need a command line parameter in the .ini rather than hardcoding this
db = SQLite3::Database.new("/usr/local/var/lib/couchdb/bims-users- 
index.db")

sql = <<SQL
   create table if not exists seqnums (db TEXT PRIMARY KEY, seq_num  
INTEGER);
   create table if not exists links (db TEXT, src TEXT, dest TEXT,  
PRIMARY KEY (db, src, dest));
SQL

db.execute_batch(sql)

seq_nos = {}
db.execute("SELECT db, seq_num FROM seqnums") do | row |
   seq_nos[row[0]] = row[1].to_i
end

while line = gets

   data = JSON.parse(line)
   db_name = data["db_name"]
   seq = data["update_seq"].to_i
   current_seq = seq_nos[db_name] || 0

   if current_seq < seq

     STDERR.puts 'Start index update ' + db_name

     current_seq.step(seq, 500) do | offset |

       result = CouchRest::get('http://localhost:5984/' + db_name + '/ 
_all_docs_by_seq?include_docs=true&startkey=' + offset.to_s +  
"&endkey=" + (offset + 500).to_s)

       result["rows"].each |row|
         doc_id = row["id"]
         doc = row["doc"]
         STDERR.puts 'Process ' + doc_id
         if obj["value"]["deleted"]
           db.execute("delete from links where db = ? and src = ?",  
db_name, doc_id);
         elsif doc["Type"] == "user"
           doc["Roles"] && doc["Roles"].each do |r|
             db.execute("replace into links values (?, ?, ?)",  
db_name, doc_id, r);
           end
         elsif doc["Type"] == "role"
           doc["Permissions"] && doc["Permissions"].each |p|
             db.execute("replace into links values (?, ?, ?)",  
db_name, doc_id, p);
           end
         end
       end
     end

     seq_nos[db_name] = seq
     db.execute("replace into seqnums values (?, ?)", db_name, seq);
     STDERR.puts 'Update index of database ' + db_name + ' to seq# ' +  
seq.to_s

   end

   rows = db.execute(" SELECT DISTINCT t1.src, t2.dest
                FROM links as t1, links as t2
                WHERE t1.dest = t2.src AND t1.db = ? AND t2.db = ?",  
db_name, db_name)

   result = { "code" => 200, "json" => rows }

   STDOUT.puts result.to_json
   STDOUT.flush

end

---------------------------------------------------------------------

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

It is no measure of health to be well adjusted to a profoundly sick  
society.
   -- Jiddu Krishnamurti



Mime
View raw message