couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "EntityRelationship" by WoutMertens
Date Tue, 14 Apr 2009 11:43:18 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The following page has been changed by WoutMertens:
http://wiki.apache.org/couchdb/EntityRelationship

The comment on the change is:
made some references to SQL and JOIN for googleability

------------------------------------------------------------------------------
  
  This page is mostly a translation of Google's [http://code.google.com/appengine/articles/modeling.html
Modeling Entity Relationships] article in CouchDB terms. I (WoutMertens) am mostly happy with
it but it could use more code examples and more examples of actual output. Since this is a
wiki, feel free to update this document to make things clearer, fix inaccuracies etc. This
article is also related to [http://wiki.apache.org/couchdb/Transaction_model_use_cases Transaction
model use cases] discussion, as it involves multiple document updates.
  
+ As a quick summary, this document explains how to do things that you would normally use
SQL JOIN for.
  
  == Why would I need entity relationships? ==
  Imagine you are building a snazzy new web application that includes an address book where
users can store their contacts. For each contact the user stores, you want to capture the
contact's name, birthday (which they mustn't forget!), their address, telephone number and the company
they work for.
@@ -75, +76 @@

  
  If you then query this view with the ''startkey'' parameter set to "[''''''"Scott"]" and
''endkey'' set to "[''''''"Scott",{}]", you'll get the contact details in the first row and the phone
numbers in the following rows (sorted by phone_type as well). You can easily extend this system
to have other types of one-to-many attributes in the same view by giving them a different
number in the view above.
  
+ This is a little like a JOIN in SQL, except that in SQL the data fields would be joined
together on a single row, whereas here they arrive on consecutive rows. The consecutive-row approach
allows a variable number of data fields, which is more flexible than a fixed set of joined columns.
+ 
  NOTE: This needs a code example showing how to use the output of the view. Feel free to
add one.
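One possible way to use the output, sketched in Python: because the rows for a contact arrive in key order, a single pass can fold them into one record. The key layout (`[name, 0]` for the contact row, `[name, 1, phone_type]` for phone rows) is inferred from the queries on this page rather than taken from the view source, and the row values, the second phone number and the helper name `fold_contact_rows` are made up for illustration.

```python
def fold_contact_rows(rows):
    """Fold the sorted rows of the view into one contact dict.

    Assumes keys shaped like [name, 0] for the contact document and
    [name, 1, phone_type] for each phone document. This layout is an
    assumption based on the startkey/endkey examples.
    """
    contact = None
    for row in rows:
        key = row["key"]
        if key[1] == 0:
            # Contact row: start a new record with an empty phone list.
            contact = dict(row["value"])
            contact["phones"] = []
        elif key[1] == 1 and contact is not None:
            # Phone row: attach it to the contact row seen before it.
            contact["phones"].append({"type": key[2], "number": row["value"]})
    return contact


# Hypothetical rows, shaped like the "rows" array of a view response.
sample_rows = [
    {"key": ["Scott", 0], "value": {"name": "Scott"}},
    {"key": ["Scott", 1, "home"], "value": "(650) 555 - 2200"},
    {"key": ["Scott", 1, "mobile"], "value": "(650) 555 - 2201"},
]

scott = fold_contact_rows(sample_rows)
print(scott["name"], [phone["type"] for phone in scott["phones"]])
```

Other one-to-many attributes would get their own branch, keyed on whatever number the view assigns to them.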
  
  Because CouchDB always sorts on keys, you can use this view to get only Scott's home phone
numbers by querying with ''startkey'' set to "[''''''"Scott",1,"home"]" and ''endkey'' set
to "[''''''"Scott",1,"home",{}]".
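As a rough sketch of how such a range query could be assembled, the Python below JSON-encodes both keys and puts them into a query string. The helper name `view_range_query` and the database, design document and view names in the path are hypothetical, and the exact URL layout depends on the CouchDB version in use.

```python
import json
from urllib.parse import urlencode


def view_range_query(startkey, endkey):
    """Return a query string with JSON-encoded startkey and endkey."""
    return urlencode({
        "startkey": json.dumps(startkey, separators=(",", ":")),
        "endkey": json.dumps(endkey, separators=(",", ":")),
    })


# In CouchDB's view collation, objects sort after numbers and strings,
# so {} acts as an upper bound for keys starting with ["Scott", 1, "home"].
query = view_range_query(["Scott", 1, "home"], ["Scott", 1, "home", {}])

# Hypothetical path; substitute your own database and view names.
print("/contacts/_design/contacts/_view/by_name?" + query)
```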
  
- When Scott loses his phone, it's easy enough to delete that record. Just delete the PhoneNumber
instance and it can no longer be queried for:
+ When Scott loses his phone, it's easy enough to delete that record. Just delete the phone
document and it can no longer be queried for:
  {{{
  $db->doc('(650) 555 - 2200')->delete;
  }}}
  
  === One to Many: Embedded Documents ===
  
- The embedded array is only an option as long as you don't have "too many" items to store,
since each document is always handled as a whole and bigger documents mean slower handling
and slower network transfers. Phone numbers should be ok unless you plan to store the whole
company phonebook in there.
+ The embedded array is only an option as long as you don't have "too many" items to store,
since each document is always handled as a whole and bigger documents mean slower handling
and slower network transfers whenever you want to change the list. Phone numbers should be
ok unless you plan to store the whole company phonebook in there.
  
  This is the easiest way to handle one-to-many as everything you need is in one place. Here's
how the document for Scott would look:
  {{{
@@ -162, +165 @@

   * ''include_docs=true''
  You'll get all documents that are pertinent to the group, but in no particular order. The
size of your index will be smaller though.
  
- In general though, you want to avoid storing overly large lists of any kind in a single
document. This means you should place the list on side of the relationship which you expect
to have fewer values. In the example above, the Contact side was chosen because a single person
is not likely to belong to too many groups, whereas in a large contacts database, a group
might contain hundreds of members.
+ In general though, you want to avoid storing overly large lists of any kind in a single
document. The reason is that if your document becomes, say, 1 MB in size, then you need to upload
1 MB to the database every time you want to make a change to any part of the document. Therefore
you should place the list on the side of the relationship which you expect to have fewer values.
In the example above, the Contact side was chosen because a single person is not likely to
belong to too many groups, whereas in a large contacts database, a group might contain hundreds
of members.
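To illustrate the trade-off, here is a small Python emulation of a view that derives group membership from lists kept on the contact side. Adding a contact to a group then rewrites only that contact's small document. The documents, the `groups` field name and the helper `members_by_group` are assumptions for this sketch. A real view would be a map function running inside CouchDB.

```python
from collections import defaultdict

# Hypothetical contact documents; each holds its own short list of groups.
contacts = [
    {"_id": "scott", "name": "Scott", "groups": ["friends", "work"]},
    {"_id": "ann", "name": "Ann", "groups": ["work"]},
]


def members_by_group(docs):
    """Emulate a view that emits one row per (group, contact) pair."""
    index = defaultdict(list)
    for doc in docs:
        for group in doc.get("groups", []):
            index[group].append(doc["_id"])
    return dict(index)


index = members_by_group(contacts)
print(index["work"])
```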
  
  === Many to Many: Relationship documents ===
  
@@ -205, +208 @@

  
  If this is becoming a problem due to roundtrip times to the database, an acceptable solution
is to duplicate the needed information in the relationship documents. You trade the inconvenience
of maintaining multiple copies of the same data for the low access time to that data. Unless
you have extreme requirements, however, you do not need to do this.
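A minimal sketch of what such duplication could look like, in Python. The document type, the field names and the helper `make_relationship` are all hypothetical. The point is only that the relationship document carries copies of the fields needed for display, so a single view query can return them without further lookups.

```python
def make_relationship(contact, company, title):
    """Build a relationship document with copies of display fields."""
    return {
        "type": "contact_company",
        "contact_id": contact["_id"],
        "company_id": company["_id"],
        "title": title,
        # Duplicated data: these copies must be rewritten whenever the
        # original contact or company document changes.
        "contact_name": contact["name"],
        "company_name": company["name"],
    }


# Hypothetical source documents.
scott_doc = {"_id": "scott", "name": "Scott"}
acme_doc = {"_id": "acme", "name": "Acme"}

relationship = make_relationship(scott_doc, acme_doc, "Engineer")
print(relationship["contact_name"], "works at", relationship["company_name"])
```

The cost is the one named above: every rename has to be propagated to all relationship documents that hold a copy.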
  
+ Here, CouchDB differs from traditional SQL systems. With SQL you would be able to get all
the data in one go using two JOIN statements, but it would not be obvious that this can in fact
be a pretty slow operation. CouchDB only allows you to do things that scale well.
+ 
