couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "EntityRelationship" by DavidSharnoff
Date Mon, 12 Jul 2010 16:21:50 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "EntityRelationship" page has been changed by DavidSharnoff.
The comment on this change is: $scott->{groups} must be an array reference so it needs
to be expanded for grep to work.
http://wiki.apache.org/couchdb/EntityRelationship?action=diff&rev1=21&rev2=22

--------------------------------------------------

  As a quick summary, this document explains how to do things that you would normally use
SQL JOIN for.
  
  == Why would I need entity relationships? ==
- Imagine you are building a snazzy new web application that includes an address book where
users can store their contacts. For each contact the user stores, you want to capture the
contacts name, birthday (which they mustn't forget!) their address, telephone number and company
they work for.
+ Imagine you are building a snazzy new web application that includes an address book where
users can store their contacts. For each contact the user stores, you want to capture the
contacts name, birthday (which they mustn't forget!) their address, telephone number and company
they work for. When the user wants to add an address, they enter the information in to a form
and the form saves the information in a model that looks something like this:
- When the user wants to add an address, they enter the information in to a form and the form
saves the information in a model that looks something like this:
  
  {{{#!highlight javascript
  {
@@ -28, +27 @@

  (Note that ''type'' doesn't mean anything to CouchDB; we're just using it here for our own
convenience. ''_id'' is the only thing CouchDB looks at.)
  
  That's great: your users immediately begin to use their address book and soon the datastore
starts to fill up. Not long after the deployment of your new application you hear from someone
that they are not happy that there is only one phone number. What if they want to store someone's
work telephone number in addition to their home number? No problem, you think; you can just
add a work phone number to your structure. You change your data structure to look more like
this:
+ 
  {{{
    "phone_number":"home phone in string form",
    "work_phone_number":"work phone in string form",
  }}}
- Update the form with the new field and you are back in business. Soon after redeploying
your application, you get a number of new complaints. When they see the new phone number field,
people start asking for even more fields. Some people want a fax number field, others want
a mobile field. Some people even want more than one mobile field (boy modern life sure is
hectic)! You could add another field for fax, and another for mobile, maybe two. What about
if people have three mobile phones? What if they have ten? What if someone invents a phone
for a place you've never thought of?
+ Update the form with the new field and you are back in business. Soon after redeploying
your application, you get a number of new complaints. When they see the new phone number field,
people start asking for even more fields. Some people want a fax number field, others want
a mobile field. Some people even want more than one mobile field (boy modern life sure is
hectic)! You could add another field for fax, and another for mobile, maybe two. What about
if people have three mobile phones? What if they have ten? What if someone invents a phone
for a place you've never thought of? Your model needs to use relationships.
- Your model needs to use relationships.
  
  == One to Many ==
  The answer is to allow users to assign as many phone numbers to each of their contacts as
they like.
  
  In CouchDB, there are two ways to achieve this.
+ 
   1. Use separate documents
-  2. Use an embedded array
+  1. Use an embedded array
  
  === One to Many: Separate documents ===
- 
  When using separate documents, you could have documents like this for the phone numbers:
+ 
  {{{#!highlight javascript
  {
    "_id":"the phone number",
@@ -58, +58 @@

  The key to making all this work is the contact property. By storing the contact id in it,
you can refer to the owning contact in a unique way, since ''_id'' fields are unique in CouchDB
databases.
  
  Creating the relationship between a contact and one of its phone numbers is easy to do.
Let's say you have a contact named "Scott" who has a home phone and a mobile phone. You populate
his contact info like this (using Perl and Net::CouchDB):
+ 
  {{{#!highlight perl
  $db->insert({type => 'contact', _id => 'Scott', name => 'My Friend Scott'});
  $db->insert({type => 'phone', _id => '(650) 555 - 2200', contact_id => 'Scott',
phone_type => 'home'});
  $db->insert({type => 'phone', _id => '(650) 555 - 2201', contact_id => 'Scott',
phone_type => 'mobile'});
  }}}
- 
  To get the contacts and their phone numbers from CouchDB in one search, you need to use
a little trick: You need to create a view that sorts the contacts and their phone numbers
in order. This is the view:
  
  {{{#!highlight javascript
@@ -77, +77 @@

    }
  }
  }}}
- 
- If you then query this view with the ''startkey'' parameter set to "[''''''"Scott"]" and
endkey "[''''''"Scott",{}]", you'll get the contact details in the first row and the phone
numbers in the following rows (sorted by phone_type as well). You can easily extend this system
to have other types of one-to-many attributes in the same view by giving them a different
number in the view above.
+ If you then query this view with the ''startkey'' parameter set to "["Scott"]" and endkey
"["Scott",{}]", you'll get the contact details in the first row and the phone numbers in the
following rows (sorted by phone_type as well). You can easily extend this system to have other
types of one-to-many attributes in the same view by giving them a different number in the
view above.
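
A minimal sketch of how such a key range could be turned into a query string. The helper name `prefixRangeQuery` is made up for illustration and is not part of CouchDB or Net::CouchDB; it only JSON-encodes and URL-encodes the two keys described above.

```javascript
// Hypothetical helper: builds the query string that selects every row
// whose key array starts with `prefix`.
function prefixRangeQuery(prefix) {
  const startkey = JSON.stringify(prefix);
  // An empty object collates after strings and numbers, so appending {}
  // to the prefix gives an upper bound for all keys that share it.
  const endkey = JSON.stringify(prefix.concat([{}]));
  return "startkey=" + encodeURIComponent(startkey) +
         "&endkey=" + encodeURIComponent(endkey);
}

const scottQuery = prefixRangeQuery(["Scott"]);
console.log(scottQuery);
console.log(decodeURIComponent(scottQuery));
```

The encoded string is what would be appended to the view URL; the decoded form is printed only to make the key range readable.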
  
  This is a little like a JOIN in SQL, except that SQL would join the data fields together
in a single row, whereas here they are on consecutive rows. The consecutive-row approach allows
a variable number of data fields, which is more flexible than SQL.
  
  NOTE: This needs a code example showing how to use the output of the view. Feel free to
add one.
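
As one possible answer to the note above, here is a sketch of folding the consecutive rows back into one object per contact. The view definition is not shown in this change, so the key layout is an assumption inferred from the query examples: `[contact_id, 0]` for the contact row and `[contact_id, 1, phone_type]` for each phone row. The helper name `collectContacts` and the sample response are illustrative only.

```javascript
// Sample data in the general shape of a CouchDB view response
// (rows with id, key and value). The key layout is assumed, see above.
const response = {
  rows: [
    { id: "Scott", key: ["Scott", 0], value: null },
    { id: "(650) 555 - 2200", key: ["Scott", 1, "home"], value: null },
    { id: "(650) 555 - 2201", key: ["Scott", 1, "mobile"], value: null }
  ]
};

// Hypothetical helper: groups contact rows and phone rows by contact id.
function collectContacts(rows) {
  const contacts = {};
  for (const row of rows) {
    const contactId = row.key[0];
    const kind = row.key[1];
    if (!contacts[contactId]) {
      contacts[contactId] = { id: contactId, phones: [] };
    }
    if (kind === 1) {
      // Phone documents use the number itself as their _id.
      contacts[contactId].phones.push({ number: row.id, type: row.key[2] });
    }
  }
  return contacts;
}

const contacts = collectContacts(response.rows);
console.log(JSON.stringify(contacts, null, 2));
```

Because the rows arrive sorted by key, the same loop also works for a response that covers several contacts at once.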
  
- Because CouchDB always sorts on keys, you can use this view to only get Scotts home phone
numbers by querying with ''startkey'' set to "[''''''"Scott",1,"home"]" and ''endkey'' set
to "[''''''"Scott",1,"home",{}]"
+ Because CouchDB always sorts on keys, you can use this view to only get Scotts home phone
numbers by querying with ''startkey'' set to "["Scott",1,"home"]" and ''endkey'' set to "["Scott",1,"home",{}]"
  
  When Scott loses his phone, it's easy enough to delete that record. Just delete the phone
document and it can no longer be queried for:
+ 
  {{{#!highlight perl
  $db->doc('(650) 555 - 2200')->delete;
  }}}
- 
  === One to Many: Embedded Documents ===
- 
  The embedded array is only an option as long as you don't have "too many" items to store,
since each document is always handled as a whole and bigger documents mean slower handling
and slower network transfers whenever you want to change the list. Phone numbers should be
ok unless you plan to store the whole company phonebook in there.
  
  This is the easiest way to handle one-to-many as everything you need is in one place. Here's
how the document for Scott would look:
+ 
  {{{#!highlight javascript
  {
    "_id":"Scott",
@@ -104, +103 @@

    "phones":[{"number":"(650) 555 - 2200","type":"home"},{"number":"(650) 555 - 2201","type":"mobile"}]
  }
  }}}
- 
  or even more succinctly
  
  {{{#!highlight javascript
@@ -115, +113 @@

    "phones":[{"home":"(650) 555 - 2200"},{"mobile":"(650) 555 - 2201"}]
  }
  }}}
- 
  Note that only the fields we actually know are stored. Also note that the phone numbers are not
simply an array; they are an array of associative hashes. We could extend this with no effort
to add email addresses, IM names, etc., even if IM names would need an extra attribute that
holds the service type. In essence, you're embedding child documents in the master document.
That is the power of schema-less databases.
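
A small sketch of extending the embedded lists on the client side. The helper `addEmbedded`, the list name `ims`, the attribute `service`, and the sample values are all made up for illustration; the modified document would still have to be saved back to CouchDB as a whole.

```javascript
// Scott's document with the first embedded layout shown above.
const scott = {
  _id: "Scott",
  phones: [
    { number: "(650) 555 - 2200", type: "home" },
    { number: "(650) 555 - 2201", type: "mobile" }
  ]
};

// Hypothetical helper: appends a child entry to an embedded list,
// creating the list when the document does not have one yet.
function addEmbedded(doc, listName, entry) {
  if (!Array.isArray(doc[listName])) {
    doc[listName] = [];
  }
  doc[listName].push(entry);
  return doc;
}

// Illustrative values only.
addEmbedded(scott, "phones", { number: "(650) 555 - 2202", type: "fax" });
addEmbedded(scott, "ims", { name: "scott-example", service: "jabber" });
console.log(JSON.stringify(scott, null, 2));
```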
  
  == Many to Many ==
  One thing you would like to do is provide the ability for people to organize their contacts
into groups. They might make groups like "Friends", "Co-workers" and "Family". This would
allow users to use these groups to perform actions en masse, such as maybe sending an invitation
to all their friends for a hack-a-thon. Let's define a simple Group model like this:
+ 
  {{{#!highlight javascript
  {
    "_id":"unique group id",
@@ -128, +126 @@

    "description":"description"
  }
  }}}
- 
  You could make a one-to-many relation with Contact. However, this would allow contacts to
be part of only one group at a time. For example, someone might include some of their co-workers
as friends. You need a way to represent many-to-many relationships.
  
  === Many to Many: List of Keys ===
  One very simple way is to create a list of keys on one side of the relationship, like we
did in the "One to Many: Embedded Documents" section.
  
  Our friend and colleague Scott would then get a new field in his contact document which
holds group ''_id'' values:
+ 
  {{{#!highlight javascript
    "groups":["Friends","Colleagues"]
  }}}
- 
  Adding and removing a user to and from a group means working with a list of keys. Suppose
we don't like Scott any more:
+ 
  {{{#!highlight perl
     my $scott = $db->doc('Scott');
-    $scott->{groups} = grep { $_ ne 'Friends' } $scott->{groups};
+    $scott->{groups} = [ grep { $_ ne 'Friends' } @{$scott->{groups}} ];
     $scott->update;
  }}}
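
For clients that are not written in Perl, the same list manipulation can be sketched in JavaScript. The helper names are hypothetical and work on a contact document already loaded into memory; saving the document back is left out.

```javascript
// Hypothetical helper: removes a group id from the contact's key list.
function removeFromGroup(contact, groupId) {
  contact.groups = (contact.groups || []).filter(function (id) {
    return id !== groupId;
  });
  return contact;
}

// Hypothetical helper: adds a group id unless it is already present.
function addToGroup(contact, groupId) {
  const groups = contact.groups || [];
  if (groups.indexOf(groupId) === -1) {
    groups.push(groupId);
  }
  contact.groups = groups;
  return contact;
}

const contact = { _id: "Scott", groups: ["Friends", "Colleagues"] };
removeFromGroup(contact, "Friends");
console.log(JSON.stringify(contact.groups));
```

As in the Perl version, the result stays an array even when only one group is left.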
- 
  To get all the members of a group, you'd create a view like this:
+ 
  {{{#!highlight javascript
  "map":function(doc) {
     if (doc.type == 'contact') {
@@ -158, +156 @@

     }
  }
  }}}
- 
  If you then query this view with search parameters
+ 
   * ''descending=true''
   * ''key="Friends"''
+ 
  then you'll get all the names of members of the group Friends and the group information
as the first row. (Hashes sort behind strings).
  
  Here's a space optimization hint: If you define the view as
+ 
  {{{#!highlight javascript
  "map":function(doc) {
     if (doc.type == 'contact') {
@@ -177, +177 @@

  }
  }}}
  and query this view with search parameters
+ 
   * ''key="Friends"''
   * ''include_docs=true''
+ 
  You'll get all documents that are pertinent to the group, but in no particular order. The
size of your index will be smaller though.
  
  For the most efficient changes to the relationship list, you should place the list on the side
of the relationship which you expect to have fewer values. In the example above, the Contact
side was chosen because a single person is not likely to belong to too many groups, whereas
in a large contacts database, a group might contain hundreds of members.
@@ -190, +192 @@

  
  A final option is to use a separate index, such as couchdb-lucene, to help with such queries.
  
- 
  === Many to Many: Relationship documents ===
- 
  Another way of implementing many-to-many is by creating a separate document for each relationship.
  
  You would use this method if you modify the key list frequently (i.e. if you get more conflicts
than is acceptable), or if the key list is so large that transferring the document is unacceptably
slow. Relationship documents enable frequent changes with less chance of conflict; however,
you can access neither the contact nor group information in one request. You must re-request
those specific documents by ID, keeping in mind that they may change or be deleted in the
interim.
  
  A document explaining that Scott is a Friend would look like
+ 
  {{{#!highlight javascript
  {
    "_id":"some unique id",
@@ -206, +207 @@

    "group_id":"Friends"
  }
  }}}
- 
- 
  If you then want to know who is in a group you'll need to use the view (fetch descending
to get the group info first)
+ 
  {{{#!highlight javascript
  "map":function(doc) {
     if (doc.type == 'relationship') {
@@ -218, +218 @@

     }
  }
  }}}
- 
  To know what groups a contact belongs to you can use
+ 
  {{{#!highlight javascript
  "map":function(doc) {
     if (doc.type == 'relationship') {
@@ -229, +229 @@

     }
  }
  }}}
- Note that this view uses key arrays to enforce sorting, just to show you the possible variations.
The disadvantage is that you can't use ''key="Scott"'' to search for Scotts groups, you need
to use ''startkey=[''''''"Scott"]&endkey=["Scott",{}]''.
+ Note that this view uses key arrays to enforce sorting, just to show you the possible variations.
The disadvantage is that you can't use ''key="Scott"'' to search for Scotts groups, you need
to use ''startkey=["Scott"]&endkey=["Scott",{}]''.
  
  Unlike the previous method, you can't use ''include_docs=true'' now to get all information
about the contacts that are in a group or the groups that a contact has. The reason is that
the original documents that were used in generating the view are not the contact or group
documents; they are the relationship documents. If you want that information, you'll have
to fetch it separately (you can use a POST request to ''/db/_all_docs'' to grab a bunch of documents
in one go).
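
A sketch of what that bulk request could look like. The helper `bulkFetchRequest` is hypothetical and only describes the request (a POST to `_all_docs` with a JSON body listing the wanted ids under `keys`); it performs no network call, and `db` is a placeholder database name.

```javascript
// Hypothetical helper: describes the bulk request for a list of document ids.
function bulkFetchRequest(db, ids) {
  // Drop duplicates so each document is requested only once.
  const keys = Array.from(new Set(ids));
  return {
    method: "POST",
    path: "/" + encodeURIComponent(db) + "/_all_docs?include_docs=true",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ keys: keys })
  };
}

// Ids collected from relationship documents (illustrative values).
const request = bulkFetchRequest("db", ["Scott", "Friends", "Scott"]);
console.log(request.method + " " + request.path);
console.log(request.body);
```

Keep in mind, as noted earlier, that the fetched documents may have changed or been deleted since the view was queried.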
  
