couchdb-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Couchdb Wiki] Update of "FUQ" by MarcelloNuccio
Date Fri, 23 Dec 2011 11:22:49 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Couchdb Wiki" for change notification.

The "FUQ" page has been changed by MarcelloNuccio:
http://wiki.apache.org/couchdb/FUQ?action=diff&rev1=6&rev2=7

Comment:
Added note on deleted documents

  <<Include(EditTheWiki)>>
  
  = Frequently Unasked Questions =
- 
  On IRC and the Mailing List, these are the Questions People should have asked to help them
stay Relaxed.
  
  == Documents ==
+  1. Why should I generate my own UUIDs?
+   . While CouchDB will generate a unique identifier for the _id field of any doc that you
create, there are three reasons why you are in most cases better off generating them yourself.
  
-  1. Why should I generate my own UUIDs?
-     While CouchDB will generate a unique identifier for the _id field of any doc that you
create, there are three reasons why you are in most cases better off generating them yourself.
+   * If for any reason you miss the success reply from CouchDB (201 Created), and storing the document is
attempted again, you would end up with the same document content stored twice under different _ids.
This could easily happen with intermediary proxies and cache systems that may not inform developers
that the failed transaction is being retried.
+   * _ids are the only value that CouchDB enforces as unique, so you might as well make
use of this.
  
+   * CouchDB stores its documents in a B+ tree. Each additional or updated document is stored
as a leaf node, and may require re-writing intermediary and parent nodes. You may be able
to take advantage of sequencing your own ids more effectively than the automatically generated
ids if you can arrange them to be sequential yourself.
-    * If for any reason you miss the success reply from CouchDB (201 Created), and storing the document
is attempted again, you would end up with the same document content stored twice under different
_ids. This could easily happen with intermediary proxies and cache systems that may not inform
developers that the failed transaction is being retried.
-    * _ids are the only value that CouchDB enforces as unique, so you might as well make
use of this.
  
-    * CouchDB stores its documents in a B+ tree. Each additional or updated document is stored
as a leaf node, and may require re-writing intermediary and parent nodes. You may be able
to take advantage of sequencing your own ids more effectively than the automatically generated
ids if you can arrange them to be sequential yourself.
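
A minimal sketch of the retry and ordering points above, in Python. The `new_doc_id` helper and the in-memory `FakeStore` are hypothetical stand-ins (not CouchDB code); the store only imitates the "one document per _id" rule so the effect of a blind retry can be seen without a server.

```python
import time
import uuid


def new_doc_id(now=None):
    """Hypothetical id scheme: 12 hex digits of millisecond time, then 20 random hex digits.

    Ids created later sort after earlier ones, which is the kind of
    sequential ordering the B+ tree bullet refers to.
    """
    seconds = time.time() if now is None else now
    millis = int(seconds * 1000)
    return f"{millis:012x}{uuid.uuid4().hex[:20]}"


class FakeStore:
    """In-memory stand-in that refuses a second create under the same _id."""

    def __init__(self):
        self.docs = {}

    def put(self, doc_id, body):
        if doc_id in self.docs:
            return 409  # conflict: the id is already taken
        self.docs[doc_id] = body
        return 201  # created


store = FakeStore()
doc_id = new_doc_id()                          # chosen by the client, before any request
first = store.put(doc_id, {"type": "order"})   # suppose this reply is lost
retry = store.put(doc_id, {"type": "order"})   # the retry reuses the same id
```

Because the client picked the id up front, the retry collides with the first write instead of silently creating a second copy.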
  
   1. What is the benefit of using the _bulk_docs API instead of PUTting single documents
to CouchDB?
+   . Aside from the HTTP overhead and roundtrip you are saving, the main advantage is that
CouchDB can handle the B+ tree updates more efficiently, decreasing rewriting of intermediary
and parent nodes, both improving speed and saving disk space.
  
-     Aside from the HTTP overhead and roundtrip you are saving, the main advantage is that
CouchDB can handle the B+ tree updates more efficiently, decreasing rewriting of intermediary
and parent nodes, both improving speed and saving disk space.
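
As a rough illustration of the single-request approach, the sketch below builds (but does not send) one POST to `_bulk_docs` carrying several documents. The host, the database name `demo`, and the document fields are assumptions made up for the example.

```python
import json
import urllib.request


def bulk_docs_request(base_url, db, docs):
    """Build one POST request that carries every document in a single body."""
    body = json.dumps({"docs": list(docs)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{db}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Three hypothetical documents with client-chosen, sequential ids.
docs = [{"_id": f"reading-{i:04d}", "value": i} for i in range(3)]
req = bulk_docs_request("http://localhost:5984", "demo", docs)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

One request replaces three PUTs here, so the server sees the whole batch at once.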
  
   1. Why can't I use MVCC in CouchDB as a revision control system for my docs?
  
+  1. Does compaction remove deleted documents’ contents?
+   . CouchDB keeps the latest revision of every document ever seen, even if that revision has '"_deleted":true'
in it. This is so that replication can ensure eventual consistency between replicas. Not only
will all replicas agree on which documents are present and which are not, but also on the contents
of both.
+ 
+   . Deleted documents specifically allow for a body to be set in the deleted revision. The
intention for this is to have a "who deleted this" type of metadata for the doc. Some client
libraries delete docs by grabbing the current object blob, adding a '"_deleted":true' member,
and then sending it back, which inadvertently (in most cases) keeps the last doc body around
after compaction.
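
The difference between the two ways of deleting can be sketched as follows. The `tombstone` helper, the sample document, and its revision string are hypothetical; the point is only which fields end up in the deleted revision.

```python
def tombstone(doc, **meta):
    """Build a deletion stub that carries only the id, the revision, and chosen metadata."""
    stub = {"_id": doc["_id"], "_rev": doc["_rev"], "_deleted": True}
    stub.update(meta)
    return stub


doc = {"_id": "invoice-17", "_rev": "1-abc", "total": 99}

# What some client libraries do: copy the whole body and add the flag.
naive = dict(doc, _deleted=True)

# A stub that keeps only "who deleted this" style metadata.
stub = tombstone(doc, deleted_by="alice")
```

Writing `naive` as the new revision would keep `total` in the deleted revision, while `stub` leaves only the metadata behind.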
+ 
  == Replication ==
- 
   1. What is the difference between PULL and PUSH replication?
   1. Why do I need to permit deleted docs in validation functions?
   1. How do compaction and purging impact replication?
  
  == Views ==
+  1.
+  In a view, why should I not {{{emit(key,doc)}}} ?
  
-  1. In a view, why should I not {{{emit(key,doc)}}} ?
+   .
    The key point here is that by emitting {{{,doc}}} you are duplicating the document which
is already present in the database (a .couch file), and including it in the results of the
view (a different .couch file, with similar structure). This is the same as having a SQL Index
that includes the original table, instead of using a foreign key.
  
    The same effect can be achieved by using {{{emit(key,null)}}} and ?include_docs=true with
the view request. This approach has the benefit of not duplicating the document data in the
view index, which reduces the disk space consumed by the view. On the other hand, the file
access pattern is slightly more expensive for CouchDB. It is usually a premature optimization
to include the document in the view. If you think you may need to emit the document,
it's best to test.
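
A small sketch of the `emit(key,null)` plus `?include_docs=true` pattern, seen from a Python client. The database `demo`, the design document `app`, the view `by_type`, and the `doc.type` field are all assumptions for illustration; the map function is shown as the JavaScript source string a design document would hold.

```python
import json
from urllib.parse import urlencode

# JavaScript map function that emits no value, only the key.
MAP_FUNCTION = "function (doc) { if (doc.type) { emit(doc.type, null); } }"


def view_url(base_url, db, design, view, key=None):
    """Build a view query URL that asks CouchDB to attach the documents itself."""
    params = {"include_docs": "true"}
    if key is not None:
        params["key"] = json.dumps(key)  # view keys are passed as JSON
    return f"{base_url}/{db}/_design/{design}/_view/{view}?{urlencode(params)}"


url = view_url("http://localhost:5984", "demo", "app", "by_type", key="order")
```

The view index then stores only keys, and each row of the response carries the document fetched from the database file at query time.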
+ 
+ 
  
   1. What happens if I don't ducktype the variables I am using in my view?
   1. Does it matter if my map function is complex, or takes a long time to run?
  
  == Tools ==
+  1.
+  I decided to roll my own !CouchApp tool or CouchDB client in <myfavouritelanguage>.
How cool is that?
  
-  1. I decided to roll my own !CouchApp tool or CouchDB client in <myfavouritelanguage>.
How cool is that?
-    Pretty cool! In fact it's a great way to get familiar with the API. However, wrappers
around the HTTP API are not necessarily of great use, as CouchDB already makes this very easy.
Mapping CouchDB semantics onto your language's native data structures is much more useful
to people. Many languages are already covered, and we'd really like to see your ideas and enhancements
incorporated into the existing tools where possible, along with your help keeping them up to date. Ask
on the mailing list about contributing!
+   . Pretty cool! In fact it's a great way to get familiar with the API. However, wrappers
around the HTTP API are not necessarily of great use, as CouchDB already makes this very easy.
Mapping CouchDB semantics onto your language's native data structures is much more useful
to people. Many languages are already covered, and we'd really like to see your ideas and enhancements
incorporated into the existing tools where possible, along with your help keeping them up to date. Ask
on the mailing list about contributing!
+ 
  
  == Log Files ==
   1. Those Erlang messages in the log are pretty confusing. What gives?
-    While the Erlang messages in the log can be confusing to someone unfamiliar with Erlang,
with practice they become very helpful. The CouchDB developers do try to catch and log messages
that might be useful to a system administrator in a friendly format, but occasionally a bug
or otherwise unexpected behavior manifests itself in more verbose dumps of Erlang server state.
These messages can be very useful to CouchDB developers. If you find many confusing messages
in your log, feel free to inquire about them. If they are expected, devs can work to ensure
that the message is more cleanly formatted. Otherwise, the messages may indicate a bug in
the code.
+   . While the Erlang messages in the log can be confusing to someone unfamiliar with Erlang,
with practice they become very helpful. The CouchDB developers do try to catch and log messages
that might be useful to a system administrator in a friendly format, but occasionally a bug
or otherwise unexpected behavior manifests itself in more verbose dumps of Erlang server state.
These messages can be very useful to CouchDB developers. If you find many confusing messages
in your log, feel free to inquire about them. If they are expected, devs can work to ensure
that the message is more cleanly formatted. Otherwise, the messages may indicate a bug in
the code.
-    In many cases, this is enough to identify the problem. For example, OS errors are reported
as tagged tuples {{{{error,enospc}}}} or {{{{error,eacces}}}}, which respectively mean "You
ran out of disk space" and "CouchDB doesn't have permission to access that resource". Most
of these errors come from the C code underlying the Erlang VM and are documented in {{{errno.h}}}
and related header files. [[http://www.ibm.com/developerworks/aix/library/au-errnovariable/|IBM]]
provides a good introduction to these, and the relevant [[http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html|POSIX]]
and [[http://www.gnu.org/s/hello/manual/libc/Error-Codes.html|GNU]] and [[http://msdn.microsoft.com/en-us/library/5814770t.aspx|Microsoft
Windows]] standards will cover most cases.
+   In many cases, this is enough to identify the problem. For example, OS errors are reported
as tagged tuples {{{{error,enospc}}}} or {{{{error,eacces}}}}, which respectively mean "You
ran out of disk space" and "CouchDB doesn't have permission to access that resource". Most
of these errors come from the C code underlying the Erlang VM and are documented in {{{errno.h}}}
and related header files. [[http://www.ibm.com/developerworks/aix/library/au-errnovariable/|IBM]]
provides a good introduction to these, and the relevant [[http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html|POSIX]]
and [[http://www.gnu.org/s/hello/manual/libc/Error-Codes.html|GNU]] and [[http://msdn.microsoft.com/en-us/library/5814770t.aspx|Microsoft
Windows]] standards will cover most cases.
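
One possible way to look such codes up without opening the header files is Python's standard `errno` module, which exposes the same names. The `explain` helper and the sample log line below are made up for illustration and assume the atom in the tuple is a lower-case POSIX error name.

```python
import errno
import os
import re

# Matches a tagged tuple such as {error,enospc} anywhere in a line.
TUPLE = re.compile(r"\{error,\s*(e[a-z0-9]+)\}")


def explain(log_line):
    """Return (NAME, description) for the first {error,<atom>} found, else None."""
    match = TUPLE.search(log_line)
    if match is None:
        return None
    name = match.group(1).upper()
    code = getattr(errno, name, None)
    if code is None:
        return (name, "not defined on this platform")
    return (name, os.strerror(code))


# Hypothetical log fragment; the description text varies by operating system.
result = explain("file write failed: {error,enospc}")
```

The description string comes from the local C library, so its exact wording differs between platforms.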
  
