incubator-couchdb-user mailing list archives

From Dave Cottlehuber <...@jsonified.com>
Subject Re: Exist test?
Date Tue, 06 Nov 2012 08:15:50 GMT
On 5 November 2012 19:22, Kevin Burton <rkevinburton@charter.net> wrote:
> I am calling CreateDocument<Document>() but I suspect that testing if the document
> exists first may perform better in the long run. I am using DreamSeat for my driver but I
> suspect other drivers have a similar "test". My problem is that I don't know what to test
> for and I am unfamiliar with the available methods. Any one successfully use such a pattern
> (preferably with DreamSeat) that tests for existence then creates if the document doesn't
> exist? Keep in mind I don't initially have an id. Thank you.
>

Hi Kevin,

A number of folk have said "read the guide first" and it's sound
advice http://guide.couchdb.org/draft/index.html as you're stuck with
conceptual stuff that's well covered in the book. I recommend skipping
the sofa chapter (it was written some time ago).

Secondly, I recommend having a play at first with pure HTTP i.e. curl
or similar. This is simply so you get a real feel for how your data is
actually stored and manipulated, before layering an abstraction on
top. It *will* save you time in the short run, and it's not scary. I
learned a huge amount about HTTP itself along the way and I'm
definitely not done yet. So it's all good. You can also watch the HTTP
calls in & out of futon using Chrome Debugger or some other proxy like
CharlesProxy in between.

I'm assuming you are using Windows (yay!) so
http://wiki.apache.org/couchdb/Quirks_on_Windows has some tips on
using curl successfully, and I'm happy to help out if you get stuck.
Let me know if anything is not clear or out of date.

If you're initially bulk uploading data, I would do 3 things
differently to what you're currently doing.

1. assign the document IDs (_id) myself
The _id is the only enforced unique indexed attribute in a DB, so use
it well. Put something you want in it. It's basically free text **
within reason.

2. insert them in sorted UUID order
CouchDB is a database and sorting matters. Couch uses a B~tree ** and
so if you insert randomly you spend a lot of time forcing the re-write
of intermediate nodes for no gain. As Couch is an append-only
datastore this means several things -
- wasted space until you compact
- slower insert performance as you have multiple writes instead of one
http://horicky.blogspot.co.at/2008/10/couchdb-implementation.html

3. try inserting the first few docs by hand with curl. Then read up on
the _bulk_docs API, which is much, much faster than inserting docs one
at a time.
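
To make the three points above concrete, here is a rough Python sketch,
not a definitive implementation. It assumes a CouchDB at
http://localhost:5984 and a database called "mydb"; the "customer:"
prefix, the "email" field and all function names are made up for
illustration. It derives the _id from the data, sorts the docs by _id
(plain string order, which should line up with Couch's ordering for
simple ASCII ids), and builds a single _bulk_docs request.

```python
import json
import urllib.request

COUCH_URL = "http://localhost:5984"  # assumption: local CouchDB instance
DB_NAME = "mydb"                     # assumption: hypothetical database name


def make_doc_id(record):
    """Derive a meaningful, unique _id from the data (hypothetical scheme)."""
    return "customer:{}".format(record["email"].lower())


def build_bulk_payload(records):
    """Attach an _id to every record and return a _bulk_docs body sorted by _id."""
    docs = [dict(record, _id=make_doc_id(record)) for record in records]
    docs.sort(key=lambda doc: doc["_id"])  # insert in sorted id order
    return {"docs": docs}


def post_bulk(payload):
    """POST the payload to /{db}/_bulk_docs and return the per-document results."""
    request = urllib.request.Request(
        "{}/{}/_bulk_docs".format(COUCH_URL, DB_NAME),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Sample records (made up); building the payload needs no running server.
records = [
    {"email": "Zoe@example.com", "name": "Zoe"},
    {"email": "adam@example.com", "name": "Adam"},
]
payload = build_bulk_payload(records)
```

Calling post_bulk(payload) against a running server returns one result
per doc. A doc whose _id is already taken comes back with
"error": "conflict" instead of being created, so picking your own ids
lets the database answer the "does it exist?" question for you.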

Re your drivers, there are several but I personally don't use any of
them. The more popular ones (based on my dodgy recollection) are
listed here: http://wiki.apache.org/couchdb/Related_Projects
Hopefully some of the other Windows folk will pipe up.

A+
Dave

** handwavey
