From user-return-4708-apmail-couchdb-user-archive=couchdb.apache.org@couchdb.apache.org Mon May 04 18:24:09 2009
Mailing-List: contact user-help@couchdb.apache.org; run by ezmlm
Reply-To: user@couchdb.apache.org
Date: Mon, 4 May 2009 11:23:37 -0700
Subject: Re: Insert performance
From: Chris Anderson
To: user@couchdb.apache.org

On Mon, May 4, 2009 at 10:20 AM, Tom Nichols wrote:
> well, if I set "batch" to true, all of my load scripts die after a
> short amount of time with this error:
>
> /var/lib/gems/1.8/gems/couchrest-0.24/lib/couchrest/monkeypatches.rb:41:in
> `rbuf_fill': uninitialized constant Timeout::TimeoutError (NameError)
>        from /usr/lib/ruby/1.8/net/protocol.rb:116:in `readuntil'
>        from /usr/lib/ruby/1.8/net/protocol.rb:126:in `readline'
>        from /usr/lib/ruby/1.8/net/http.rb:2020:in `read_status_line'
>
> Regardless, it still seems like there is a bottleneck on the server
> end.  Did I mention I'm running the 'load' scripts locally?  So it's
> not network latency that is causing the slowness.  Any other ideas?
>

You're probably best off using explicit bulk_docs saves with an array
of documents. That way you know how much you are passing to CouchDB at
a time. With smallish docs (less than a few kb) you can usually do
around 1000 at a time to get the best insert performance.

> Thanks.
> -Tom
>
>
> On Mon, May 4, 2009 at 12:19 PM, Zachary Zolton
> wrote:
>> Yeah, the optional second argument (for using bulk save semantics)
>> defaults to false.
>>
>> Also, there's an option where you can set how many documents to batch
>> save at a time. I don't remember the default, but I've had good luck
>> saving with anywhere between 500 and 2000 docs.
>>
>> On Mon, May 4, 2009 at 11:13 AM, Tom Nichols wrote:
>>> Thanks.  I'm using save_doc, I just need to pass 'true' as a second argument?
>>>
>>> I posted the question here because I assumed the performance
>>> bottleneck was on the CouchDB end, not my ruby script.  Am I wrong? I
>>> assumed if I was running 20 "slow" ruby scripts they would peg the
>>> CPU.  The fact that I'm not seeing that makes me think there is some
>>> blocking/synchronization that is making the CouchDB server slow....?
>>>
>>> Thanks again.
>>> -Tom
>>>
>>> On Mon, May 4, 2009 at 11:58 AM, Zachary Zolton
>>> wrote:
>>>> Short answer: use db.save_doc(hash, true) for bulk_docs behavior.
>>>>
>>>> Also, consider moving this thread to the CouchRest Google Group:
>>>> http://groups.google.com/group/couchrest/topics
>>>>
>>>> Cheers,
>>>> zdzolton
>>>>
>>>> On Mon, May 4, 2009 at 10:40 AM, Tom Nichols wrote:
>>>>> Hi, I have some questions about insert performance.
>>>>>
>>>>> I have a single CouchDB 0.9.0 node running on a small EC2 instance.  I
>>>>> attached a huge EBS volume to it and mounted it where CouchDB's data
>>>>> files are stored.  I fired up about ruby scripts running inserts and
>>>>> after a weekend I only have about 30GB / 12M rows of data...  Which
>>>>> seems small.  'top' tells me that my CPU is only about 30% utilized.
>>>>>
>>>>> Any idea what I might be doing wrong?
>>>>> I pretty much just followed these instructions:
>>>>> http://wiki.apache.org/couchdb/Getting_started_with_Amazon_EC2
>>>>>
>>>>> My ruby script looks like this:
>>>>> #!/usr/bin/env ruby
>>>>> #Script to load random data into CouchDB
>>>>>
>>>>> require 'rubygems'
>>>>> require 'couchrest'
>>>>>
>>>>> db = CouchRest.database! "http://127.0.0.1:5984/#{ARGV[0]}"
>>>>> puts "Created database: #{ARGV[0]}"
>>>>>
>>>>> max = 9999999999999999
>>>>> while 1
>>>>>        puts 'loading...'
>>>>>        for val in 0..max
>>>>>                db.save_doc({ :key => val, 'val one' => "val ${val}",
>>>>> 'val2' => "#{ARGV[1]} #{val}" })
>>>>>        end
>>>>> end
>>>>>
>>>>>
>>>>> Thanks in advance...
>>>>>
>>>>
>>>
>>
>
--
Chris Anderson
http://jchrisa.net
http://couch.io
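
For illustration, here is a minimal sketch of the explicit bulk_docs approach suggested in the reply: group documents into arrays of about 1000 and send each array in one request. It is not CouchRest code. It assumes plain Ruby standard library calls (Ruby 2.4 or later for Net::HTTP.post) and CouchDB's _bulk_docs endpoint, which accepts a JSON body of the form {"docs": [...]}. The method names (bulk_docs_body, each_batch, post_batch, load_random_docs) and the default batch size are hypothetical choices for this example, and the batch size should be tuned to the document size.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build the JSON request body for one _bulk_docs call.
def bulk_docs_body(docs)
  JSON.generate('docs' => docs)
end

# Yield arrays of generated documents, batch_size at a time.
# The document shape loosely follows the load script quoted in the thread
# (with Ruby's #{...} interpolation for the 'val one' field).
def each_batch(label, count, batch_size = 1000)
  (0...count).each_slice(batch_size) do |vals|
    docs = vals.map do |val|
      { 'key' => val, 'val one' => "val #{val}", 'val2' => "#{label} #{val}" }
    end
    yield docs
  end
end

# POST one batch to <db_url>/_bulk_docs and return the HTTP response.
def post_batch(db_url, docs)
  uri = URI("#{db_url}/_bulk_docs")
  Net::HTTP.post(uri, bulk_docs_body(docs), 'Content-Type' => 'application/json')
end

# Load `count` generated documents, one HTTP request per batch.
def load_random_docs(db_url, label, count)
  each_batch(label, count) do |docs|
    res = post_batch(db_url, docs)
    puts "#{docs.size} docs -> HTTP #{res.code}"
  end
end
```

Calling load_random_docs('http://127.0.0.1:5984/mydb', 'run1', 10_000) would need a running CouchDB and an existing database; 'mydb' and 'run1' are placeholders. Sending one request per batch rather than one per document is the point of the sketch: the caller decides exactly how much goes to CouchDB at a time.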