From: Karel Minařík
To: user@couchdb.apache.org
Subject: Re: Couch and Varnish
Date: Sat, 13 Nov 2010 17:36:01 +0100

Hi,

I am ashamed to reply so late; sorry, I got lost in other stuff on
Monday. I'll combine my replies:

On Mon, Nov 8, 2010 at 08:17, Zachary Zolton wrote:
>>>>> Of course, you'd be stuck with manually tracking the types of URLs to
>>>>> purged, so I haven't been too eager to try it out yet...

Yes, that's precisely what I'd like to avoid. It's not _that_ hard of
course, and Couch provides an awesome entry point for the invalidation in
_changes or update_notifier, but still...

On 9.Nov, 2010, at 24:42 , Robert Newson wrote:
> I think it's clear that caching via ETag for documents is close to
> pointless (the work to find the doc in the b+tree is over 90% of the
> work and has to be done for GET or HEAD).

Yes. I wonder if there's any room for improvement on Couch's part. In
any case, when we're in an "every 1 request to cache means 1 request to
database" situation, "caching" is truly pointless.

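To make the "manually tracking the types of URLs" burden concrete, here is a minimal Python sketch of turning a _changes response into a list of cache URLs to purge. The function name `urls_to_purge` and the hand-maintained `dependent_urls` table are hypothetical; only the `results` / `id` shape of the _changes response is taken from CouchDB itself. Actually sending the purge requests to the proxy is left out.

```python
from typing import Dict, List
from urllib.parse import quote


def urls_to_purge(changes: dict, db: str,
                  dependent_urls: Dict[str, List[str]]) -> List[str]:
    """Map a parsed _changes response to the cache URLs that went stale.

    `dependent_urls` is a hypothetical, hand-maintained table from document
    id to extra URLs (views, lists, shows) that embed that document. Keeping
    it up to date is exactly the manual tracking discussed in this thread.
    """
    seen = set()
    urls: List[str] = []
    for row in changes.get("results", []):
        doc_id = row["id"]
        candidates = [f"/{db}/{quote(doc_id, safe='')}"]
        candidates.extend(dependent_urls.get(doc_id, []))
        for url in candidates:
            if url not in seen:  # purge each URL once, keep first-seen order
                seen.add(url)
                urls.append(url)
    return urls
```

The document URL itself is cheap to derive; everything in `dependent_urls` has to be known in advance, which is why this approach gets tedious once views are involved.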
On Mon, Nov 8, 2010 at 11:11 PM, Zachary Zolton wrote:
>> That makes sense: if every request to the caching proxy checks the
>> etag against CouchDB via a HEAD request -- and CouchDB currently does
>> just as much work for a HEAD as it would for a GET -- you're not going to
>> see an improvement.

Yes. But that's not the only scenario imaginable. I'd repeat what I
wrote to the Varnish mailing list
[http://lists.varnish-cache.org/pipermail/varnish-misc/2010-November/004993.html]:

1. The cache can "accumulate" requests to a certain resource for a
   certain (configurable?) period of time (1 second, 1 minute, ...) and
   ask the backend less often -- accelerating throughput.

2. The cache can return "possibly stale" content immediately and check
   with the backend afterwards (in the background, when the n-th next
   request comes, ...) -- accelerating response time.

It was my impression that at least the first option is doable with
Varnish (via some playing with the grace period), but I may be
severely mistaken.

On Mon, Nov 8, 2010 at 5:04 PM, Randall Leeds wrote:
>>> If you have a custom caching policy whereby
>>> the proxy will only check the ETag against the authority (Couch) once
>>> per (hour, day, whatever) then you'll get a speedup. But if your proxy
>>> performs a HEAD request for every incoming request you will not see
>>> much performance gain.

P-r-e-c-i-s-e-ly. If we can tune Varnish or Squid to not be so "dumb"
and check with the backend based on some configs like this, we could
use it for proper self-invalidating caching. (As opposed to TTL-based
caching, which brings the manual expiration issues discussed above.)
Unfortunately, at least based on the answers I got, this just does not
seem to be possible.

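The "check with the backend only once per interval" policy discussed above can be sketched outside of any particular proxy. This is a toy in-process cache in Python, not Varnish or Squid configuration; the class name, the `fetch` callback signature, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Hypothetical backend callback: fetch(path, etag) -> (status, etag, body).
# A status of 304 means "not modified"; the body is then None.
Fetch = Callable[[str, Optional[str]], Tuple[int, str, Optional[bytes]]]


class IntervalRevalidatingCache:
    """Serve cached bodies; revalidate each path at most once per interval."""

    def __init__(self, fetch: Fetch, interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._interval = interval
        self._clock = clock
        # path -> (etag, body, time of the last backend check)
        self._entries: Dict[str, Tuple[str, bytes, float]] = {}

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None:
            etag, body, checked_at = entry
            if now - checked_at < self._interval:
                return body  # recently checked: no backend request at all
            # Interval elapsed: one conditional request carrying the ETag.
            status, new_etag, new_body = self._fetch(path, etag)
            if status == 304:
                self._entries[path] = (etag, body, now)
                return body
        else:
            status, new_etag, new_body = self._fetch(path, None)
        if new_body is None:
            raise ValueError("backend returned no body for " + path)
        self._entries[path] = (new_etag, new_body, now)
        return new_body
```

Within the interval the cached body may be stale, which is the price of not asking the backend on every request; the ETag only saves transferring the body when the interval has elapsed and nothing changed.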
On Mon, Nov 8, 2010 at 12:06, Randall Leeds wrote:
>>>> It'd be nice if the "Couch is HTTP and can leverage existing caches and tools"
>>>> talking point truly included significant gains from etag caching.

P-R-E-C-I-S-E-L-Y. This is, for me, the most important, and most
embarrassing, issue of this discussion. The O'Reilly book has it all
over the place:
http://www.google.com/search?q=varnish+OR+squid+site:http://guide.couchdb.org
. Whenever you tell someone who really knows about HTTP caches "Dude,
Couch is HTTP and can leverage existing caches and tools" you can and
will be laughed at -- you can get away with mentioning expiration-based
caching and "simple" invalidation via _changes and such, but...
Embarrassing still.

I'll try to do more research in this area, when time permits. I don't
believe there's _not_ some arcane Varnish config option to squeeze out
some performance, e.g. in the "highly concurrent requests" scenario.

Thanks for all the replies!

Karel