couchdb-user mailing list archives

From Robert Newson <>
Subject Re: Couch and Varnish
Date Sat, 13 Nov 2010 18:45:46 GMT
"In any case, when we're in "every 1 request to cache means 1 request
to database" situation, "caching" is truly pointless."

Not true. Consider attachments or view query results: checking that
the cached result is still fresh is faster than redoing the work (or
copying the attachment again). It's only (almost) pointless when
fetching documents themselves.
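
A minimal sketch of the point above, assuming a toy in-memory backend rather than CouchDB itself. The names Backend and ValidatingCache are hypothetical and exist only for illustration. The cache asks the backend on every request, yet the expensive body is only rebuilt when the ETag changes:

```python
# Hypothetical stand-ins for illustration; not CouchDB or Varnish APIs.
class Backend:
    """Toy origin: comparing an ETag is cheap, building a body is not."""

    def __init__(self):
        self.etag = '"v1"'
        self.validations = 0      # every request, cheap or not
        self.full_responses = 0   # requests that had to rebuild the body

    def get(self, if_none_match=None):
        self.validations += 1
        if if_none_match == self.etag:
            return 304, self.etag, None          # "Not Modified", no body
        self.full_responses += 1
        return 200, self.etag, "expensive view result for " + self.etag


class ValidatingCache:
    """Revalidates on every request, but reuses the stored body on a 304."""

    def __init__(self, backend):
        self.backend = backend
        self.entry = None  # (etag, body) once something has been cached

    def get(self):
        etag = self.entry[0] if self.entry else None
        status, new_etag, body = self.backend.get(if_none_match=etag)
        if status == 200:
            self.entry = (new_etag, body)
        return self.entry[1]


backend = Backend()
cache = ValidatingCache(backend)
bodies = [cache.get() for _ in range(5)]   # one full response, four 304s
backend.etag = '"v2"'                      # the underlying data changes
bodies.append(cache.get())                 # forces one more full response
```

Six requests reach the backend here, but only two of them pay for the full response. For a plain document the validation itself costs about as much as the fetch, which is why the gain mostly disappears there.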

What improvement could be made here? It seems wrong to return a cached
copy of a document without checking that it is fresh, and my reading of
RFC 2616 says we mustn't.

Sent from my iPad

On 13 Nov 2010, at 16:36, "Karel Minařík" <> wrote:

> Hi,
> I am ashamed to reply so late, sorry, I got lost in other stuff on Monday. I'll combine my replies:
> On Mon, Nov 8, 2010 at 08:17, Zachary Zolton <> wrote:
>>>>>> Of course, you'd be stuck with manually tracking the types of URLs
>>>>>> purged, so I haven't been too eager to try it out yet...
> Yes, that's precisely what I'd like to avoid. It's not _that_ hard of course, and Couch provides an awesome entry point for the invalidation in _changes or update_notifier, but still...
> On 9.Nov, 2010, at 24:42 , Robert Newson wrote:
>> I think it's clear that caching via ETag for documents is close to
>> pointless (the work to find the doc in the b+tree is over 90% of the
>> work and has to be done for GET or HEAD).
> Yes. I wonder if there's any room for improvement on Couch's part. In any case, when we're in "every 1 request to cache means 1 request to database" situation, "caching" is truly pointless.
> On Mon, Nov 8, 2010 at 11:11 PM, Zachary Zolton <> wrote:
>>> That makes sense: if every request to the caching proxy checks the
>>> etag against CouchDB via a HEAD request—and CouchDB currently does
>>> just as much work for a HEAD as it would for a GET—you're not going to
>>> see an improvement.
> Yes. But that's not the only scenario imaginable. I'd repeat what I wrote to the Varnish mailing list:
> 1. The cache can "accumulate" requests to a certain resource for a certain (configurable?) period of time (1 second, 1 minute, ...) and ask the backend less often -- accelerating throughput.
> 2. The cache can return "possibly stale" content immediately and check with the backend
afterwards (on the background, when n-th next request comes, ...) -- accelerating response
> It was my impression that at least the first option is doable with Varnish (via some playing with the grace period), but I may be severely mistaken.
> On Mon, Nov 8, 2010 at 5:04 PM, Randall Leeds <> wrote:
>>>> If you have a custom caching policy whereby
>>>> the proxy will only check the ETag against the authority (Couch) once
>>>> per (hour, day, whatever) then you'll get a speedup. But if your proxy
>>>> performs a HEAD request for every incoming request you will not see
>>>> much performance gain.
> P-r-e-c-i-s-e-ly. If we can tune Varnish or Squid to not be so "dumb" and check with the backend based on some configs like this, we could use it for proper self-invalidating caching. (As opposed to TTL-based caching, which brings the manual expiration issues discussed above.) Unfortunately, at least based on the answers I got, this just does not seem to be possible.
> On Mon, Nov 8, 2010 at 12:06, Randall Leeds <> wrote
>>>>> It'd be nice if the "Couch is HTTP and can leverage existing caches and
>>>>> tools" talking point truly included significant gains from etag caching.
> P-R-E-C-I-S-E-L-Y. This is, for me, the most important, and embarrassing issue of this discussion. The O'Reilly book has it all over the place:
> Whenever you tell someone who really knows about HTTP caches "Dude, Couch is HTTP and can leverage existing caches and tools" you can and will be laughed at -- you can get away with mentioning expiration based caching and "simple" invalidation via _changes and such, but... Embarrassing still.
> I'll try to do more research in this area, when time permits. I don't believe there's _not_ some arcane Varnish config option to squeeze some performance, e.g. in the "highly concurrent requests" scenario.
> Thanks for all the replies!,
> Karel
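
The "check with the backend only once per period" policy discussed in the quoted thread (Karel's option 1, Randall's "once per hour, day, whatever") could look roughly like the sketch below. This is an illustration under stated assumptions, not Varnish or Squid configuration: PeriodicallyValidatingCache, fetch, clock and check_interval are hypothetical names, and the clock is injected so the example is deterministic.

```python
class PeriodicallyValidatingCache:
    """Serves the stored body and contacts the backend at most once per interval."""

    def __init__(self, fetch, clock, check_interval):
        self.fetch = fetch                    # callable returning the current body
        self.clock = clock                    # callable returning time in seconds
        self.check_interval = check_interval  # seconds between backend checks
        self.body = None
        self.checked_at = None

    def get(self):
        now = self.clock()
        due = self.checked_at is None or now - self.checked_at >= self.check_interval
        if due:
            # Only here does a request reach the backend.
            self.body = self.fetch()
            self.checked_at = now
        return self.body


# Hypothetical demo state: a fake clock and a counter of backend hits.
state = {"now": 0.0, "hits": 0}

def clock():
    return state["now"]

def fetch():
    state["hits"] += 1
    return "doc rev %d" % state["hits"]

cache = PeriodicallyValidatingCache(fetch, clock, check_interval=1.0)
first = [cache.get() for _ in range(1000)]  # 1000 requests inside one interval
state["now"] = 1.5                          # the interval has elapsed
later = cache.get()                         # triggers the second backend hit
```

A thousand requests cost one backend hit here. The trade-off is the one raised at the top of this message: a reader may get a copy up to check_interval seconds old without the cache having checked that it is fresh.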
