couchdb-user mailing list archives

From Karel Minařík <karel.mina...@gmail.com>
Subject Re: Couch and Varnish
Date Sat, 13 Nov 2010 16:36:01 GMT
Hi,

Sorry for replying so late; I got lost in other stuff on Monday.
I'll combine my replies:

On Mon, Nov 8, 2010 at 08:17, Zachary Zolton  
<zachary.zolton@gmail.com> wrote:
>>>>> Of course, you'd be stuck with manually tracking the types of URLs to
>>>>> purged, so I haven't been too eager to try it out yet...

Yes, that's precisely what I'd like to avoid. It's not _that_ hard, of
course, and Couch provides an awesome entry point for the invalidation in
_changes or update_notifier, but still...
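
To make the "manual tracking" burden concrete, here is a minimal sketch (not something from this thread) of turning a `_changes` response into a list of cache URLs to invalidate. The function name, the `mydb` database name and the `extra_urls_for` hook are hypothetical; the hook stands in for the hand-maintained knowledge of which view or list URLs depend on a given document. Actually sending the purge requests is left out, since that depends on how the cache is configured.

```python
import json
from urllib.parse import quote


def urls_to_purge(changes_body, db="mydb", extra_urls_for=None):
    """Map a CouchDB _changes response body to cache URLs to invalidate.

    `db` and `extra_urls_for` are illustrative assumptions: the hook
    returns any additional URLs (views, lists, ...) that depend on a
    document id, which is exactly the part that has to be tracked by hand.
    """
    if extra_urls_for is None:
        extra_urls_for = lambda doc_id: []
    urls = []
    for row in json.loads(changes_body)["results"]:
        doc_id = row["id"]
        # The document's own URL; the id is escaped so "/" stays in one segment.
        urls.append(f"/{db}/{quote(doc_id, safe='')}")
        urls.extend(extra_urls_for(doc_id))
    return urls
```

The document URL is the easy part; everything returned by the hook is what has to be kept in sync with the application by hand.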

On 9.Nov, 2010, at 24:42 , Robert Newson wrote:
> I think it's clear that caching via ETag for documents is close to
> pointless (the work to find the doc in the b+tree is over 90% of the
> work and has to be done for GET or HEAD).

Yes. I wonder if there's any room for improvement on Couch's part. In
any case, when we're in an "every request to the cache means one request
to the database" situation, "caching" is truly pointless.

On Mon, Nov 8, 2010 at 11:11 PM, Zachary Zolton
<zachary.zolton@gmail.com> wrote:
>> That makes sense: if every request to the caching proxy checks the
>> etag against CouchDB via a HEAD request—and CouchDB currently does
>> just as much work for a HEAD as it would for a GET—you're not going
>> to see an improvement.

Yes. But that's not the only scenario imaginable. I'd repeat what I
wrote to the Varnish mailing list
[http://lists.varnish-cache.org/pipermail/varnish-misc/2010-November/004993.html]:
1. The cache can "accumulate" requests to a certain resource for a
certain (configurable?) period of time (1 second, 1 minute, ...) and
ask the backend less often -- accelerating throughput.
2. The cache can return "possibly stale" content immediately and check
with the backend afterwards (in the background, when the n-th next
request comes, ...) -- accelerating response time.

It was my impression that at least the first option is doable with
Varnish (via some playing with the grace period), but I may be
severely mistaken.
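
As a rough illustration of the two options, here is a toy in-process model in Python. It is a sketch of the idea only, not Varnish configuration and not how Varnish implements grace; the class name, the `ttl`/`grace` parameters and the `fetch` callable are all assumptions made for the example.

```python
import time


class GraceCache:
    """Toy cache: fresh for `ttl` seconds, then served stale for `grace` seconds."""

    def __init__(self, fetch, ttl=1.0, grace=60.0, clock=time.monotonic):
        self.fetch = fetch      # hypothetical callable: url -> response body
        self.ttl = ttl
        self.grace = grace
        self.clock = clock
        self.entries = {}       # url -> (body, fetched_at)
        self.pending = set()    # urls waiting for a background refresh

    def get(self, url):
        entry = self.entries.get(url)
        if entry is not None:
            body, fetched_at = entry
            age = self.clock() - fetched_at
            if age < self.ttl:
                # Option 1: requests inside the window never reach the backend.
                return body
            if age < self.ttl + self.grace:
                # Option 2: answer immediately with stale content, refresh later.
                self.pending.add(url)
                return body
        return self._refresh(url)

    def _refresh(self, url):
        body = self.fetch(url)
        self.entries[url] = (body, self.clock())
        self.pending.discard(url)
        return body

    def refresh_pending(self):
        """Would be run from a background worker in a real setup."""
        for url in list(self.pending):
            self._refresh(url)
```

With `ttl=1.0`, any number of requests arriving within one second cost the backend a single fetch; during the grace window the client never waits for the backend at all.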

On Mon, Nov 8, 2010 at 5:04 PM, Randall Leeds  
<randall.leeds@gmail.com> wrote:
>>> If you have a custom caching policy whereby
>>> the proxy will only check the ETag against the authority (Couch)
>>> once per (hour, day, whatever) then you'll get a speedup. But if
>>> your proxy performs a HEAD request for every incoming request you
>>> will not see much performance gain.

P-r-e-c-i-s-e-ly. If we can tune Varnish or Squid to not be so "dumb"
and check with the backend based on some configs like this, we could
use it for proper self-invalidating caching. (As opposed to TTL-based
caching, which brings the manual expiration issues discussed above.)
Unfortunately, at least based on the answers I got, this just does not
seem to be possible.
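
For clarity, the policy in question would look roughly like the following Python sketch: revalidate against the backend with the stored ETag, but at most once per `check_interval`. This is an illustration of the policy only, not a claim about what Varnish or Squid can be configured to do; `conditional_get` is a hypothetical callable standing in for a GET with `If-None-Match` that returns `(status, etag, body)`.

```python
import time


class EtagCache:
    """Toy cache that revalidates by ETag at most once per `check_interval`."""

    def __init__(self, conditional_get, check_interval=3600.0, clock=time.monotonic):
        # conditional_get(url, etag) -> (status, etag, body); hypothetical backend hook
        self.conditional_get = conditional_get
        self.check_interval = check_interval
        self.clock = clock
        self.entries = {}  # url -> {"etag", "body", "checked_at"}

    def get(self, url):
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and now - entry["checked_at"] < self.check_interval:
            # Checked recently enough: no backend request at all.
            return entry["body"]
        etag = entry["etag"] if entry is not None else None
        status, new_etag, body = self.conditional_get(url, etag)
        if status == 304 and entry is not None:
            # Backend confirmed the cached copy; just restart the interval.
            entry["checked_at"] = now
            return entry["body"]
        self.entries[url] = {"etag": new_etag, "body": body, "checked_at": now}
        return body
```

The trade-off is the obvious one: a changed document can be served stale for up to `check_interval`, in exchange for the backend seeing one request per interval per URL instead of one per client request.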

On Mon, Nov 8, 2010 at 12:06, Randall Leeds <randall.leeds@gmail.com>
wrote:
>>>> It'd be nice if the "Couch is HTTP and can leverage existing
>>>> caches and tools" talking point truly included significant gains
>>>> from etag caching.

P-R-E-C-I-S-E-L-Y. This is, for me, the most important and
embarrassing issue of this discussion. The O'Reilly book has it all
over the place:
http://www.google.com/search?q=varnish+OR+squid+site:http://guide.couchdb.org

Whenever you tell someone who really knows about HTTP caches "Dude,
Couch is HTTP and can leverage existing caches and tools" you can and
will be laughed at -- you can get away with mentioning
expiration-based caching and "simple" invalidation via _changes and
such, but... Embarrassing still.

I'll try to do more research in this area when time permits. I don't
believe there's _not_ some arcane Varnish config option to squeeze out
some performance, e.g. in the "highly concurrent requests" scenario.

Thanks for all the replies!

Karel

