Subject: Re: Even more fine-grained ETag support when querying views?
From: Robert Newson
To: user@couchdb.apache.org
Date: Tue, 13 Sep 2011 09:46:31 +0100

My joke about bloom filters was apparently misunderstood, but the notion
above, which sounds a lot like a Merkle tree, seems lucid to me. As for the
strong vs. weak ETag variants, I think views need strong ETags in all
cases, given the semantics declared for them in section 13.3.3 of RFC 2616.

On 12 September 2011 23:28, Paul Davis wrote:
> In general the idea is intriguing. Using a combining hash would allow
> you to get a specific hash value for a given range. Unfortunately,
> bloom filters are not a good solution here because they require an a
> priori guess of the number of keys that are going to be stored. On the
> other hand, CRC32 appears to be combinable. There are a couple of issues,
> though. The first is whether this is a strong enough hash to use for an
> ETag. There are two types of ETags with slightly different semantics, so
> we'd have to figure out what we can do and where this falls on that
> spectrum. Secondly, computing the range ETag would require the equivalent
> of a reduce=false view call in addition to streaming the output if
> validation matched, which has performance implications as well.
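On Paul's point that CRC32 is combinable: the zlib C library provides
crc32_combine() for joining two independently computed checksums, though
Python's zlib module only exposes the running-value form used below. A
minimal sketch of a range ETag along those lines; the row encoding and
function name are purely illustrative, not CouchDB internals:

    import zlib

    # Fold each row in the requested range through a running CRC32, so the
    # resulting tag depends on exactly the rows in that range.
    def range_etag(rows):
        crc = 0
        for key, value in rows:
            crc = zlib.crc32(repr((key, value)).encode("utf-8"), crc)
        return '"%08x"' % (crc & 0xFFFFFFFF)

    rows = [("alice", 1), ("bob", 2), ("carol", 3)]
    print(range_etag(rows))      # tag for the whole range
    print(range_etag(rows[1:]))  # a narrower range gets a different tag

Whether a 32-bit checksum is collision-resistant enough to serve as a
strong validator is exactly the open question raised above.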
> On Mon, Sep 12, 2011 at 6:50 PM, Alon Keren wrote:
>> Disclosure: I don't know much about e-tags, CouchDB internals (or bloom
>> filters).
>>
>> How about maintaining an e-tag for each sub-tree in the view, similar to
>> the way (I think) reduce works? When a row gets updated, its e-tag would
>> be recalculated, and then its parent's e-tag would be recalculated, and
>> so on. The e-tag of an internal node could be the hash of all its
>> children's hashes. The actual e-tag that a view query receives: the
>> e-tag of the common ancestor of all involved rows.
>>
>> The next time you query the same keys, you would supply the e-tag you've
>> just received.
>>
>> Alon
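Alon's scheme is essentially a Merkle tree maintained the way an
incremental reduce is. A minimal sketch, with every name illustrative
(this is not CouchDB's B-tree code):

    import hashlib

    class Node:
        def __init__(self, row=None, children=()):
            self.row = row                  # leaf payload: a (key, value) pair
            self.children = list(children)  # internal nodes hold children
            self.hash = b""
            self.rehash()

        def rehash(self):
            # A leaf hashes its row; an internal node hashes its
            # children's hashes, Merkle-style.
            h = hashlib.sha1()
            if self.row is not None:
                h.update(repr(self.row).encode("utf-8"))
            for child in self.children:
                h.update(child.hash)
            self.hash = h.digest()

    def update_leaf(path_to_leaf, new_row):
        # On an update, rehash the leaf and every ancestor back to the
        # root, just as an incremental reduce recomputes along one path.
        path_to_leaf[-1].row = new_row
        for node in reversed(path_to_leaf):
            node.rehash()

    leaf_a, leaf_b = Node(row=("a", 1)), Node(row=("b", 2))
    subtree = Node(children=[leaf_a, leaf_b])
    root = Node(children=[subtree, Node(row=("z", 9))])
    etag = subtree.hash.hex()  # tag for a query covered by this subtree

A query's ETag would be the hash of the deepest node covering every row it
touches, so updates outside that subtree cannot invalidate it.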
>> On 10 September 2011 16:41, Andreas Lind Petersen <
>> andreaslindpetersen@gmail.com> wrote:
>>
>>> Hi!
>>>
>>> Background: I'm working on a web app that uses a single CouchDB
>>> database for storing data belonging to 400000+ users. Each user has an
>>> average of about 40 documents that need to be fetched in one go when
>>> the frontend is launched. I have accomplished this by querying a simple
>>> view with ?key=ownerID (with a fallback to
>>> /_all_docs?startkey=_...&endkey=~ if the view isn't built). Since the
>>> data for each user rarely changes, there's a potential to save
>>> resources by supporting conditional GET with If-None-Match, which would
>>> amount to having the web app backend copy the CouchDB-generated ETag
>>> into the response sent to the browser.
>>>
>>> However, I just learned that CouchDB only maintains a single ETag for
>>> the entire view, so every time one of my users changes something, the
>>> ETag for everyone else's query result also changes. This makes
>>> conditional GETs useless with this usage pattern.
>>>
>>> I asked about this on #couchdb and had a brief talk with rnewson, who
>>> was sympathetic to the idea. Unfortunately we weren't able to come up
>>> with an idea that didn't involve traversing all docs in the result just
>>> to compute the ETag (my suggestion was a hash of the _revs of all docs
>>> contributing to the result). That would be a bad default, but might
>>> still work as an opt-in thing per request, e.g. slowetag=true.
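The _rev-hash idea can be tried from the application side today, at the
cost of always reading the rows. A sketch using Python's requests library;
the database, design document, and view names are assumed for
illustration:

    import hashlib
    import requests

    def fetch_with_rev_etag(owner_id):
        # Fetch the rows plus their docs, then derive a tag from the
        # contributing _revs, as suggested above.
        resp = requests.get(
            "http://localhost:5984/mydb/_design/app/_view/by_owner",
            params={"key": '"%s"' % owner_id, "include_docs": "true"},
        )
        rows = resp.json()["rows"]
        h = hashlib.sha1()
        for row in rows:
            h.update(row["doc"]["_rev"].encode("utf-8"))
        return rows, '"%s"' % h.hexdigest()

The backend compares the derived tag against the browser's If-None-Match
header and answers 304 itself. CouchDB still pays for the full read, which
is the cost an opt-in server-side flag like slowetag=true would carry too.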
>>> Newson said I should try raising the discussion here in case someone
>>> else had an idea for a cheaper way to calculate a good ETag. So what
>>> does everyone else think about this? Is my use case too rare, or would
>>> it be worthwhile to implement it?
>>>
>>> Best regards,
>>> Andreas Lind Petersen (papandreou)
>>>
>>> Here's our chat transcript:
>>>
>>> [11:46] <papandreou> Does anyone know if there are plans for issuing
>>> even more granular etags for view lookups? When you only look up a
>>> small range or a specific key it would be really great if the ETag only
>>> changed when that subset changes rather than the entire view.
>>> [11:47] <papandreou> In the application I'm working on I'll hardly ever
>>> be able to get a 304 response because of this.
>>> [...]
>>> [13:51] <+rnewson> papandreou: unlikely.
>>> [13:52] <papandreou> rnewson: So the best thing I can do is to fetch
>>> the data and compute a better etag myself? (My use case is a backend
>>> for a web app)
>>> [13:53] <+rnewson> papandreou: You might be able to set ETag in a list
>>> function? If you can't, I'll gladly change CouchDB so you can.
>>> [13:54] <papandreou> rnewson: I thought about that, too, but that would
>>> cause a big overhead for every request, right?
>>> [13:55] <papandreou> rnewson: (Last time I tried views were slooow)
>>> [13:55] <papandreou> I mean lists
>>> [13:55] <+rnewson> papandreou: slower, yes, because couch needs to
>>> evaluate the javascript in an external process.
>>> [13:55] <+rnewson> how will you calculate the fine-grained ETag?
>>> [13:56] <+rnewson> Also we did recently make it slightly finer, before
>>> it was view group scope and now it's the view itself (I think)
>>> [13:56] <papandreou> rnewson: Maybe something like a hash of the _revs
>>> of all the documents contributing to the result?
>>> [13:56] <+rnewson> hm, that makes no sense actually. but we did refine
>>> it recently.
>>> [13:57] <+rnewson> papandreou: that doesn't sound cheap at all, and it
>>> would need to be cheaper than doing the view query itself to make
>>> sense.
>>> [13:58] <papandreou> rnewson: There's still the bandwidth thing
>>> [13:58] <+rnewson> oh, you're working with restricted bandwidth and/or
>>> have huge view responses?
>>> [13:59] <papandreou> rnewson: And it would be really nice to have
>>> something like this completely handled by the database instead of
>>> inventing a bunch of workarounds.
>>> [14:01] <+rnewson> If there's a correct and efficient algorithm for
>>> doing it, I'm sure it would be applied.
>>> [14:02] <papandreou> rnewson: I guess it depends on the use case. If
>>> the database is rarely updated I suppose the current tradeoff is
>>> better.
>>> [14:03] <+rnewson> I'm sure the only reason we have ETags at the
>>> current granularity is because it's very quick to calculate. A finer
>>> grain would be committed if a viable approach was proposed.
>>> [14:04] <papandreou> rnewson: I have a huge database with data
>>> belonging to 400000+ different users, and I'm using a view to enable a
>>> lookup-by-owner thing. But every time a single piece of data is
>>> inserted, the ETag for the view changes
>>> [14:04] <+rnewson> yes, I've completely understood the problem you
>>> stated earlier.
>>> [14:05] <+rnewson> I can't think of a way to improve this right now but
>>> I would spend the time to implement it if you had one.
>>> [14:06] <papandreou> rnewson: So right now the code path that sends a
>>> 304 only needs to look at a single piece of metadata for the view to
>>> make its decision? That'll be hard to beat :)
>>> [14:07] <+rnewson> doesn't need to beat it, it just needs to be fast.
>>> [14:07] <+rnewson> but I don't see any current possible solutions, let
>>> alone fast ones.
>>> [14:07] <papandreou> rnewson: Well, thanks anyway for considering my
>>> suggestion. I'll let you know if I get an idea :)
>>> [14:08] <+rnewson> and it is now per-view and not per-viewgroup. so
>>> it's what I said first before I thought it was silly
>>> [14:08] <+benoitc> query + last seq returned maybe ....
>>> [14:08] <+rnewson> but obviously a change could affect one view in a
>>> group but not others
>>> [14:09] <papandreou> benoitc: The query is already sort of included
>>> since it's in the url.
>>> [14:09] <+rnewson> benoitc: ?
>>> [14:10] <+benoitc> i was meaning last committed seq, but it won't
>>> change anything ...
>>> [14:10] <papandreou> benoitc: I guess you'd also need to make sure that
>>> the ETag changes if a document is deleted?
>>> [14:10] <papandreou> ah
>>> [14:10] <+rnewson> benoitc: we already use the update_seq of the #view,
>>> which is finer-grained than the db's last committed seq
>>> [14:11] <+benoitc> rnewson: committed seq in the view group but anyway
>>> it won't work
>>> [14:12] <+rnewson> benoitc: right, that would be the pre-1.1.0
>>> behavior, I think.
>>> [14:12] <+rnewson> which is coarser
>>> [14:12] <+rnewson> we simply don't record the info that papandreou's
>>> suggestion would need to work.
>>> [14:12] <+benoitc> papandreou: easier solution would be to request each
>>> time on a stale view
>>> [14:13] <papandreou> rnewson: Another reason why my suggestion sucks is
>>> that it would require two traversals of the range, right? I'm guessing
>>> it starts streaming as soon as it has found the first doc now?
>>> [14:13] <+benoitc> and update after, think it would work. except if you
>>> want something strict
>>> [14:13] <+rnewson> papandreou: yes, we stream the results as we read
>>> them, we don't buffer.
>>> [14:14] <papandreou> benoitc: Hmm, so the theory is that stale=ok would
>>> increase the percentage of 304 responses?
>>> [14:14] <papandreou> rnewson: Right, yes, then it would take a serious
>>> hit.
>>> [14:14] <+rnewson> papandreou: but we could add an option that reads
>>> the thing, builds an etag, and then streams the result. it would be
>>> slower, but for the times that we can send 304 we'd save bandwidth. It
>>> sounds a bit too niche to me, but you could raise it on user@
>>> [14:15] <papandreou> rnewson: Would be awesome to have that as a
>>> configuration option
>>> [14:15] <+rnewson> papandreou: the view would not change, so neither
>>> would the ETag (with stale=ok)
>>> [14:15] <+rnewson> papandreou: I think it would be a runtime option
>>> ?slow_etag=true
>>> [14:15] <papandreou> rnewson: That would also be fine
>>> [14:16] <+rnewson> a better solution would not require two passes,
>>> though.
>>> [14:16] <+benoitc> papandreou: i would use stale=ok, then query the
>>> view async, save new etag & ...
>>> [14:16] <papandreou> rnewson: I really don't think it's that niche :).
>>> But maybe ETag-nerds are rarer than I think, hehe
>>> [14:16] <+benoitc> rnewson: that could encourage pretty dangerous
>>> things
>>> [14:16] <+rnewson> benoitc: ?
>>> [14:17] <+benoitc> rnewson: cpu intensive tasks each time the call is
>>> done,
>>> [14:17] <+benoitc> rather than encouraging something async
>>> [14:18] <+benoitc> rahh I hate osx, it introduces bad unicode chars in
>>> vim :@
>>> [14:23] <papandreou> benoitc: I'm not sure exactly how that would work?
>>> I'm working on the backend for a web app, so the requests will be
>>> coming from multiple machines
>>> [14:24] <+benoitc> papandreou: call with stale=ok and have a process
>>> asking your deb for refresh from time to time
>>> [14:24] <+benoitc> s/deb/view
>>> [14:25] <+rnewson> benoitc: not sure I follow. doubling the number of
>>> view requests to achieve a finer etag is an ok solution, but shouldn't
>>> be the default, but I do think we'd need a better solution than that.
>>> [14:25] <+rnewson> benoitc: and you might be forgetting all the md5
>>> verification we do all the time.
>>> [14:27] <+benoitc> rnewson: you don't need to call each views though
>>> [14:27] <+benoitc> I don't see the arg about last one
>>> [14:27] <papandreou> benoitc: Ah, ok, I understand now. Won't work very
>>> well for me, though, the web app is a single page thing that only asks
>>> for this particular chunk of data once per session, so the ETag will
>>> probably have changed anyway unless we accept day-old data.
>>> [14:27] <+benoitc> anyway enotime to discuss about that, i'm on another
>>> thing
>>> [14:32] <papandreou> rnewson: But next step would be for me to raise
>>> the issue on the user mailing list?
>>> [14:33] <+rnewson> papandreou: on reflection, it's more a dev@ thing,
>>> but yes.
>>> [14:33] <+rnewson> post the suggestion about calculating an etag over
>>> the results and then streaming them, with the caveat that a better
>>> solution should be found.
>>> [14:34] <papandreou> rnewson: Ok, I will, thanks :). Btw. do you think
>>> there's a chance that this will be easier for key=... queries than
>>> arbitrary startkey=...&endkey=... ones?
>>> [14:35] <+rnewson> papandreou: yes. for key= we could use a bloom
>>> filter.
>>> [14:38] <papandreou> rnewson: Man, I've got some reading up to do :).
>>> Thanks! So dev@ it is?
>>> [14:39] <+rnewson> papandreou: yes.
>>> [14:40] <+rnewson> papandreou: 'bloom filter' is just how we handwave
>>> solutions these days, it just sounds vaguely plausible for the keys=
>>> variant
>>> [14:40] <+rnewson> but doesn't make sense at all for startkey/endkey
>>> [14:40] <+jan____> haha, I'm sitting in an "HTTP Architecture" session,
>>> and all the two speakers do is tell the audience how CouchDB gets it
>>> all right.
>>> [14:41] <+rnewson> at base, we'd want some cheap way to invalidate a
>>> range of keys in memory.
>>> [14:49] <+jan____> the answer must include bloom filters.
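For reference, the two-pass option rnewson floats at 14:14 in the
transcript (read the range, build an ETag, then stream) reduces to
something like the following sketch, with read_range and send as
hypothetical stand-ins for the real view machinery:

    import hashlib

    def respond(read_range, if_none_match, send):
        rows = list(read_range())          # pass 1: read the whole range
        h = hashlib.sha1()
        for row in rows:
            h.update(repr(row).encode("utf-8"))
        etag = '"%s"' % h.hexdigest()
        if etag == if_none_match:
            return send(304, etag, [])     # validator matched: no body
        return send(200, etag, rows)       # pass 2: stream the rows

It saves bandwidth whenever the validator matches but always pays the full
read, which is why a one-pass scheme like the subtree hashes above would be
preferable.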