couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alon Keren <alon.ke...@gmail.com>
Subject Re: Even more fine-grained ETag support when querying views?
Date Tue, 13 Sep 2011 09:48:09 GMT
I didn't know about Merkle Trees, but reading about them now makes the whole
thing seem less of a stab in the dark than I thought it was.
Thanks, Bob!

On 13 September 2011 12:00, Robert Newson <rnewson@apache.org> wrote:

> this pretty much requires us to write it.
>
> On 13 September 2011 09:50, Max Ogden <max@maxogden.com> wrote:
> > At this time I would like to suggest that when someone writes a Merkle
> tree
> > in Erlang that they call it Erkel.
> >
> > On Tue, Sep 13, 2011 at 1:46 AM, Robert Newson <rnewson@apache.org>
> wrote:
> >
> >> My joke about bloom filters was apparently misunderstood but the
> >> notion above, which sounds a lot like a Merkle tree, seems lucid to
> >> me.
> >>
> >> As for the strong vs. weak ETag variants, I think views need strong
> >> ETags in all cases, given the declared semantics for them in 13.3.3
> >>
> >> B.
> >>
> >> On 12 September 2011 23:28, Paul Davis <paul.joseph.davis@gmail.com>
> >> wrote:
> >> > In general the idea is intriguing. Using a combining hash would allow
> >> > you to get a specific hash value for a given range. Unfortunately,
> >> > bloom filters are not a good solution here because they require an a
> >> > priori guess of the number of keys that are going to be stored. On the
> >> > other hand, CRC32 appears to be combinable.There are a couple issues
> >> > though. The first of which is whether this is a strong enough hash to
> >> > use for an ETag. There are two types of ETags with slightly different
> >> > semantics, so we'd have to figure out what we can do and where this
> >> > falls on that spectrum. Secondly, computing the range ETag would
> >> > require the equivalent of a reduce=false view call in addition to
> >> > streaming the output if validation matched which has performance
> >> > implications as well.
> >> >
> >> > On Mon, Sep 12, 2011 at 6:50 PM, Alon Keren <alon.keren@gmail.com>
> >> wrote:
> >> >> Disclosure: I don't know much about e-tags, CouchDB internals (or
> bloom
> >> >> filters).
> >> >>
> >> >> How about maintaining an e-tag for each sub-tree in the view, similar
> to
> >> the
> >> >> way (I think) reduce works?
> >> >> When a row gets updated, its e-tag would be recalculated, and then
> its
> >> >> parent's e-tag would be recalculated, and so on. The e-tag of an
> >> internal
> >> >> node could be the hash of all its children's hashes.
> >> >> The actual e-tag that a view-query receives: the e-tag of the common
> >> >> ancestor of all involved rows.
> >> >>
> >> >> The next time you query the same keys, you would supply the e-tag
> you've
> >> >> just received.
> >> >>
> >> >>  Alon
> >> >>
> >> >>
> >> >> On 10 September 2011 16:41, Andreas Lind Petersen <
> >> >> andreaslindpetersen@gmail.com> wrote:
> >> >>
> >> >>> Hi!
> >> >>>
> >> >>> Background: I'm working on a web app that uses a single CouchDB
> >> database
> >> >>> for
> >> >>> storing data belong to 400000+ users. Each user has an average
of
> about
> >> 40
> >> >>> documents that need to be fetched in one go when the frontend is
> >> launched.
> >> >>> I
> >> >>> have accomplished this by querying a simple view with ?key=ownerID
> >> (with a
> >> >>> fallback to /_alldocs?startkey=<ownerID>_...&endkey=<ownerID>~
if
> the
> >> view
> >> >>> isn't built). Since the data for each user rarely changes, there's
a
> >> >>> potential to save resources by supporting conditional GET with
> >> >>> If-None-Match, which would amount having the web app backend copy
> the
> >> >>> CouchDB-generated ETag into the response sent to the browser.
> >> >>>
> >> >>> However, I just learned that CouchDB only maintains a single ETag
> for
> >> the
> >> >>> entire view, so every time one of my users changes something, the
> ETag
> >> for
> >> >>> everyone else's query result also changes. This makes conditional
> GETs
> >> >>> useless with this usage pattern.
> >> >>>
> >> >>> I asked about this on #couchdb and had a brief talk with rnewson,
> who
> >> was
> >> >>> sympathetic to the idea. Unfortunately we weren't able to come
up
> with
> >> an
> >> >>> idea that didn't involve traversing all docs in the result just
for
> >> >>> computing the ETag (my suggestion was a hash of the _revs of all
> docs
> >> >>> contributing to the result). That would be a bad default, but might
> >> still
> >> >>> work as an opt-in thing per request, eg. slowetag=true.
> >> >>>
> >> >>> Newson said I should try raising the discussion here in case someone
> >> else
> >> >>> had an idea for a cheaper way to calculate a good ETag. So what
does
> >> >>> everyone else think about this? Is my use case too rare, or would
it
> be
> >> >>> worthwhile to implement it?
> >> >>>
> >> >>> Best regards,
> >> >>> Andreas Lind Petersen (papandreou)
> >> >>>
> >> >>> Here's our chat transcript:
> >> >>>
> >> >>> [11:46] <papandreou> Does anyone know if there are plans
for issuing
> >> even
> >> >>> more granular etags for view lookups? When you only look up a small
> >> range
> >> >>> or
> >> >>> a specific key it would be really great if the ETag only changed
> when
> >> that
> >> >>> subset changes rather than the entire view.
> >> >>> [11:47] <papandreou> In the application I'm working on I'll
hardly
> ever
> >> be
> >> >>> able to get a 304 response because of this.
> >> >>> [...]
> >> >>> [13:51] <+rnewson> papandreou: unlikely.
> >> >>> [13:52] <papandreou> rnewson: So the best thing I can do
is to fetch
> >> the
> >> >>> data and compute a better etag myself? (My use case is a backend
for
> a
> >> web
> >> >>> app)
> >> >>> [13:53] <+rnewson> papandreou: You might be able to set ETag
in a
> list
> >> >>> function? If you can't, I'll gladly change CouchDB so you can.
> >> >>> [13:54] <papandreou> rnewson: I thought about that, too,
but that
> would
> >> >>> cause a big overhead for every request, right?
> >> >>> [13:55] <papandreou> rnewson: (Last time I tried views were
slooow)
> >> >>> [13:55] <papandreou> I mean lists
> >> >>> [13:55] <+rnewson> papandreou: slower, yes, because couch
needs to
> >> evaluate
> >> >>> the javascript in an external process.
> >> >>> [13:55] <+rnewson> how will you calculate the fine-grained
ETag?
> >> >>> [13:56] <+rnewson> Also we did recently make it slightly
finer,
> before
> >> it
> >> >>> was view group scope and now it's the view itself (I think)
> >> >>> [13:56] <papandreou> rnewson: Maybe something like a hash
of the
> _revs
> >> of
> >> >>> all the documents contributing to the result?
> >> >>> [13:56] <+rnewson> hm, that makes no sense actually. but
we did
> refine
> >> it
> >> >>> recently.
> >> >>> [13:57] <+rnewson> papandreou: that doesn't sound cheap at
all, and
> it
> >> >>> would
> >> >>> need to be cheaper than doing the view query itself to make sense.
> >> >>> [13:58] <papandreou> rnewson: There's still the bandwidth
thing
> >> >>> [13:58] <+rnewson> oh, you're working with restricted bandwidth
> and/or
> >> have
> >> >>> huge view responses?
> >> >>> [13:59] <papandreou> rnewson: And it would be really nice
to have
> >> something
> >> >>> like this completely handled by the database instead of inventing
a
> >> bunch
> >> >>> of
> >> >>> workarounds.
> >> >>> [14:01] <+rnewson> If there's a correct and efficient algorithm
for
> >> doing
> >> >>> it, I'm sure it would be applied.
> >> >>> [14:02] <papandreou> rnewson: I guess it depends on the use
case. If
> >> the
> >> >>> database is rarely updated I suppose the current tradeoff is better.
> >> >>> [14:03] <+rnewson> I'm sure the only reason we have ETags
at the
> >> current
> >> >>> granularity is because it's very quick to calculate. A finer-grain
> >> would be
> >> >>> committed if a viable approach was proposed.
> >> >>> [14:04] <papandreou> rnewson: I have a huge database with
data
> >> belonging to
> >> >>> 400000+ different users, and I'm using a view to enable a
> >> lookup-by-owner
> >> >>> thing. But every time a single piece of data is inserted, the ETag
> for
> >> the
> >> >>> view changes
> >> >>> [14:04] == case_ [~
> >> case@AMontsouris-651-1-123-169.w83-202.abo.wanadoo.fr]
> >> >>> has quit [Read error: Connection reset by peer]
> >> >>> [14:04] <+rnewson> yes, I've completely understood the problem
you
> >> stated
> >> >>> earlier.
> >> >>> [14:05] <+rnewson> I can't think of a way to improve this
right now
> but
> >> I
> >> >>> would spend the time to implement it if you had one.
> >> >>> [14:06] <papandreou> rnewson: So right now the code path
that sends
> a
> >> 304
> >> >>> only needs to look at a single piece of metadata for the view to
> make
> >> its
> >> >>> decision? That'll be hard to beat :)
> >> >>> [14:07] <+rnewson> doesn't need to beat it, it just needs
to be
> fast.
> >> >>> [14:07] <+rnewson> but I don't see any current possible solutions,
> let
> >> >>> alone
> >> >>> fast ones.
> >> >>> [14:07] <papandreou> rnewson: Well, thanks anyway for considering
my
> >> >>> suggestion. I'll let you know of I get an idea :)
> >> >>> [14:08] <+rnewson> and it is now per-view and not per-viewgroup.
so
> >> it's
> >> >>> what I said first before I thought it was silly
> >> >>> [14:08] <+benoitc> query + last seq returned maybe ....
> >> >>> [14:08] <+rnewson> but obviously a change could affect one
view in a
> >> group
> >> >>> but not others
> >> >>> [14:09] <papandreou> benoitc: The query is already sort of
included
> >> since
> >> >>> it's in the url.
> >> >>> [14:09] <+rnewson> benoitc: ?
> >> >>> [14:10] <+benoitc> i was meaning last committed seq,but it
won't
> change
> >> >>> anything ...
> >> >>> [14:10] <papandreou> benoitc: I guess you'd also need to
make sure
> that
> >> the
> >> >>> ETag changes if a document is deleted?
> >> >>> [14:10] <papandreou> ah
> >> >>> [14:10] <+rnewson> benoitc: we already use the update_seq
of the
> #view,
> >> >>> which is finer-grained that db's last committed seq
> >> >>> [14:11] <+benoitc> rnewson: commited seq in the view group
but
> anyway
> >> it
> >> >>> won't work
> >> >>> [14:12] <+rnewson> benoitc: right, that would be the pre-1.1.0
> >> behavior, I
> >> >>> think.
> >> >>> [14:12] <+rnewson> which is coarser
> >> >>> [14:12] <+rnewson> we simply don't record the info that papandreou's
> >> >>> suggestion would need to work.
> >> >>> [14:12] <+benoitc> papandreou: easier solution would be to
request
> each
> >> >>> time
> >> >>> on on stale view
> >> >>> [14:13] <papandreou> rnewson: Another reason why my suggestion
sucks
> is
> >> >>> that
> >> >>> it would require two traversals of the range, right? I'm guessing
it
> >> starts
> >> >>> streaming as soon as it has found the first doc now?
> >> >>> [14:13] <+benoitc> and update after, think it would work.
except if
> you
> >> >>> want
> >> >>> something strict
> >> >>> [14:13] <+rnewson> papandreou: yes, we stream the results
as we read
> >> them,
> >> >>> we don't buffer.
> >> >>> [14:14] <papandreou> benoitc: Hmm, so the theory is that
stale=ok
> would
> >> >>> increase the percentage of 304 responses?
> >> >>> [14:14] <papandreou> rnewson: Right, yes, then it would take
a
> serious
> >> hit.
> >> >>> [14:14] <+rnewson> papandreou: but we could add an option
that reads
> >> the
> >> >>> thing, builds an etag, and then streams the result. it would be
> slower,
> >> but
> >> >>> for the times that we can send 304 we'd save bandwidth. It sounds
a
> bit
> >> too
> >> >>> niche to me, but you could raise it on user@
> >> >>> [14:15] == Frippe [~Frippe@unaffiliated/frippe] has quit [Ping
> >> timeout:
> >> >>> 240
> >> >>> seconds]
> >> >>> [14:15] <papandreou> rnewson: Would be awesome to have that
as a
> >> >>> configuration option
> >> >>> [14:15] <+rnewson> papandreou: the view would not change,
so neither
> >> would
> >> >>> the ETag (with stale=ok)
> >> >>> [14:15] <+rnewson> papandreou: I think it would be a runtime
option
> >> >>> ?slow_etag=true
> >> >>> [14:15] <papandreou> rnewson: That would also be fine
> >> >>> [14:16] <+rnewson> a better solution would not require two
passes,
> >> though.
> >> >>> [14:16] <+benoitc> papandreou: i would use stale=ok, then
query the
> >> view
> >> >>> async, save new etag & ...
> >> >>> [14:16] <papandreou> rnewson: I really don't think it's that
niche
> :).
> >> But
> >> >>> maybe ETag-nerds are rarer than I think, hehe
> >> >>> [14:16] <+benoitc> rnewson: that could encourage pretty dangerous
> >> things
> >> >>> [14:16] <+rnewson> benoitc: ?
> >> >>> [14:17] <+benoitc> rnewson: cpu intensives tasks eacht time
the call
> is
> >> >>> done,
> >> >>> [14:17] <+benoitc> rather than encouraging something async
> >> >>> [14:18] <+benoitc> rahh I hate osx, it introduce be bad unicode
> chars
> >> in
> >> >>> vim
> >> >>> :@
> >> >>> [14:23] == Frippe_ has changed nick to Frippe
> >> >>> [14:23] <papandreou> benoitc: I'm not sure exactly how that
would
> work?
> >> I'm
> >> >>> working on the backend for a web app, so the requests will be coming
> >> from
> >> >>> multiple machines
> >> >>> [14:24] <+benoitc> papandreou: call with stale==ok and have
a
> process
> >> >>> asking
> >> >>> your deb for refresh from time to time
> >> >>> [14:24] <+benoitc> s/deb/view
> >> >>> [14:25] <+rnewson> benoitc: not sure I follow. doubling the
number
> of
> >> view
> >> >>> requests to achieve a finer etag is an ok solution, but shouldn't
be
> >> the
> >> >>> default, but I do think we'd need a better solution than that.
> >> >>> [14:25] <+rnewson> benoitc: and you might be forgetting all
the md5
> >> >>> verification we do all the time.
> >> >>> [14:27] <+benoitc> rnewson: you don't need to call each views
though
> >> >>> [14:27] <+benoitc> I don't see the arg about last one
> >> >>> [14:27] <papandreou> benoitc: Ah, ok, I understand now. Won't
work
> very
> >> >>> well
> >> >>> for me, though, the web app is a single page thing that only asks
> for
> >> this
> >> >>> particular chunk of data once per session, so the ETag will probably
> >> have
> >> >>> changed anyway unless we accept day-old data.
> >> >>> [14:27] <+benoitc> anyway enotime to discuss about that ,
i'm on
> >> >>> anotherthing
> >> >>> [14:32] <papandreou> rnewson: But next step would be for
me to raise
> >> the
> >> >>> issue on the user mailing list?
> >> >>> [14:33] <+rnewson> papandreou: on reflection, it's more a
dev@thing,
> >> but
> >> >>> yes.
> >> >>> [14:33] <+rnewson> post the suggestion about calculating
an etag
> over
> >> the
> >> >>> results and then streaming them, with the caveat that a better
> solution
> >> >>> should be found.
> >> >>> [14:34] <papandreou> rnewson: Ok, I will, thanks :). Btw.
do you
> think
> >> >>> there's a chance that this will be easier for key=... queries than
> >> >>> arbitrary
> >> >>> startkey=...&endkey=... ones?
> >> >>> [14:35] <+rnewson> papandreou: yes. for key= we could use
a bloom
> >> filter.
> >> >>> [14:38] <papandreou> rnewson: Man, I've got some reading
up to do
> :).
> >> >>> Thanks! So dev@ it is?
> >> >>> [14:39] <+rnewson> papandreou: yes.
> >> >>> [14:40] <+rnewson> papandreou: 'bloom filter' is just how
we
> handwave
> >> >>> solutions these days, it just sounds vaguely plausible to for the
> keys=
> >> >>> variant
> >> >>> [14:40] <+rnewson> but doesn't make sense at all for startkey/endkey
> >> >>> [14:40] <+jan____> haha, I'm sitting in an ""HTTP Architecture"
> >> session,
> >> >>> and
> >> >>> all the two speakers do is tell the audience how CouchDB gets it
all
> >> right.
> >> >>> [14:41] <+rnewson> at base, we'd want some cheap way to invalidate
a
> >> range
> >> >>> of keys in memory.
> >> >>> [14:49] <+jan____> the answer must include bloom filters.
> >> >>>
> >> >>
> >> >
> >>
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message