From: Robert Samuel Newson
To: CouchDB Developers
Subject: Re: Design doc index switching
Date: Fri, 17 May 2019 14:26:51 +0100

If the consuming code can’t handle the switchover, the user should not be asking CouchDB to build an incompatible index in the first place, and certainly shouldn’t ask CouchDB to replace an index automatically (i.e., they should not use our proposed new handling at all).

On replication, the doc would replicate as normal, causing the new index to build wherever it lands. Each CouchDB instance would do the swap independently. To address conflicts, we could block edits to any design doc with a _replaces item while the view is still building, as we do for replication docs; similarly, we’d only allow a delete, which would also delete the new index.

B.

> On 17 May 2019, at 10:59, Jan Lehnardt wrote:
>
>> On 16. May 2019, at 23:03, Robert Samuel Newson wrote:
>>
>> I suggest an alternative; the new design document could include the _id of the design document it’s replacing ("_replaces": "_design/foo").
>> On completion of the view build of the new design document, CouchDB itself updates the named _id to the same content as the new design document (strictly, only the parts needed to make the view sig match) (perhaps it also deletes the new document).
>>
>> The advantage to this is that queries to the original design document continue to work throughout, and at no point is there a discrepancy between the design document's contents (the map and reduce functions, etc.) and the results you get from it.
>
> As Stefan points out, this only works if the consuming code knows how to handle both index result formats. If not, we need either a way for CouchDB to signal that the build is ready or a way to confirm the swap so that it coincides with a code deploy.
>
>>
>> B.
>>
>>> On 16 May 2019, at 14:55, Jan Lehnardt wrote:
>>>
>>> +1 on solving this for all users, and same caveats as Stefan raises :)
>>>
>>>> On 16. May 2019, at 09:38, Stefan du Fresne wrote:
>>>>
>>>> Hey Garren,
>>>>
>>>> Having this as a native part of CouchDB seems like a really cool idea: we have automated the manual dance you're talking about with our deployment tooling, but it would be really nice not to have to!
>>>>
>>>> I'm not clear how it would work though, at least in terms of coherent deployments. View changes are, like SQL migrations, often a non-backwards-compatible change that has to occur as your new code deploys.
>>>>
>>>> Currently the naive approach is that you deploy your new code alongside design doc changes, which then block view queries on first request until they're ready to go.
>>>>
>>>> The better approach is what you describe, which is what we do now, where we extract our design documents out of our deployment bundle and place them in a "staging" location to allow them to warm, then rename them and do the actual code deployment once that's complete (managed by an external deployment service we built).
>>>> This importantly lets us split the "warming" bit from the deployment bit: we only deploy new code once the design documents that are shipped with that code are ready to go.
>>>>
>>>> How would you foresee this kind of flow happening here? Would there be a way to query the design doc to know if it had flipped to the new version yet? Would you be able to control when this flip occurs? Or would the expectation be that your code handles both versions gracefully?
>>>>
>>>> As an example to mull over, let's say you have design doc v1, which has view a. You push design doc v2, which has added view b, but has also changed view a in some backwards-incompatible way. While v2 is still building and is not yet the active doc:
>>>> - If you queried view a you'd get the v1 version, that's clear
>>>> - If you queried view b you'd get... a 404? Some other custom code?
>>>> - If you GET the design document what doc would you see? Presumably v2?
>>>> - Could you query something to determine which version is currently active? Or perhaps just whether there is a background version building at all?
>>>>
>>>> Cheers,
>>>> Stefan
>>>>
>>>>> On 16 May 2019, at 07:51, Garren Smith wrote:
>>>>>
>>>>> Hi Everyone,
>>>>>
>>>>> A common pattern we see for updating large indexes that can take a few days to build is to create a new design doc with the updated views. Then, once the new design doc is built, a user changes the new design doc’s id to the old design doc's id. That way the CouchDB URL for the views remains the same, and any requests to the design doc URL automatically get the latest views only once they are built.
>>>>>
>>>>> This is an effective way of managing building large indexes, but the process is quite complicated and users often get it wrong. I would like to propose that we move this process into CouchDB and let CouchDB handle the actual process.
>>>>> From a user's perspective, they would add a field to the options of a design document that lets CouchDB know that this index needs to be built in the background and should only replace the current index once it's built:
>>>>>
>>>>> ```
>>>>> {
>>>>>   "_id": "_design/design-doc-id",
>>>>>   "_rev": "2-8d361a23b4cb8e213f0868ea3d2742c2",
>>>>>   "views": {
>>>>>     "map-view": {
>>>>>       "map": "function (doc) {\n emit(doc._id, 1);\n}"
>>>>>     }
>>>>>   },
>>>>>   "language": "javascript",
>>>>>   "options": {
>>>>>     "build_and_replace": true
>>>>>   }
>>>>> }
>>>>> ```
>>>>>
>>>>> I think this is something we could build quite effectively once we have CouchDB running on top of FoundationDB. I don’t want to implement it for version 1 of CouchDB on FDB, but it would be nice to keep this in mind as we build out the map/reduce indexes.
>>>>>
>>>>> What do you think? Any issues we might have by doing this internally?
>>>>>
>>>>> Cheers
>>>>> Garren
>>>>
>>>
>>> -- 
>>> Professional Support for Apache CouchDB:
>>> https://neighbourhood.ie/couchdb-support/
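
For anyone following the thread, the _replaces idea quoted above can be sketched roughly as follows. This is only an in-memory illustration under stated assumptions, not CouchDB code: `_replaces` is the field proposed in this thread and does not exist in CouchDB today, and `completeSwap`, `isEditAllowed`, and the `docs` Map are hypothetical names made up for the sketch. Revisions and the "only the parts needed for the view sig" refinement are ignored.

```javascript
// Hypothetical in-memory model of the proposed "_replaces" swap.
// Nothing here is an existing CouchDB API; `docs` is a plain Map of _id -> doc.

// Intended to be called once the view build for `newId` has finished.
function completeSwap(docs, newId) {
  const newDoc = docs.get(newId);
  if (!newDoc || !newDoc._replaces) {
    return false; // nothing to swap
  }
  const targetId = newDoc._replaces;
  // Copy everything except _id, _rev and _replaces onto the replaced document,
  // so the old _id now carries the freshly built view definitions.
  const { _id, _rev, _replaces, ...content } = newDoc;
  docs.set(targetId, { ...content, _id: targetId });
  // The thread says "perhaps it also deletes the new document"; assumed here.
  docs.delete(newId);
  return true;
}

// While the new index is still building, only a delete of the new doc is allowed.
function isEditAllowed(doc, isBuilding, isDelete) {
  if (doc._replaces && isBuilding) {
    return isDelete;
  }
  return true;
}

// Usage: "_design/foo-next" is built in the background, then swapped in.
const docs = new Map([
  ["_design/foo", { _id: "_design/foo", views: { a: { map: "function (doc) { emit(doc._id, 1); }" } } }],
  ["_design/foo-next", {
    _id: "_design/foo-next",
    _replaces: "_design/foo",
    views: { a: { map: "function (doc) { emit(doc.type, 1); }" } },
  }],
]);
completeSwap(docs, "_design/foo-next");
```

Queries against _design/foo never change URL in this model; they simply start seeing the new definition once the swap has run, which is the property the proposal is after.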