From: Robert Newson
To: dev@couchdb.apache.org
Subject: Re: [DISCUSS] : things we need to solve/decide : storing JSON documents
Date: Mon, 04 Feb 2019 19:59:32 +0000

I've been remiss here in not posting the data model ideas that IBM worked up while we were thinking about using FoundationDB, so I'm posting them now. This is Adam Kocoloski's original work, I am just transcribing it, and this is the context that the folks from the IBM side came in with, for full disclosure.

Basics

1. All CouchDB databases are inside a Directory
2. Each CouchDB database is a Directory within that Directory
3. It's possible to list all subdirectories of a Directory, so `_all_dbs` is the list of subdirectories of the Directory in 1.
4. Each Directory representing a CouchDB database has several Subspaces:
4a. by_id/ doc subspace: actual document contents
4b. by_seq/versionstamp subspace: for the _changes feed
4c. index_definitions, indexes, ...
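To make that layout concrete, here is a rough sketch using the FoundationDB Python bindings. It is only an illustration of the Basics above; the directory and subspace names ('couchdb', 'mydb', 'by_id', 'by_seq') are placeholders, not a proposed naming scheme.

    import fdb

    fdb.api_version(600)
    db = fdb.open()

    # 1/2: a top-level Directory, with one sub-Directory per CouchDB database
    couch = fdb.directory.create_or_open(db, ('couchdb',))
    mydb = couch.create_or_open(db, ('mydb',))

    # 3: _all_dbs falls out of listing the subdirectories
    all_dbs = couch.list(db)

    # 4: per-database subspaces
    by_id = mydb['by_id']    # 4a: actual document contents
    by_seq = mydb['by_seq']  # 4b: versionstamp subspace for the _changes feed

    # one exploded document field would live at a tuple-encoded key, e.g.
    db[by_id.pack(('foo', 'owner'))] = fdb.tuple.pack(('bob',))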
JSON Mapping

A hierarchical JSON object naturally maps to multiple KV pairs in FDB:

{
  "_id": "foo",
  "owner": "bob",
  "mylist": [1, 3, 5],
  "mymap": {
    "blue": "#0000FF",
    "red": "#FF0000"
  }
}

maps to

("foo", "owner") = "bob"
("foo", "mylist", 0) = 1
("foo", "mylist", 1) = 3
("foo", "mylist", 2) = 5
("foo", "mymap", "blue") = "#0000FF"
("foo", "mymap", "red") = "#FF0000"

NB: this means that the 100KB limit applies to individual leaves of the JSON object, not the entire doc.

Edit Conflicts

We need to account for the presence of conflicts at various levels of the doc due to replication.

The proposal is to create a special value indicating that the subtree below our current cursor position is in an unresolvable conflict, then add additional KV pairs below it to describe the conflicting entries.

The KV data model allows us to store these efficiently and minimize duplication of data. A document with these two conflicting revisions:

{
  "_id": "foo",
  "_rev": "1-abc",
  "owner": "alice",
  "active": true
}

{
  "_id": "foo",
  "_rev": "1-def",
  "owner": "bob",
  "active": true
}

could be stored thus:

("foo", "active") = true
("foo", "owner") = kCONFLICT
("foo", "owner", "1-abc") = "alice"
("foo", "owner", "1-def") = "bob"

So long as `kCONFLICT` is set at the top of the conflicting subtree, this representation can handle conflicts of different data types as well.

Missing fields need to be handled explicitly:

{
  "_id": "foo",
  "_rev": "1-abc",
  "owner": "alice",
  "active": true
}

{
  "_id": "foo",
  "_rev": "1-def",
  "owner": {
    "name": "bob",
    "email": "bob@example.com"
  }
}

could be stored thus:

("foo", "active") = kCONFLICT
("foo", "active", "1-abc") = true
("foo", "active", "1-def") = kMISSING
("foo", "owner") = kCONFLICT
("foo", "owner", "1-abc") = "alice"
("foo", "owner", "1-def", "name") = "bob"
("foo", "owner", "1-def", "email") = ...
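To make the explosion step concrete, here is a small plain-Python sketch of the flattening described above (tuple packing, subspaces and the conflict markers are deliberately left out):

    def flatten(doc_id, value, path=()):
        # Explode a JSON document into (key tuple, scalar) pairs, mirroring
        # the mapping above, e.g. ("foo", "mymap", "blue") = "#0000FF".
        if isinstance(value, dict):
            for k, v in value.items():
                yield from flatten(doc_id, v, path + (k,))
        elif isinstance(value, list):
            for i, v in enumerate(value):
                yield from flatten(doc_id, v, path + (i,))
        else:
            # A scalar leaf (string, number, bool, null) becomes its own KV
            # pair, which is why the 100KB value limit applies per leaf.
            yield (doc_id,) + path, value

    doc = {"owner": "bob", "mylist": [1, 3, 5],
           "mymap": {"blue": "#0000FF", "red": "#FF0000"}}
    for key, value in flatten("foo", doc):
        print(key, "=", value)
    # ('foo', 'owner') = bob
    # ('foo', 'mylist', 0) = 1
    # ... and so on, as in the listing above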
Revision Metadata

* CouchDB uses a hash history for revisions
** Each edit is identified by the hash of the content of the edit, including the base revision against which it was applied
** Individual edit branches are bounded in length, but the number of branches is potentially unbounded
* Size limits preclude us from storing the entire key tree as a single value; in pathological situations the tree could exceed 100KB (each entry is > 16 bytes)
* Store each edit branch as a separate KV, including deleted status, in a special subspace
* Structure the key representation so that the "winning" revision can be retrieved automatically with a limit=1 key range operation

("foo", "_meta", "deleted=false", 1, "def") = []
("foo", "_meta", "deleted=false", 4, "bif") = ["3-baz", "2-bar", "1-foo"] <-- winner
("foo", "_meta", "deleted=true", 3, "abc") = ["2-bar", "1-foo"]

Changes Feed

* FDB supports a concept called a versionstamp: a 10-byte, unique, monotonically (but not sequentially) increasing value for each committed transaction. The first 8 bytes are the committed version of the database. The last 2 bytes are monotonic in the serialization order of transactions.
* A transaction can specify a particular index into a key where the following 10 bytes will be overwritten by the versionstamp at commit time
* A subspace keyed on versionstamp naturally yields a _changes feed

by_seq subspace
  ("versionstamp1") = ("foo", "1-abc")
  ("versionstamp4") = ("bar", "4-def")

by_id subspace
  ("bar", "_vsn") = "versionstamp4"
  ...
  ("foo", "_vsn") = "versionstamp1"

JSON Indexes

* "Mango" JSON indexes are defined by
** a list of field names, each of which may be nested,
** an optional partial_filter_selector which constrains the set of docs that contribute
** an optional name defined by the ddoc field (the name is auto-generated if not supplied)
* Store index definitions in a single subspace to aid query planning
** ((person,name), title, email) = ("name-title-email", "{"student": true}")
** Store the values for each index in a dedicated subspace, adding the document ID as the last element in the tuple
*** ("rosie revere", "engineer", "rosie@example.com", "foo") = null

B.

--
Robert Samuel Newson
rnewson@apache.org

On Mon, 4 Feb 2019, at 19:13, Ilya Khlopotov wrote:
>
> I want to fix previous mistakes.
> I made two mistakes in the previous calculations:
> - I used 1KB as the base size for calculating the expansion factor (although we don't know the exact size of the original document)
> - The expansion factor calculation included the number of revisions (it shouldn't)
>
> I'll focus on the flattened JSON docs model.
>
> The following formula was used in the previous calculation:
> storage_size_per_document = mapping_table_size*number_of_revisions + depth*number_of_paths*number_of_revisions + number_of_paths*value_size*number_of_revisions
>
> To clarify things a little bit, I want to calculate the space requirement for a single revision this time.
> mapping_table_size = number_of_field_names*(field_name_length + 4 (integer size)) = 100 * (20 + 4) = 2400 bytes
> storage_size_per_document_per_revision_per_replica = mapping_table_size + depth*number_of_paths + value_size*number_of_paths =
> 2400 bytes + 10*1000 + 1000*100 = 112400 bytes ~= 110 KB
>
> We can definitely reduce the requirement for the mapping table by adopting rnewson's idea of a schema.
>
> On 2019/02/04 11:08:16, Ilya Khlopotov wrote:
> > Hi Michael,
> >
> > > For example, here's a crazy thought:
> > > Map every distinct occurrence of a key/value instance through a crypto hash
> > > function to get a set of hashes.
> > >
> > > These can be precomputed by Couch without any lookups in FDB. These
> > > will be spread all over kingdom come in FDB and not lend themselves to
> > > range search well.
> > >
> > > So what you do is index them for frequency of occurring in the same set.
> > > In essence, you 'bucket them' statistically, and that bucket id becomes a
> > > key prefix. A crypto hash value can be copied into more than one bucket.
> > > The {bucket_id}/{cryptohash} becomes a {val_id}.
> >
> > > When writing a document, Couch submits the list/array of cryptohash values
> > > it computed to FDB and gets back the corresponding {val_id}s (the ids with
> > > the bucket prefixed). This can get somewhat expensive if there are always a
> > > lot of app-local cache misses.
> > >
> > > A document's value is then a series of {val_id} arrays up to 100k per
> > > segment.
> > >
> > > When retrieving a document, you get the val_ids, find the distinct buckets
> > > and min/max entries for this doc, and then parallel query each bucket while
> > > reconstructing the document.
> >
> > Interesting idea. Let's try to think it through to see if we can make it viable.
> > Let's go through a hypothetical example. Input data for the example:
> > - 1M documents
> > - each document is around 10KB
> > - each document consists of 1K unique JSON paths
> > - each document has 100 unique JSON field names
> > - every scalar value is 100 bytes
> > - 10% of the unique JSON paths for every document are already stored in the database under a different doc or a different revision of the current one
> > - we assume 3 independent copies of every key-value pair in FDB
> > - our hash key size is 32 bytes
> > - let's assume we can determine whether a key is already in storage without doing a query
> > - 1% of paths are in cache (an unrealistic value; in real life the percentage is lower)
> > - every JSON field name is 20 bytes
> > - every JSON path is 10 levels deep
> > - the document key prefix length is 50
> > - every document has 10 revisions
> > Let's estimate the storage requirements and the size of the data we need to transmit. The calculations are not exact.
> > 1. storage_size_per_document (we cannot estimate exact numbers, since we don't know how FDB stores it)
> >    - 10 revisions * ((10KB - (10KB * 10%)) + (1K - (1K * 10%)) * 32 bytes) * 3 replicas = 38KB * 10 * 3 = 1140 KB (11x)
> > 2. number of independent keys to retrieve on document read (non-range queries) per document
> >    - 1K - (1K * 1%) = 990
> > 3. number of range queries: 0
> > 4. data to transmit on read: (1K - (1K * 1%)) * (100 bytes + 32 bytes) = 102 KB (10x)
> > 5. read latency (we use 2ms per read based on numbers from https://apple.github.io/foundationdb/performance.html)
> >    - sequential: 990*2ms = 1980ms
> >    - range: 0
> > Let's compare these numbers with the initial proposal (flattened JSON docs without a global schema and without a cache):
> > 1. storage_size_per_document
> >    - mapping table size: 100 * (20 + 4 (integer size)) = 2400 bytes
> >    - key size: (10 * (4 + 1 (delimiter))) + 50 = 100 bytes
> >    - storage_size_per_document: 2.4K*10 + 100*1K*10 + 1K*100*10 = 2024K = 1976 KB * 3 = 5930 KB (59.3x)
> > 2. number of independent keys to retrieve: 0-2 (depending on index structure)
> > 3. number of range queries: 1 (1001 keys in the result)
> > 4. data to transmit on read: 24K + 1000*100 + 1000*100 = 23.6 KB (2.4x)
> > 5. read latency (we use 2ms per read based on numbers from https://apple.github.io/foundationdb/performance.html and estimate range read performance based on numbers from https://apple.github.io/foundationdb/benchmarking.html#single-core-read-test)
> >    - range read performance: given that read performance is about 305,000 reads/second and range performance is 3,600,000 keys/second, we estimate range performance to be 11.8x read performance. If a read takes 2ms, then a range read takes 0.169ms (which is hard to believe).
> >    - sequential: 2 * 2 = 4ms
> >    - range: 0.169ms
> >
> > It looks like we are dealing with a tradeoff:
> > - map every distinct occurrence of a key/value instance through a crypto hash:
> >   - 5.39x more disk space efficient
> >   - 474x slower
> > - flattened JSON model:
> >   - 5.39x less efficient in disk space
> >   - 474x faster
> >
> > In any case, this unscientific exercise was very helpful, since it uncovered the high cost in terms of disk space. 59.3x the original disk size is too much IMO.
> >
> > Are there any ways we can make Michael's model more performant?
> >
> > Also, I don't quite understand a few aspects of the global hash table proposal:
> >
> > 1. > - Map every distinct occurrence of a key/value instance through a crypto hash function to get a set of hashes.
> > I think we are talking only about scalar values here, i.e. `"#/foo.bar.baz": 123`,
> > since I don't know how we can make it work for all possible JSON paths of `{"foo": {"bar": {"size": 12, "baz": 123}}}`:
> > - foo
> > - foo.bar
> > - foo.bar.baz
> >
> > 2. how to delete documents
> >
> > Best regards,
> > ILYA
> >
> >
> > On 2019/01/30 23:33:22, Michael Fair wrote:
> > > On Wed, Jan 30, 2019, 12:57 PM Adam Kocoloski wrote:
> > >
> > > > Hi Michael,
> > > >
> > > > > The trivial fix is to use DOCID/REVISIONID as DOC_KEY.
> > > >
> > > > Yes, that's definitely one way to address storage of edit conflicts. I
> > > > think there are other, more compact representations that we can explore if
> > > > we have this "exploded" data model where each scalar value maps to an
> > > > individual KV pair.
> > >
> > > I agree; as I mentioned on the original thread, I see a scheme that
> > > handles both conflicts and revisions, where you only have to store the most
> > > recent change to a field. Like you suggested, multiple revisions can share
> > > a key. Which in my mind's eye further begs the conflicts/revisions
> > > discussion along with the working-within-the-limits discussion, because it
> > > seems to me they are all intrinsically related as a "feature".
> > >
> > > Saying 'we'll break documents up into roughly 80k segments', then trying to
> > > overlay some kind of field sharing scheme for revisions/conflicts, doesn't
> > > seem like it will work.
> > >
> > > I probably should have left out the trivial fix proposal, as I don't think
> > > it's a feasible solution to actually use.
> > >
> > > The comment is more regarding that I do not see how this thread can escape
> > > including how to store/retrieve conflicts/revisions.
> > >
> > > For instance, the 'doc as individual fields' proposal lends itself to value
> > > sharing across multiple documents (and I don't just mean revisions of the
> > > same doc, I mean the same key/value instance could be shared for every
> > > document).
> > > However, that's not really relevant if we're not considering the amount of
> > > shared information across documents in the storage scheme.
> > >
> > > Simply storing documents in <100k segments (perhaps in some kind of
> > > compressed binary representation) to deal with that FDB limit seems fine.
> > > The only reason to consider doing something else is because of its impact
> > > on indexing, searches, reduce functions, revisions, on-disk size,
> > > etc.
> > >
> > >
> > > > > I'm assuming the process will flatten the key paths of the document into
> > > > > an array and then request the value of each key as multiple parallel
> > > > > queries against FDB at once
> > > >
> > > > Ah, I think this is not one of Ilya's assumptions. He's trying to design a
> > > > model which allows the retrieval of a document with a single range read,
> > > > which is a good goal in my opinion.
> > > >
> > >
> > > I am not sure I agree.
> > >
> > > Think of BitTorrent: a single range read should pull back the structure of
> > > the document (the pieces to fetch), but not necessarily the whole document.
> > >
> > > What if you already have a bunch of pieces in common with other documents
> > > locally (a repeated header/footer or type, for example), and you only need
> > > to get a few pieces of data you don't already have?
> > >
> > > The real goal of Couch as I see it is to treat your document set like the
> > > collection of structured information that it is: in some respects like an
> > > extension of your application's heap space for structured objects, with
> > > efficient querying of that collection to get back subsets of the data.
> > >
> > > Otherwise it seems more like a slightly upgraded file system plus a fancy
> > > grep/find-like feature...
> > >
> > > The best way I see to unlock more features/power is to move towards a
> > > more granular and efficient way to store and retrieve the scalar values...
> > >
> > >
> > > For example, here's a crazy thought:
> > > Map every distinct occurrence of a key/value instance through a crypto hash
> > > function to get a set of hashes.
> > >
> > > These can be precomputed by Couch without any lookups in FDB.
> > > These will be spread all over kingdom come in FDB and will not lend
> > > themselves to range search well.
> > >
> > > So what you do is index them for frequency of occurring in the same set.
> > > In essence, you 'bucket them' statistically, and that bucket id becomes a
> > > key prefix. A crypto hash value can be copied into more than one bucket.
> > > The {bucket_id}/{cryptohash} becomes a {val_id}.
> > >
> > > When writing a document, Couch submits the list/array of cryptohash values
> > > it computed to FDB and gets back the corresponding {val_id}s (the ids with
> > > the bucket prefixed). This can get somewhat expensive if there are always a
> > > lot of app-local cache misses.
> > >
> > > A document's value is then a series of {val_id} arrays up to 100k per
> > > segment.
> > >
> > > When retrieving a document, you get the val_ids, find the distinct buckets
> > > and min/max entries for this doc, and then parallel query each bucket while
> > > reconstructing the document.
> > >
> > > The values returned from the bucket queries are the key/value strings
> > > required to reassemble this document.
> > >
> > > ----------
> > > I put this forward primarily to highlight the idea that trying to match the
> > > storage representation of documents in a straightforward way to FDB keys,
> > > to reduce query count, might not be the most performance-oriented approach.
> > >
> > > I'd much prefer a storage approach that reduced data duplication and
> > > enabled fast sub-document queries.
> > >
> > > This clearly falls in the realm of what people want the 'use case' of Couch
> > > to be/become. By giving Couch more access to sub-document queries, I could
> > > eventually see queries as complicated as GraphQL submitted to Couch and
> > > pulling back ad-hoc aggregated data across multiple documents in a single
> > > application-layer request.
> > >
> > > Hehe - one way to look at the database of Couch documents is that they are
> > > all conflict revisions of the single root empty document. What I mean by
> > > this is to consider thinking of the entire document store as one giant DAG
> > > of key/value pairs, and how even separate documents are still typically
> > > related to each other. For most applications there is a tremendous amount
> > > of data redundancy between docs, and especially between revisions of those
> > > docs...
> > >
> > > And all this is a long way of saying "I think there could be a lot of value
> > > in assuming documents are 'assembled' from multiple queries to FDB, with
> > > local caching, instead of simply retrieved".
> > >
> > > Thanks, I hope I'm not the only outlier here thinking this way!?
> > >
> > > Mike :-)
> > >
> >
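For what it's worth, here is a rough plain-Python sketch of the cryptohash/bucket idea Michael describes above. The bucketing is only a placeholder (a real implementation would assign buckets by co-occurrence statistics, as he suggests), and the FDB round trip that resolves and stores the val_ids is not shown.

    import hashlib
    import json

    def leaves(value, path=()):
        # Yield (path, scalar) pairs for every leaf of a JSON document.
        if isinstance(value, dict):
            for k, v in value.items():
                yield from leaves(v, path + (k,))
        elif isinstance(value, list):
            for i, v in enumerate(value):
                yield from leaves(v, path + (i,))
        else:
            yield path, value

    def leaf_hash(path, value):
        # Hash the path together with its scalar value, so the same
        # key/value instance hashes identically across documents.
        blob = json.dumps([list(path), value], sort_keys=True).encode()
        return hashlib.sha256(blob).digest()

    def bucket_for(digest, n_buckets=256):
        # Placeholder bucketing: one byte of the hash. The proposal would
        # instead pick buckets by how often values occur in the same set.
        return digest[0] % n_buckets

    def val_ids(doc):
        # The per-document list of {bucket_id}/{cryptohash} ids that would
        # be submitted to FDB and stored as the document's value segments.
        return [(bucket_for(h), h)
                for h in (leaf_hash(p, v) for p, v in leaves(doc))]

    print(val_ids({"owner": "bob", "active": True}))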