couchdb-dev mailing list archives

From "Ilya Khlopotov" <iil...@ca.ibm.com>
Subject Re: CouchDB Summit Notes
Date Wed, 15 Feb 2017 19:14:03 GMT
Fantastic notes!!!

While reading them I noticed an Elixir-related section. Unfortunately there
is no conclusion on it, and it is also quite hard to infer the sentiment
from the notes. I would love to get a general idea:
How many people present at the meeting see value in adopting Elixir, and
how many don't?

BR,
ILYA




From:	Robert Samuel Newson <rnewson@apache.org>
To:	dev@couchdb.apache.org
Date:	2017/02/15 05:38 AM
Subject:	CouchDB Summit Notes



Hi,

A group of CouchDB folks got together recently for a three-day session (Feb
10-12) to discuss the CouchDB roadmap. In attendance:

Russell Branca, Paul Davis, Dale Harvey, Adam Kocoloski, Nolan Lawson, Jan
Lehnardt, Gregor Martynus, Robert Newson, Garren Smith, Joan Touzet.

We took turns taking notes, and I present those notes unedited at the end
of this email for full disclosure. This represents the beginning of the
effort to define and execute a new CouchDB roadmap, and all decisions will
be made on this mailing list and/or in JIRA as per normal ASF rules. It was
enormously helpful to get a cohort together in one space for a few days to
thrash this out.

Friday and Saturday were primarily wide-ranging discussion, and on
Saturday/Sunday we got into more detailed conversations about the designs
of a few key new features we want to focus on for the next major release of
CouchDB. I will summarise those two efforts first; the copious raw notes
will follow.

1. "hard" deletes

As everyone knows by now, a deleted document in CouchDB is preserved
forever. Deleted documents are typically small, but they do take up space,
which makes many uses of CouchDB problematic and effectively excludes
certain uses entirely. The attendees feel this should change, and that the
new CouchDB behaviour should be that a deleted document eventually occupies
no space at all (post compaction).

The design for this is described in the notes and we whiteboarded it in
more detail, but the essential observation is this; CouchDB is free to
entirely forget a document once all secondary indexes have processed the
deletion and all replications have checkpointed past it. To allow third
parties to inter-replicate,  we will introduce an API to allow anyone to
inhibit this total deletion from occurring. The set of secondary indexes,
replication checkpoints and these as yet unnamed third party markers allows
couchdb to calculate an update sequence below which no deleted document
needs to be preserved. We called this the 'databasement' in our
conversations but a more obvious, but less amusing, name will doubtless
occur to us as we proceed.
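The calculation described above can be sketched as follows; the function name and inputs are illustrative, not CouchDB internals:

```python
# Hypothetical sketch of computing the "databasement" update sequence.
# Names and the conservative empty-case behaviour are assumptions.

def databasement(index_seqs, replication_seqs, marker_seqs):
    """Return the update sequence below which deleted documents can be
    forgotten: the minimum checkpoint over every interested party
    (secondary indexes, replications, third-party markers)."""
    all_seqs = list(index_seqs) + list(replication_seqs) + list(marker_seqs)
    if not all_seqs:
        return 0  # no consumers registered: conservatively keep everything
    return min(all_seqs)

# A deleted document may be dropped at compaction only if every index,
# replication checkpoint and third-party marker has moved past it:
assert databasement([40, 25], [30], [22]) == 22
```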

This is obviously a huge (and, we hope, welcome) shift in couchdb semantics
and we want to get it right. There'll be a more detailed writeup in the
corresponding JIRA ticket(s) soon.

2. role-based access control

We challenged another long-standing couchdb convention; that access
controls are at the database level only. This leads to awkward workarounds
like the "db per user" pattern which does not scale well. We spent a lot of
time on this topic and believe we have a feasible and efficient design.

In brief, role-based access control will be available as an option (perhaps
only at database creation time; it's TBD whether it can be toggled on or off
for existing databases). A document must be marked with the roles that allow
access, and users must be granted matching roles. We'll explicitly support
a richer control than mere matching, it will be possible to require
multiple roles for access. To make this work we will build an additional
index for efficient access control.
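A minimal sketch of the matching rule described above; the `_access` field name and the OR-of-AND-sets shape are assumptions, not a settled API:

```python
# Hypothetical role check: a document lists role sets that grant access,
# and a single set may require multiple roles at once.

def may_read(doc, user_roles):
    """Grant access if the user holds every role in at least one of the
    document's role sets (OR across sets, AND within a set)."""
    role_sets = doc.get("_access", [])
    return any(set(rs) <= set(user_roles) for rs in role_sets)

doc = {"_id": "report-1", "_access": [["finance", "manager"], ["admin"]]}
assert may_read(doc, ["admin"])               # single matching role
assert may_read(doc, ["finance", "manager"])  # multi-role requirement met
assert not may_read(doc, ["finance"])         # only half of a role set
```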

There's much more to be said of the amazing discussions we had at the
summit and I'll defer to the following notes for that, feel free to ask
questions in replies to this thread.

Robert Newson
Apache CouchDB PMC


CouchDB Summit Notes

1. Testing
		 • Shared test environment
		 • PouchDB test suite has value for CouchDB, but isn’t modular
		 • Outcome: Create unified modular pouchdb test framework to
use in CouchDB
2. Simpler API
		 • No more read API’s
		 • Dale: Lots of ways of doing read/writes - but no recommended
method
		 • Nolan: Differentiate between API endpoints that users use
and what is needed for replication; think /db/_replicate hiding all
replicator-specific calls
		 • Nolan: Mango covers some APIs - get/all_docs/query
		 • rnewson: Map/reduce is never going to be removed
		 • SQL?
		 • Separate the replication endpoints and bulk endpoints from
‘normal’ endpoints
		 • rnewson: Moving to HTTP/2 need to reassess which endpoints
are necessary
		 • Outcome: Did we have an outcome here?
3. HTTP/2
		 • Jan: Cowboy not the best maintained project
		 • Jan: Need to touch base with Cowboy author first
		 • rnewson: value of Cowboy is rest module is already there
		 • Garren: Elixir stuff is all built on top of Cowboy
		 • Jan: Author of Cowboy is sponsored, maybe IBM can help to
influence direction
		 • rnewson: it’s possible, but CouchDB project shouldn’t have a
hard dep on $
		 • Jan: there is no possibility of starting from scratch
		 • Rnewson: we’ve never written middleware between us and
mochiweb
		 • Jan: it doesn’t do HTTP/2
		 • Rnewson: we get that for free if we switch to Cowboy
		 • Jan: Cowboy is still in development, still introducing
breaking changes
		 • Jan: Will talk to Loic
		 		 • State of HTTP2
		 		 • Erlang compat / deprecation
		 		 • Collaboration model
4. Packaging (detour)
		 • Nolan: managing Erlang versions is really hard today
		 • Rnewson: maybe we should deprecate the whitelist
		 • Nolan: nah better than failing for weird reasons at runtime
		 • Nolan: custom PPA?
		 • Jan/Rnewson: would need lots of resources for that
		 • Rnewson: ideally the bad Erlang version problem should be
going away eventually; the new releases are all good
		 • Jan: can we run tests against pre-release Erlang versions?
Need Cloudant scale traffic though
		 • Rnewson: instead of having rebar config whitelist we could
instead have an Eunit test that tests for the specific versions we know are
broken, outright fails, tells you why. Maybe a blacklist rather than a
whitelist. All the broken versions are in the past
		 • Rnewson: we mostly figured this stuff out in production
		 • Paul: also from reading the Erlang mailing lists, e.g. Basho
folks discovering bugs
		 • Jan: so we could try to Canary this but it’s unlikely to
work well
		 • Rnewson: Erlang themselves have improved their practices.
From now on we can say “we support the latest n Erlang releases”
		 • Rnewson: Could ask Ericsson to run CouchDB against their
pre-release Erlang versions.
5. Sub-document operations
		 • Jan: this is a feature request since forever
		 • Jan: what won’t change is that we need to read the whole
JSON document from disk
		 • Rnewson: we do see Cloudant customers wanting to do this
		 • Jan: how does this work in PouchDB? Does json pointer map?
		 • Nolan: this can be done today, slashes or dots don’t matter,
but dots are more intuitive. Angular/Vue does that
		 • Jan: Ember too
		 • Nolan: I wouldn’t worry about this, just do whatever makes
sense for Couch and Pouch can add sugar
		 • Garren: we can design this incrementally
		 • Jan: let’s postpone the bikeshedding
		 • Garren: would this fit with Cowboy?
		 • Rnewson: yeah quite well, the paths into the objects map
well
		 • Garren: and this would just be used for getting some of the
content from a doc
		 • Rnewson: you’d still have to do the rev
		 • Rnewson: we could add that today
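A toy sketch of the sub-document read discussed above, assuming (per the notes) that slashes and dots are equivalent path separators; the whole document is still read from disk, only the response shrinks:

```python
# Illustrative sub-document lookup; not a CouchDB API.

def sub_doc(doc, path):
    """Resolve a dotted or slashed path inside a stored JSON document."""
    node = doc
    for part in path.replace("/", ".").split("."):
        node = node[part]
    return node

doc = {"_id": "a", "_rev": "1-x",
       "address": {"city": "Berlin", "zip": "10115"}}
assert sub_doc(doc, "address.city") == "Berlin"
assert sub_doc(doc, "address/zip") == "10115"  # slash form works the same
```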

6. GraphQL

		 • Best done as a plugin

7. CouchDB and IBM/Cloudant Collaboration

		 • A community manager to help remove roadblocks
		 • Gregor: could be cross-project: Couch, Pouch, Hoodie
		 • Concern around if Cloudant doesn’t work on it no-one will
		 • Jan: weekly news taught me there is a *lot* going on in
Couch land
		 • Need more design-level discussions to happen in the open

8. Move _-fields out of JSON
		 • add a future-proof top-level `_meta` field (final name TBD)
for future metadata extensions
		 • Paul: adding any field means we have to think about
replication to old clients because they throw errors for any unknown _
field. But once we have _meta we’ll be in a better place

9. VDU in Mango
		 • Yes.

10. Rebalancing
		 • Jan: this is Couchbase’s killer feature

11. Proper _changes for views
		 • Jan: the idea is you can listen to a subset of changes to a
database
		 • Jan: could be a basis for selective replication
		 • Jan: Benoit wrote this as a fork for a customer
		 • Garren: with HTTP/2 can you stream the changes?
		 • All: yes
		 • Adam: there are so many implementations of this that allow
every possible use case
		 • Adam: can we just say “I want live refreshes?”
		 • Jan: let’s punt on this
		 • Rnewson: we know when we’re updating a view, we could just
send that out
		 • Possibly create a “live view”: basically like a changes feed,
but without the guarantee of a full history; instead you get updates since
you joined.
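One way to picture a view-scoped changes feed is filtering each change through the view's map function and emitting it only when the view produces a row; this is a sketch of the idea, not the proposed implementation:

```python
# Illustrative only: derive a "changes for a view" feed by re-running a
# toy map function over each changed document.

def view_changes(changes, map_fn):
    for change in changes:
        if any(True for _ in map_fn(change["doc"])):
            yield change

def by_type_task(doc):  # a toy map function emitting one row per task doc
    if doc.get("type") == "task":
        yield doc["_id"], None

changes = [
    {"seq": 1, "doc": {"_id": "a", "type": "task"}},
    {"seq": 2, "doc": {"_id": "b", "type": "note"}},
]
assert [c["seq"] for c in view_changes(changes, by_type_task)] == [1]
```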

12. Redesign security system
		 • Gregor: what initially drew me to Couch was that it had auth
built-in, very nice for building simple apps. As we made progress though we
found we just rewrote auth ourselves and used CouchDB only for DB
		 • Rnewson: hard to converge, no one clear winner
		 • Adam: implications on the Cloudant side as well. Cloudant
needs to align internally with the IBM Cloud Identity & Access Management
system. Need to be careful about chasing a moving target.
		 • Rnewson: Cowboy does give a clean API for this
		 • Nolan: big thing with current auth is that users frequently
get stuck on basic rather than cookie auth
		 • Garren: how to do offline-first auth?
		 • Nolan: no good answer right now
		 • Gregor: frequently users get invalidated towards the backend
asynchronously. Users continue working locally, become un-authenticated.
It’s more of a UI problem
		 • Gregor: ideally users don’t need to login at all to use the
app. Can use it locally w/o login
		 • Jan: How does this impact Pouch?
		 • Nolan: something streamlined would make the most sense, but
it’s okay if it’s not on by default. Users should be able to get up and
running quickly, but also graduate to a production app
		 • Rnewson: we’re also planning to do secure by default
		 • Jan: it should be easily tweakable whenever something new
like JWT comes along
		 • Garren: What does Mongo do?
		 • Jan: No other DB does this
		 • Rnewson: Or nobody uses it
		 • Jan: Like password forget. Doesn’t exist
		 • Adam: no upside to increasing complexity on DB backend
		 • Gregor: It’d be good to have a simple lowest-common
denominator (user/pass) in order to get going. Everything beyond that… e.g.
how do I get a session without a password? In Hoodie we implemented
CouchDB’s algo in JS
		 • Rnewson: well we’re not going to do unsecured by default
anymore
		 • Adam: most of this will probably not make it into Cloudant’s
API
		 • Garren: Given we don’t have 1000s of devs, it does help us
not have too much on our plates
		 • Adam: if there were a plugin to make this easy… in any case
it shouldn’t be the focus of Couch
		 • Nolan : it does play to couch’s strengths - the “http
database”
		 • Jan: yeah not trying to diminish but it would be a lot of
work… but Hoodie should not have to do so much work
		 • Adam: agree… could smooth over rough edges
		 • Nolan: maybe should wait for the plugins discussion then
		 • Break time

13. Mobile-first replication protocol
		 • Jan: When replication protocol was first designed this
wasn’t really a concern
		 • Jan: HTTP/2 fixes many of these issues
		 • Jan: may also need a way to do tombstone-less revisions
		 • Nolan: 80/90% of problems solved by HTTP2
		 • Testing would be great to have a Docker image with a HTTP2
proxy
		 		 • Even if it showed no improvement, that would not mean it isn’t worth doing.
		 • Revisit
		 • Nolan: primary use case for PouchDB is mobile, poor network
conditions. Currently HTTP 1 algo is very chatty, users complain about it
		 • Nolan: Need to validate with an HTTP/2 wrapper to see
improvement.
		 • Rnewson: Doesn’t disprove though. But might prove benefit.
14. Tombstone-less replication / tombstone deletion in database
		 • Rnewson: we see this a lot with Cloudant, often folks don’t
want deletions to replicate. It’s there for a good reason, there’s a
massive use case, but  it shouldn’t apply to everyone. There are use cases
where people want to delete data from a database. We’re starting to
implement some stuff already in Cloudant. Would prefer for it to be in
CouchDB. We’re implementing a clustered version of purge, currently only
way to do this. Might be driven by views. It’s hacky. Need a solution where
we say “don’t need to sync any further.”
		 • Gregor: revision history is different from purging
		 • Rnewson: what we’re doing is making purge first-class
		 • Gregor: from our position purging is a major thing. We just
don’t do it. If you want to end the replication and then share… it’s a
problem. Purging is what we need.
		 • Nolan: we don’t implement purge in PouchDB
		 • Rnewson: new one is just a clustered version of old one
probably. Needs to be safe across shards.
		 • Nolan: we don’t implement it because it’s hard. Have to make
sure views don’t show purged data
		 • Rnewson: similar reasons it’s hard across shards
		 • Rnewson: need to let others interact and not impact by
purged data
		 • Jan: replication could automatically skip tombstones
		 • Rnewson: should be able to add checkpoints and say ignore
deletions before this
		 • Chewbranca: what about replicated purge log?
		 • Rnewson: that’s what this is
		 • Chewbranca: exposed as a clustered level?
		 • Paul: no. not impossible to add it, but counterintuitive.
Kind of out of scope. Didn’t get into it. Changing the external HTTP
replicator frightens me. Lots extra to do. PouchDB compat etc.
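The purge semantics under discussion can be sketched in miniature: unlike a delete, a purge removes the document and its history outright, records the purge in a log, and secondary indexes must drop any rows derived from it. Everything here is illustrative, not CouchDB internals:

```python
# Toy purge: no tombstone remains, views must not show purged data,
# and a purge log records what happened so other components can react.

def purge(db, index, purge_log, doc_id):
    db.pop(doc_id, None)          # remove doc and history, no tombstone
    purge_log.append(doc_id)      # record the purge for downstream consumers
    index[:] = [row for row in index if row[0] != doc_id]  # scrub view rows

db = {"a": {"v": 1}, "b": {"v": 2}}
index = [("a", 1), ("b", 2)]
log = []
purge(db, index, log, "a")
assert "a" not in db and index == [("b", 2)] and log == ["a"]
```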
15. Auto conflict resolution
		 • Jan: People don’t like writing conflict algorithms
		 • Paul: Hard when people delete data and then it replicates
back. Hoping purge will help.
		 • Rnewson: question is: how to do such that it doesn’t cause
conflicts between peers. Given same input need same output
		 • Jan: and no loops
		 • Rnewson: yes needs to converge
		 • Jan: CRDTs?
		 • Rnewson: yes
		 • Rnewson: idea is that things in CouchDB would have defined
solution to the conflict problem
		 • Rnewson: could say if you want to do it, conflicts need to
be first class
		 • Jan: there’s a paper on this. Read it on the flight. Nice
attempt but not production ready. Unbounded index problem.
		 • Chewbranca: CRDTs aren’t suitable to all data types. Typo in
two places in same doc, CRDTs can’t tell you what to do
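The convergence requirement above (same input, same output on every peer, no loops) can be illustrated with a deterministic resolver. "Deletion always wins" comes up later in the discussion; the rev-based tie-break here is an illustrative assumption:

```python
# Sketch of a deterministic conflict resolver: given the same set of
# conflicting revisions, every peer must pick the same winner so peers
# converge without creating new conflicts.

def resolve(revisions):
    deletions = [r for r in revisions if r.get("_deleted")]
    if deletions:
        return max(deletions, key=lambda r: r["_rev"])  # deletion trumps
    return max(revisions, key=lambda r: r["_rev"])      # deterministic tie-break

revs = [{"_rev": "2-a", "x": 1}, {"_rev": "2-b", "_deleted": True}]
assert resolve(revs)["_rev"] == "2-b"  # same answer on every peer
```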
16. Conflicts as first-class
		 • Paul: conflicts as first class talks about conflicts as a
graph instead of a tree. Resolution may introduce conflicts. I’d like to be
able to fix those.
		 • Gregor: what we’ve seen the most is one revision is a
delete… should always win. Weird when you delete something then it comes
back due to a conflict.
		 • Rnewson: idea of graphs is you say they join back… not just
branching trees
		 • Adam: you deleted one branch in server, one in client.
Deletion trumps
		 • Adam: it’s like a git merge. Person working on branch can
keep working. Changes from “undeletion, what the hell?”
		 • Chewbranca: need easier way to expose this to clients…
conflicts as first class doesn’t discuss this
		 • Rnewson: I think it’s an unpleasant experience when
conflicts are hidden until they scream. Exposing them is worse. Focus on
things that let you actually resolve the situation once you encounter it.
		 • Chewbranca: we need to be more aggressive. User can use
couchdb for a long time and never know about conflicts. Never have that
info exposed.
		 • Jan: what about always including conflicts
		 • Rnewson: nice middle ground. At least you’d notice the
problem
		 • Nolan: isn’t this going into Fauxton?
		 • Rnewson: I think it is. But it’s also just a nudge
		 • Gregor: we are ignoring conflicts mostly. But it’s a good
feeling that we’re building something that does everything right so we can
handle conflicts later. Something I’d wish is to make it easier to register
app-specific conflict resolution algos. E.g. deletion always wins.
		 • Rnewson: need to decide whether we just get rid of
conflicts. I think we’re stuck with them
		 • Chewbranca: what if you’re not doing backups
		 • Rnewson: That’s not where we’re going… want to give more
capabilities to handle it. There’s the graph so we can contract the tree,
and then some javascript function to do automatic conflict resolution.
Maybe some built-ins like we have for reduce. With enough docs this could
go away.
		 • Chewbranca: it’s also easy to make that conflicts view.
Should promote cron jobs
		 • Rnewson: don’t want to say “our killer feature is
multi-master replication… btw don’t use it”
		 • Chewbranca: I mean make it easy for the user to find it
		 • Jan: will revisit this
		 • Rnewson: the fundamental pieces are right. We replicate
around until something resolves it. We’ve done very little around tooling
and support and visibility.
		 • Chewbranca: Configurable “max conflicts”? 10 conflicts and
you’re done?

17. Selective sync
		 • Jan: selective sync is also big topic
		 • Jan: typical thing is “I want to build Gmail in Couch/Pouch
but I can’t”
		 • Jan: something could get archived after n days
		 • Chewbranca: you could accomplish a lot of that with view
filters for replication
		 • Chewbranca: timestamp view to query for a particular month,
replicate this view
		 • Paul: I was thinking of it as a filter more than as a view.
As Dale pointed out we can do it with filters
		 • Chewbranca: replication always walks entire changes feed
		 • Nolan: does that handle “sliding window”?
		 • Jan: yes
		 • Gregor: What about archiving? I always thought we could make
this work using filters. I could say “archive every doc with that property”
and those docs wouldn’t be synchronized, but someone said it doesn’t help
because it still starts from the beginning.
		 • Adam: you also want to start from newest and go back
		 • Nolan: this affects npm replication… old packages replicated
first
		 • Adam: we see this with gaming as well. Oldest stuff
replicated first
		 • Adam: what we’re realizing is that this particular “new to
oldest” replication is generally useful
		 • Adam: could also stop when we get to a certain age
		 • Rnewson: or mango… as soon as it stops matching, stop
replicating
		 • Adam: not saying reading changes backwards is easy… some
tricks
		 • Gregor: another thing, let’s say I open my email, first
thing I want to see is an overview, the subject and the first few
characters, meta-information. Usually this would be a view. But I want to
sync this view first and show it while the sync is still going.
		 • Jan: view could be server-side
		 • Jan: could be in a library though, not in core Couch
		 • Chewbranca: you can do this today if you know the update
sequence. The tricky bit is discovering the update seq. Another option is
make it easier to find that seq.
		 • Nolan: can do this today by forking pouchdb-replicate
package
		 • Adam: hardest is how to resume
		 • Chewbranca: could do newest to oldest with a limit, only
fetch the first 1000
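The "sliding window" and newest-to-oldest ideas above can be sketched together: walk changes newest-first and stop at a cutoff or limit rather than replaying the whole feed. Field names and the timestamp cutoff are illustrative assumptions:

```python
# Toy newest-first selective sync: stop once docs fall outside the
# window or a limit is reached, instead of walking the full changes feed.

def sliding_window(changes_newest_first, cutoff_ts, limit=1000):
    out = []
    for change in changes_newest_first:
        if change["doc"]["ts"] < cutoff_ts:
            break                 # everything older can be skipped entirely
        out.append(change)
        if len(out) == limit:
            break                 # "only fetch the first 1000"
    return out

changes = [{"doc": {"_id": i, "ts": t}} for i, t in enumerate([50, 40, 30, 20])]
assert [c["doc"]["ts"] for c in sliding_window(changes, cutoff_ts=30)] == [50, 40, 30]
```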
18. Database archival
		 • Jan: Database archival… ignore this for now
		 • Chewbranca: want to export shards
		 • Joan: lots of people want rolling databases, or create a new
one every month, have to update their apps to use different db names. Could
also solve that problem. Old stuff goes away, aka sliding window.
		 • Adam: you got it
		 • Rnewson: so it partitions the database in some way and then
that drops out of the live db?
		 • Joan: let’s not bake in a specific backend. Could have
scripts for that
		 • Jan: need a format like mysqldump
		 • Jan: I like the idea of streaming that to a new DB
		 • Jan: In a couchdb 1 world can just read the couch file
		 • Adam: the use case Joan described is a telemetry store.
Recent data in Couch. Want to keep it, but as cheap files on disk. That’s a
continuous process. Like to have a TTL on the document. That’d be different
from just exporting the DB
		 • Chewbranca: should talk about rollups for telemetry.
Metadata, hourly. Very common feature
		 • Adam: less common. I don’t think Couch should try to do it
		 • Chewbranca: I think it’s a good feature. But we can skip it
		 • Jan: let’s skip for now
		 • Jan: we agree we want something. Got some ideas
		 • Adam: “streaming” archive of documents that have outlasted
the TTL may be a different thing than a one-shot bulk archive. Both could
hopefully use the same format.
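The continuous TTL-driven archival described above could look roughly like this; the `_ttl` field and line-per-doc dump format are placeholder assumptions (the notes only ask for "a format like mysqldump"):

```python
import io
import json

# Toy continuous archival: documents past their time-to-live are streamed
# to a cheap on-disk format and removed from the live database.

def archive_expired(db, now, archive_file):
    expired = [d for d in db.values() if d.get("_ttl", float("inf")) <= now]
    for doc in expired:
        archive_file.write(json.dumps(doc) + "\n")  # one JSON doc per line
        del db[doc["_id"]]
    return len(expired)

db = {"a": {"_id": "a", "_ttl": 100}, "b": {"_id": "b", "_ttl": 200}}
buf = io.StringIO()
assert archive_expired(db, now=150, archive_file=buf) == 1
assert list(db) == ["b"]
```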
19. DB update powered replicator
		 • Jan: replicator database… not everything needs to be live
until written to
		 • Rnewson: problem is the scheduler? Might define 1000 jobs at
once? We’re working on that. Big project we started just before 2.0 was out
the door. Started in the Apache repository.
		 • Adam: Jan’s also talking about being able to drive
replication to a large number of DBs
		 • Rnewson: it’ll decouple the declaration of replication doc
from when it runs
		 • Rnewson: should include Jan in db core team to talk about
scheduler
		 • Rnewson: “replication scheduling” maybe?
		 • Gregor: I’d like to have 10000 databases with 10000 defined
replications
		 • Rnewson: exactly the problem we’re tackling
		 • Rnewson: scheduler has a thing where it examines work and
sees if it’s worth running, e.g. if something hasn’t changed. It’s not that
smart yet
		 • Chewbranca: event-based push replication? How about that?
		 • Rnewson: perhaps, it’s in the roadmap. Say you’re Cloudant,
we have lots of accounts, every cluster gets its own connection, that’s
silly
		 • Chewbranca: yes but also could incorporate into doc updates.
If there are outgoing replications, post out directly
		 • Rnewson: I dunno
		 • Rnewson: that’s what the db updates piece is. There and then
it tells the scheduler it’s worth replicating
		 • Rnewson: we care about wasted connections and resources. Want
to avoid the situation where a database I’ve hosted somewhere keeps a
replication job alive even though it hasn’t updated. Stop those jobs
entirely. Timebox them.
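The scheduler behaviour sketched in this section (only start jobs whose source has changed since their last checkpoint, and bound concurrent work) might look like this; all names are illustrative, not the actual scheduler's API:

```python
# Toy replication scheduler: skip jobs whose source is unchanged since
# the job's checkpoint, and cap how many jobs run at once.

def jobs_to_run(jobs, source_seqs, max_jobs):
    runnable = [j for j in jobs
                if source_seqs.get(j["source"], 0) > j["checkpointed_seq"]]
    return runnable[:max_jobs]  # timebox/limit concurrent work

jobs = [
    {"id": "r1", "source": "db-a", "checkpointed_seq": 10},
    {"id": "r2", "source": "db-b", "checkpointed_seq": 99},
]
seqs = {"db-a": 42, "db-b": 99}  # db-b unchanged since its checkpoint
assert [j["id"] for j in jobs_to_run(jobs, seqs, max_jobs=5)] == ["r1"]
```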
20. Consistent databases
		 • Jan: consistent databases, will skip
		 • Rnewson: let’s talk about it
		 • Adam: databases that never have conflicts. Only exactly one
version of document
		 • Rnewson: strong consistency
		 • *not sure*: Opposite of what Couch does today then?
		 • Garren: like RethinkDB
		 • *crosstalk*
		 • Jan: Nick brought this up
		 • Rnewson: what you do need is eventually consistency. 10
nodes is 10 separate configs
		 • Chewbranca: lack of eventual consistency is real problem
		 • Rnewson: can solve that as with the dbs database
		 • Adam: we have such weak query capabilities across databases.
If it were db level it might be a fairly common use case, 99% of the
records in particular DB can be eventually consistent. Documents with
particular attributes could be targeted. Could push it to the doc level
		 • Adam: could guarantee for certain docs with certain
attribute that they never have conflicts
		 • Chewbranca: I think it’d be good for the dbs db to be
consistent. One way or another that’s a major problem. Conflicts in the dbs
db is terrible
21. Pluggable storage engine
		 • Jan: next: pluggable storage engine
		 • Paul: almost done. Need to test with before and after
pluggable storage engines in the same cluster for rolling reboots. Been a
day and two time zones since I looked at it. Had a bug in the test suite.
Getting ready to pull the mega PR button trigger.
		 • Paul: been confusing recently. Basically it’s a refactor of
the internals to give us an API. No new storage engine. Alternate
open-source implementation to prove it’s not over-specified. Merging would
create a couple config things. Goal is to let people play with it. No new
storage engine. No changing of data. All old dbs still work fine.
		 • Jan: if we can document this well we can get lots more
Erlang folks
		 • Nolan: we do this in Pouch, it’s not recommended though to
use Mongo etc.
		 • Joan: is Paul’s thing open source?
		 • Paul: I used a nif (?) to do the file I/O, couple file
optimizations, want to minimize number of times we write doc info to disk.
Uses three files per DB. Took me two days. Close to the legacy storage
engine but sufficiently different to prove API isn’t overly specified. Will
show in PR. Couple corner cases.
		 • Jan: opportunity for a storage engine that doesn’t trade
everything for disk space, but has the consistency. May be okay for certain
types of uses.
		 • Paul: lots of cool things to play with. Per-database
encryption keys instead of filesystem encryption. In-memory for testing and
playing
		 • Jan: please
		 • Paul: as soon as we have the API we can do lots of things
		 • Garren: interesting to compare to Couchbase
		 • Rnewson: when will the PR be ready?
		 • Paul: hopefully next week. Need to rebase. Test suite
passes. Want to set up a cluster with and without just to make sure. Set up
a mixed cluster.
		 • Adam: need to do something about attachments? Needs to store
arbitrary blobs.
		 • Paul: for attachments that is the only optional thing I
wrote in to it. If you have a storage engine that doesn’t store attachments
you can throw a specific error. Otherwise it’s an abstract API mimicking
how we do things now
22. Mango adding reduce
		 • Jan: Mango adding reduce
		 • Jan: goal is to add default reducers to Mango
		 • Rnewson: isn’t the goal with Mango that it looks like Mongo?
		 • Adam: doesn’t have to
		 • Garren: we keep seeing Mango/Mongo, is there a different
query language we want to do?
		 • Jan: database query wars?
		 • Jan: either mango or SQL
		 • Nolan: been pushing Mango for IDB at W3C. tell me if you
hate it
		 • Rnewson: don’t hate it, but goal is to make Couch more
accessible
		 • Jan: I’m totally fine
		 • Rnewson: just saying are we cleaving to this. Is this why
Mango exists, because of Mongo?
		 • Rnewson: similar but not identical is okay
		 • Chewbranca: reason we don’t want to promote Mango?
		 • Jan: we’re doing it
		 • Adam: it’s got a bunch of traction behind it
		 • Chewbranca: if it’s good enough, we should go with it
		 • Garren: it can only do a certain number of docs though?
		 • Paul: there is a limit. We talked other day about sort.
There will be a limit for that as well. Biggest downside of Mango is people
think it’s smarter than it is. Comes from Mongo. E.g. their sort has a 32MB
cap. Works until it doesn’t.
		 • Jan: this is about declarative form of existing reduce
		 • Jan: mango is currently only a map function
		 • Garren: best way to learn how people use Mango is to see
pouchdb-find issues. People start using it, then they ask questions. Once
you know Mango is map/reduce with sugar then you kind of get it. But if you
don’t get it you struggle. Making it more intuitive saves me time.
		 • Rnewson: yeah people assume it’s like Mongo
		 • Jan: why isn’t this the primary way to interact with Couch.
We need to take it seriously
		 • Rnewson: we should be saying that
		 • Jan: we had an out-of-the-blue contribution to this recently
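The goal named at the top of this section is a declarative form of the existing reduce step. As a sketch, CouchDB's built-in reducers for map/reduce views (`_count`, `_sum`) could be applied to Mango index rows; the function below is a toy, not the proposed API:

```python
# Toy default reducers over (key, value) index rows, mirroring the
# built-in view reducers _count and _sum.

def reduce_rows(rows, reducer):
    values = [v for _k, v in rows]
    if reducer == "_count":
        return len(values)
    if reducer == "_sum":
        return sum(values)
    raise ValueError("unknown reducer: " + reducer)

rows = [("a", 3), ("b", 4), ("c", 5)]
assert reduce_rows(rows, "_count") == 3
assert reduce_rows(rows, "_sum") == 12
```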
Break
		 • Some talk about Couch-Chakra: seems Chakra would be easy to
embed, runs on Windows (ARM/x86/x64), Ubuntu (x64), MacOS (x64):
https://github.com/Microsoft/ChakraCore
23. Mango: adding JOINs
		 • Jan: fake the pattern of joining documents, once we have
that (foreign doc idea) could also have a foreign view key
		 • Jan: linked document is the first thing
		 • Chewbranca: could potentially get view collation in Mango as
well
		 • Nolan: is there anything in MR we don’t want to add to
Mango?
		 • Jan: arbitrary JS, aside from that…
		 • Nolan: people do want computed indexes though
		 • Jan: people want to define a mango index that you can run a
query against, but what goes into the index is limited by the expression…
people want to write js/erl etc. with a map function but they query with
mango. Given they use the right schema to produce the right data
		 • Paul: mango applies start/end key automatically. When you
get into computed I don’t know how that would work. Could scan the entire
index or alias it
		 • Paul: key is an array with the secret sauce that maps to an
index. Selector has “age greater than 5” it knows that the first element of
the array is age
		 • Jan: whatever the custom map function is
		 • Nolan: issue 3280. Garren you didn’t find this compelling?
		 • Garren: not sure what they’re trying to achieve
		 • Jan: regardless there’s a bunch of stuff we can get to later
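Paul's point above, that a selector maps onto an index whose key is an array (e.g. "age greater than 5" becomes a start key on an index whose first element is age), can be sketched as a toy translation; the sentinel end key and `$gt`/`$eq` handling are illustrative assumptions:

```python
# Toy translation of simple Mango-style selectors into a key range over
# an index sorted by index_fields. Illustrative only.

def selector_to_range(selector, index_fields):
    startkey, endkey = [], []
    for field in index_fields:
        cond = selector.get(field, {})
        if "$gt" in cond:
            startkey.append(cond["$gt"])
            endkey.append({})  # "high" sentinel: sorts after scalar values
        elif "$eq" in cond:
            startkey.append(cond["$eq"])
            endkey.append(cond["$eq"])
    return startkey, endkey

assert selector_to_range({"age": {"$gt": 5}}, ["age"]) == ([5], [{}])
assert selector_to_range({"age": {"$eq": 30}}, ["age"]) == ([30], [30])
```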
24. Mango result sorting
		 • Jan: result sorting
		 • Jan: question is I’m reducing a thing, sort by the reduce
value. Standard databasey stuff. Problem of being an unbounded operation.
Current policy is: CouchDB doesn’t have any features that stop working when
you scale. Current non-scaling features are no longer being worked on. Want
to be more dev-friendly though.
		 • Garren: problem is it’s cool, people will use it, but you
have to limit the results. But if you don’t put it in people will get
frustrated
		 • Paul: algorithmically we don’t provide a way to do it inside
of the DB. Nothing we can do inside of the cluster user can’t do locally.
Sort by age? Can’t paginate through that. Have to buffer the entire result
set. Only pertains when you don’t have the sort field indexed.
		 • Rnewson: all these Mango additions are features that work
cosmetically. Not a Mango thing so much as a view layer thing. Needs to
scale. If I have a million things in my view, and I want it sorted by
another field, don’t want server to run out of memory. Or we don’t do it.
		 • Chewbranca: why not make another view?
		 • Rnewson: what you find in other DBs is new kinds of index
structures. Elaborate. All the magic. We don’t do any of that. Maybe we’re
getting away with it and that’s a good thing
		 • Jan: classic example is game scoring
		 • Rnewson: want to sort by value
		 • Rnewson: maybe multi-view, probably a new format.
		 • Nolan: maybe changes on views solves this? Then the view is
a DB, can do views of views
		 • Rnewson: problem of tombstones though. Also we said we
wouldn’t do this
		 • Nolan: that’s how pouchdb map/reduce works today
		 • Jan: could cover this in the tombstone discussion
		 • Rnewson: having it as replication source, that’s a sharp end
of the impl. There’s a lot less we could do for changes feed of the view
		 • Chewbranca: nothing preventing us from doing double item
changes
		 • Rnewson: there’s bits and pieces for this that we said we’re
not going to do because it’s complex, so should excise
25. Bitwise operations in Mango
		 • Jan: bitwise
		 • Jan: user request, like a greater than, equals, you also
want bitwise operations
		 • Rnewson: fine
26. NoJS mode
		 • Jan: noJS mode
		 • Jan: consensus around table already probably
		 • Jan: would be nice to use 80% of CouchDB without JS, using
Mango as primary query, won’t include document update functions
		 • Jan: update functions will go away because we have
sub-document operations. Just query and validation functions
		 • Joan: one of the things I’ve seen update functions used for
that this proposal doesn’t resolve is server-added timestamps to the
document
		 • Jan: not a fan. It’s still optional. You don’t have to go
through the update function to put the document. App server can do this
		 • Rnewson: is it important for noJS mode? Could make that a
feature.
		 • Joan: it is certainly one option. Been talked about a lot.
Auto timestamps.
		 • Garren: what about clustering?
		 • Rnewson: not important for multi-master replication
		 • Jan: have different TSs on different nodes
		 • Rnewson: has to be passed from coordinating node down to the
fragment
		 • Jan: there’s a Couchbase feature coming where the clients
know which shard to talk to, reducing latency. Same could be done for us
		 • Chewbranca: tricky thing here is supporting filter functions
		 • Garren: can’t do it with mango?
		 • *crosstalk*
		 • Jan: can do that already
		 • Rnewson: not doing the couchappy stuff
		 • Garren: need a baseline. What do we need that can run
