couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: [DISCUSS] A direction from a non-contributor
Date Thu, 11 Jul 2019 08:59:46 GMT
Hi Chintan,

> On 10. Jul 2019, at 18:25, Chintan Mishra <chintan@rebhu.com> wrote:
> 
> On 09/07/19 9:33 PM, Joan Touzet wrote:
> 
>> Hi Chintan,
>> 
>> Reading through your proposal, I have one main point to make.
>> 
>> At the Apache Software Foundation, the people who lead the projects are
>> the people who do the work on them. We use the wrong word "meritocracy"
>> to explain this principle; a better word would be "do-ocracy."
>> 
>>   http://www.apache.org/foundation/how-it-works.html#decision-making
>>   https://incubator.apache.org/guides/participation.html#as_a_developer
>>   https://communitywiki.org/wiki/DoOcracy
>> 
>> That means that your project can completely proceed on its own if it
>> wants to; the only thing over which you're not in control is whether
>> that project gets to call itself CouchDB or not. That decision is
>> reached by the people who have built CouchDB into what it is today.
> I appreciate that you shared these links. I now understand what I have to do next.
>> -----
>> 
>> On that last point, there's a lot that would need to be done for you to
>> convince the PMC that your vision is the one, true future of CouchDB.
>> 
>> What you propose is both a significant rewrite, as well as requiring an
>> entirely new set of skills from the developer base (Rust, MQTT, Kotlin,
>> Swift).
> 
> From Slack conversations, it appears the community has some inclination
> towards building a Rust-based CouchDB some day. As for the other
> technologies, those changes are not happening today. I do not propose to
> start with all the changes at once. The storage engine is a good place to
> start.

Since I brought it up in Slack, let me clarify: I do not suggest that
we should move CouchDB to Rust today or any time later.

What I am suggesting is that we should look at the things required to
support your idea of an IoT-capable CouchDB-like thing. My suggestion
is to not change CouchDB, but to make a new CouchDB compatible project.

Devices are only getting smaller, so a lower-level language is needed
to ensure performance and good battery use. That leaves C, C++, Go and
Rust.

When I look at which people I could likely excite to contribute to
such a thing, in my filter bubble that is folks getting into Rust
and the rest of the Rust community.

If it ends up being Go, C or C++, because someone who runs with this
prefers those, I don’t really mind.

* * *

In particular, we should look at a more detailed IoT use-case and how
CouchDB can help.

Correct me if I’m wrong, but this is mostly about devices with sensors
generating measurements over time that should be aggregated into a
cloud service for analysis.

In that world, a hypothetical API for an IoT app using our new RustyCouch
could look like this:

db = RustyCouch.open('file.foo')

db.save(measurement)

db.push('https://cloud.measurements.com')

repeat

This is a very small subset of the CouchDB API, but it would cover the
majority of your billion IoT use-cases.

There are a few things to be considered about data persistence and
concurrency control, but in another email, you already mentioned
SQLite, which solves most of those for you already.

db.save() would generate a JSON document with a uuid as _id and
corresponding _rev and an entry in an index that allows us to query,
at a later point: in what order were these docs written, which we are
going to need for db.push()
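
A minimal sketch of what db.save() could do, assuming SQLite as the
local store. The table layout, the function names and the way the
_rev hash is derived are illustrative assumptions, not an existing
RustyCouch API; CouchDB computes its own revision hashes differently.
The autoincrementing seq column plays the role of the "in what order
were these docs written" index.

```python
import hashlib
import json
import sqlite3
import uuid


def open_db(path):
    """Open (or create) the local store; table and column names are assumptions."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"  # write order, read later by push
        " id TEXT NOT NULL UNIQUE,"
        " rev TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    return conn


def save(conn, measurement):
    """Store one measurement as a new JSON document and return (_id, _rev)."""
    doc_id = uuid.uuid4().hex
    body = json.dumps(measurement, sort_keys=True)
    # First revision: generation 1 plus a content hash (hash choice is an
    # assumption made for this sketch).
    rev = "1-" + hashlib.sha256(body.encode("utf-8")).hexdigest()[:32]
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO docs (id, rev, body) VALUES (?, ?, ?)",
            (doc_id, rev, body),
        )
    return doc_id, rev
```

Calling open_db(":memory:") gives a throwaway store for experiments;
a file path gives the persistent variant.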

db.push() then opens that index, checks with the cloud which docs
it already has (as per the standard CouchDB replication protocol)
and then sends all local docs that aren’t on the cloud yet in a 
couple of _bulk_docs batches.
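
The two requests behind db.push() could be sketched like this. It
only builds the JSON bodies for the _revs_diff and _bulk_docs
endpoints of the CouchDB replication protocol and leaves out the
HTTP calls and replication checkpoints. The function names and the
(seq, id, rev, body) row shape are assumptions made for this sketch.

```python
import json


def revs_diff_request(rows):
    """Build the body for POST /{db}/_revs_diff from (seq, id, rev, body) rows."""
    return {doc_id: [rev] for _seq, doc_id, rev, _body in rows}


def bulk_docs_batches(rows, revs_diff_response, batch_size=100):
    """Yield _bulk_docs request bodies for the docs the target reported missing."""
    batch = []
    for _seq, doc_id, rev, body in sorted(rows):  # keep local write order
        missing = revs_diff_response.get(doc_id, {}).get("missing", [])
        if rev not in missing:
            continue  # the target already has this revision
        doc = json.loads(body)
        doc["_id"] = doc_id
        doc["_rev"] = rev
        batch.append(doc)
        if len(batch) == batch_size:
            # new_edits=false asks the target to keep our _rev values as-is
            yield {"docs": batch, "new_edits": False}
            batch = []
    if batch:
        yield {"docs": batch, "new_edits": False}
```

Each yielded body would be sent as one POST to the target's
_bulk_docs endpoint; a small batch_size keeps memory use low on a
constrained device.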

Voilà, a low-level, embeddable library that allows you to sync
stuff to CouchDB.

This is a scope that a single developer could make a prototype
of, even in a language that you are just starting out with.

With this in hand then, the next step is to talk to the folks
who build IoT platforms and applications to see if they want to
use something like that.

And once we have this, we can talk about changes to the replication
protocol.

* * *

If you want to take this further, and make a library that also
supports interactive querying for, say, native applications on
phones and watches and whatnot, you already have a decent
foundation, but you’ll have a little more work to do.

* * *

But none of this requires changing CouchDB itself, or a 10 year
effort of porting something, while solving all the needs you
have.

* * *

Finally, I’d like to caution against being flippant about the
current project direction with FoundationDB. This is something
the team that has been doing this for over 10 years looked at
“in-depth” and decided it is the right thing to do.

The alternative would be to build a FoundationDB-like thing
ourselves, which is a multi-million dollar investment that
I haven’t seen anyone commit to at the moment.

In particular, I’m one of the champions for smaller CouchDB
installations in this project, and moving forward is always
a give-and-take. We are not in a position yet to gauge what
the problems are with an FDB-Couch for a single-node instance
but I’m sure going to work hard on making it easy for our
downstream users.

I’m the maintainer of the Mac binaries which are extremely
popular. Any database that can’t be set up with a download,
unzip and double click to start to get a dev environment is
going to have trouble attracting new developers. So I’ll make
sure we can retain this experience as much as possible.

* * *

Let’s be pragmatic and consider incremental change or small-scope
side projects to move this forward. Grand visions, in my
experience, almost never work out. The only reason I have
trust in an FDB transition is because someone with authority
and budget said “the team that ostensibly built CouchDB 2.x
is going to do this”. That’s the only way it could possibly work.

And don’t mistake my RustyCouch suggestion for being
dismissive or sidelining. I’ve wanted something like this since
about 2008, and many people have made various attempts,
so my suggestion above is very serious *and* fed by the
experience of all these failed attempts.

CouchDB’s strength is its replication protocol. We didn’t
rewrite CouchDB in JavaScript because we suddenly realised
there are a billion browsers, but PouchDB came along with
a compatible data model and replication engine so that the
two projects complement each other perfectly, and anyone on
the CouchDB team will tell you that PouchDB is one of the biggest
drivers of CouchDB adoption.

How about we just re-run this strategy for IoT: build a
small thing that is useful for one use-case and make it work,
then make it more complicated to be useful for more use-cases.
At each point, make sure replication with CouchDB works.
That’s a winning strategy. We already know it.

Best
Jan
—


> 
>>  It is in direct competition with the proposal being worked on
>> this list for the FoundationDB backend swap. With the addition of MQTT,
>> it sounds like the entire replication protocol and methodology would
>> need to be revisited, as the semantic changes you're proposing would
>> break existing client replication.
> The HTTP replication protocol more or less remains the same in the
> foreseeable future. A new MQTT replication strategy will be built upon the
> existing method. The two will not work in parallel: each database will use
> one or the other.
>> Finally, the proposal to push into
>> the mobile space would directly compete with our sister project PouchDB,
>> who have put in tens of thousands of development hours as well.
> The community will evolve at some point. And bringing people from the
> sister project onto CouchDB will allow faster development. The diagram in
> the proposal missed a part for Web Browser based CouchDB. This missing
> part is an interface for JavaScript and CouchDB-Web Browser. So, we will
> need some JavaScript developers too. And they can help improve Fauxton.
>>  This all
>> adds up to a much bigger scoped project than CouchDB is today, and I
>> daresay may be bigger than I think even you realize.
> 
> I do realize that I want CouchDB to be in a billion mobile and embedded
> devices by 2025. I understand this is a challenging scale. I brought this
> here because I see how much we need a DB for a "Cluster Of Unreliable
> Commodity Hardware". I assume the proposed path will take somewhere
> between 18 and 21 months to come to fruition for a team of 15 people
> working 40 hours/week.
> 
>> With my PMC hat on, I have to ask:
>> 
>> * Do you already have developers versed in these skills you can bring to
>>   the project (beyond yourself)? Are they ready to commit the 40+ hours
>>   a week each to making it a reality?
> No, I do not have a team in place for this.
>> * Do you have experience in building a distributed system of this scale,
>>   using the specific technologies you propose?
> I have been reading about distributed systems. I want to take up an Open
> Source project which solves the replication problem for devices coming up
> with emerging technologies. CouchDB is the best fit as it already solves
> the problem of replication across remote devices.
>> * How do you plan to convince other developers of your approach
>>   specifically?
> 
> What got us (you) here, won't get us (you) there! -- Marshall Goldsmith
> 
> CouchDB led the way by being years ahead. This is just the same thing
> happening again in a newer market. CouchDB is already great at replication.
> What I am proposing is taking this simple-but-powerful methodology a step
> further and building it for planetary-scale use-cases (an idea derived from
> Lasp-Lang). Here are some ways in which we can draw more developers, users,
> and eyes.
> 
> * Helping users realize that CouchDB lets them relax while building
>   applications for devices with any form factor.
> * Reaching out to the developers who have built their own solution for
>   replicating stuff from their device of any form factor to CouchDB
> * On-boarding developers who will become early adopters and test it out
>   on their IoT devices, thus proving an unmet market need.
> * Promoting offline-first strategy among mobile and embedded
>   developers will drive contributors from these communities.
> * Documenting comparisons with existing mobile and embedded solutions
>   that provide replication, like Realm and Couchbase.
> 
>> * How do you intend to train up our existing developers on the new
>>   languages and technologies involved?
> If people are excited about the future they are building then this is a
> smaller problem to tackle. When and if people in this community come to a
> consensus about the proposal, this can be tackled by 'Each one, teach one',
> as followed by Yamaha Motors. This is a buddy system where people get new
> partners to tackle a problem/PR. They share issues, their understanding of
> the codebase and language, etc. with each other. As buddies rotate,
> everyone gets on the same page after a few cycles. I have found a 3-pair
> buddy system works best in software. But this may differ based on culture,
> language, timezone, and availability.
>> * How do you perceive the advantages and disadvantages of your approach
>>   *specifically* vs. the FDB approach already outlined?
> 
> Proposal: FoundationDB
> 
>   Pros:
>   * Improving what works for majority of existing users
>   * Iterates CouchDB to a better form
>   * Prospect of immediate consistency for ACID transactions
> 
>   Cons:
>   * Losing some small and mid-sized developers
>   * Fragments community
> 
> Proposal: Polyglot-unification
> 
>   Pros:
>   * Growth by tapping newer prospects
>   * Reduces fragmentation of user community and codebase
>   * Reimagines CouchDB as if it was built in 2019
> 
>   Cons:
>   * Tons of work
>   * Uses RocksDB, overlooks FoundationDB migration
> 
> 
> Email with subject "CouchDb Rewrite/Fork" by 'Reddy B.
> <reddy.b@live.fr>' has mentioned some other concerns. This proposal
> introduces a new story for CouchDB. This proposal would require using
> RocksDB instead of FoundationDB.
> 
>> -Joan
>> 
>> On 2019-07-09 10:28, Chintan Mishra wrote:
>>> Hello team!!
>>> 
>>> Years of time and effort help move a product to the heights that CouchDB
>>> has reached. And as a non-contributor, rather a very new CouchDB
>>> user (1.5 years) who failed to find some relevant emails, I came up with
>>> a version of the future for CouchDB that I thought would help us grow.
>>> But Jan and Robert helped me realize that it takes a village to raise a
>>> child (CouchDB). So this is a proposal to find a middle ground between
>>> where we are headed and where the market is going next. The proposal I
>>> wrote was solely driven by what I have read over the years about the
>>> growth of the product and the community. I have attached the file, or if
>>> you prefer reading in a browser, then click here
>>> <https://gitlab.com/snippets/1873543>.
>>> 
>>> It will roughly take 4-5 minutes of your time. A proposed direction is
>>> to start an entirely new project. That is not what I desire. I want to
>>> join the community behind CouchDB not build a new one using it. My goal
>>> from this proposal is to generate leverage by creating early mover
>>> advantage and help grow the community.
>>> 
>>> Thanking you.
>>> 
>>> --
>>> Chintan Mishra
>>> Rebhu Computing
>>> Founder and CEO

-- 
Professional Support for Apache CouchDB:
https://neighbourhood.ie/couchdb-support/

