couchdb-dev mailing list archives

From Chintan Mishra <chin...@rebhu.com>
Subject Re: [DISCUSS] A direction from a non-contributor
Date Thu, 11 Jul 2019 09:55:16 GMT
On 11/07/19 2:52 PM, Jonathan Hall wrote:

> This looks like a good time for me to chime in.
>
> I'm the author of Kivik (http://kivik.io/), which is a CouchDB/PouchDB 
> driver layer for Go/GopherJS.
>
> Since the beginning, I've had an interest in developing additional 
> tooling for CouchDB, primarily to ease development and administration 
> of CouchDB.  But I think some of my goals could also lend themselves 
> quite easily to embedded systems.
>
> This is not to say anyone should prefer my approach over one written 
> in Rust. I'm only offering my own knowledge and perspective.
>
> I've recently had some interest from other Go developers in 
> contributing to the project, so to help explain my vision, I recently 
> wrote a blog post, which could be relevant here, too: 
> http://kivik.io/vision
>
> My long-term vision for Kivik is to allow different pluggable storage 
> backends (currently: CouchDB, PouchDB, and a mock driver, and very 
> limited filesystem- and memory-backed drivers), as well as pluggable 
> "front ends" (currently: whatever custom app you write, as well as a 
> test suite used to test the rest of the stack).
>
> If/when an HTTP server frontend is created, and any backend suitable 
> for an embedded system (such as SQLite, or even the filesystem) 
> reaches maturity, it would become quite possible to run a 
> self-contained Couch-compatible server on a minimalistic system. 
> Perhaps even an MQTT frontend could be created for Kivik, if there's a 
> demand for such a thing.
>
> Of course, there's always the question of performance and reliability. 
> As my primary goal is _testability_, this naturally leads to a 
> different level of reliability and scalability than will probably be 
> wanted by users of embedded systems. But I can't imagine making this 
> software suitable for embedded systems would hurt the usability for 
> testing. So if anyone is interested in contributing toward Kivik with 
> embedded systems in mind, I would welcome it! I also don't have any 
> current plans to support Mango queries or traditional views for my 
> testing backends (in-memory or filesystem-backed storage), as these 
> are not essential for most testing. For these to be useful in an 
> embedded system, someone might want/need to tackle this area.
>
> In the short term, my personal focus is on tooling around developing 
> and managing CouchDB. As mentioned in the blog post, a project I want 
> to start soon is a command-line tool for interacting with CouchDB. I 
> would gladly welcome feedback from anyone who would find such a tool 
> useful, to help prioritize features, and to design a usable interface. 
> I think this project is a logical next step, even for a possible 
> future goal of targeting embedded systems, as this CLI tool will be a 
> great way to flesh out and stress-test the filesystem driver.
>
> Jonathan

Does Kivik aim to become a UnQL alternative for any database, independent 
of its design?

--
Chintan
>
>
> On 7/11/19 10:59 AM, Jan Lehnardt wrote:
>> Hi Chintan,
>>
>>> On 10. Jul 2019, at 18:25, Chintan Mishra <chintan@rebhu.com> wrote:
>>>
>>> On 09/07/19 9:33 PM, Joan Touzet wrote:
>>>
>>>> Hi Chintan,
>>>>
>>>> Reading through your proposal, I have one main point to make.
>>>>
>>>> At the Apache Software Foundation, the people who lead the projects are
>>>> the people who do the work on them. We use the wrong word "meritocracy"
>>>> to explain this principle; a better word would be "do-ocracy."
>>>>
>>>> http://www.apache.org/foundation/how-it-works.html#decision-making
>>>> https://incubator.apache.org/guides/participation.html#as_a_developer
>>>> https://communitywiki.org/wiki/DoOcracy
>>>>
>>>> That means that your project can completely proceed on its own if it
>>>> wants to; the only thing over which you're not in control is whether
>>>> that project gets to call itself CouchDB or not. That decision is
>>>> reached by the people who have built CouchDB into what it is today.
>>> I appreciate that you shared these links. I now understand what I 
>>> have to do next.
>>>> -----
>>>>
>>>> On that last point, there's a lot that would need to be done for you to
>>>> convince the PMC that your vision is the one, true future of CouchDB.
>>>>
>>>> What you propose is both a significant rewrite, as well as requiring an
>>>> entirely new set of skills from the developer base (Rust, MQTT, Kotlin,
>>>> Swift).
>>> From Slack conversations, it appears the community has some 
>>> inclination towards building a Rust-based CouchDB some day. As for 
>>> the other technologies, those changes are not happening today. I do not 
>>> propose to start with all the changes at once. The storage engine is a 
>>> good place to start.
>> Since I brought it up in Slack, let me clarify: I do not suggest that
>> we should move CouchDB to Rust today or any time later.
>>
>> What I am suggesting is that we should look at the things required to
>> support your idea of an IoT-capable CouchDB-like thing. My suggestion
>> is to not change CouchDB, but to make a new CouchDB compatible project.
>>
>> Devices are only getting smaller, so a lower-level language is needed
>> to ensure performance and good battery use. That leaves C, C++, Go and
>> Rust.
>>
>> When I’m looking at which people I could likely excite to contribute to
>> such a thing, in my filter bubble that is folks getting into Rust
>> and the rest of the Rust community.
>>
>> If it ends up being Go, C or C++, because someone who runs with this
>> prefers those, I don’t really mind.
>>
>> * * *
>>
>> In particular, we should look at a more detailed IoT use-case and how
>> CouchDB can help.
>>
>> Correct me if I’m wrong, but this is mostly about devices with sensors
>> generating measurements over time that should be aggregated into a
>> cloud service for analysis.
>>
>> In that world, a hypothetical API for an IoT app using our new RustyCouch
>> could look like this:
>>
>> db = RustyCouch.open('file.foo')
>>
>> db.save(measurement)
>>
>> db.push('https://cloud.measurements.com')
>>
>> repeat
>>
>> This is a very small subset of the CouchDB API, but it would cover the
>> majority of your billion IoT use-cases.
>>
>> There are a few things to be considered about data persistence and
>> concurrency control, but in another email, you already mentioned
>> SQLite, which solves most of those for you already.
>>
>> db.save() would generate a JSON document with a uuid as _id and a
>> corresponding _rev, plus an entry in an index that lets us query,
>> at a later point, in what order these docs were written, which we are
>> going to need for db.push().
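
A minimal sketch of the db.save() step described above, assuming SQLite as the local store (it is mentioned earlier in the thread). The table layout, the function names `open_db`, `save`, and `docs_since`, and the way the revision hash is computed are all hypothetical choices made for illustration. They are not CouchDB's own algorithm and not an existing RustyCouch API.

```python
import hashlib
import json
import sqlite3
import uuid


def open_db(path):
    """Open (or create) the local store. Table and column names are hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"  # doubles as the write-order index
        " id TEXT NOT NULL,"
        " rev TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    return conn


def save(conn, measurement):
    """Store one measurement as a new JSON document and return (_id, _rev)."""
    doc_id = uuid.uuid4().hex
    body = json.dumps(measurement, sort_keys=True)
    # "1-<hash>" only mimics the shape of a first-generation revision;
    # the hash input is an assumption, not what CouchDB computes.
    rev = "1-" + hashlib.sha256(body.encode("utf-8")).hexdigest()[:32]
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO docs (id, rev, body) VALUES (?, ?, ?)",
            (doc_id, rev, body),
        )
    return doc_id, rev


def docs_since(conn, last_seq=0):
    """Return (seq, id, rev, body) rows written after last_seq, in write order."""
    return conn.execute(
        "SELECT seq, id, rev, body FROM docs WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
```

Here the auto-incrementing `seq` column plays the role of the "in what order were these docs written" index, so a later push can resume from the last sequence number it sent.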
>>
>> db.push() then opens that index, checks with the cloud which docs
>> it already has (as per the standard CouchDB replication protocol)
>> and then sends all local docs that aren’t on the cloud yet in a
>> couple of _bulk_docs batches.
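
A rough sketch of the db.push() flow described above: ask the remote which revisions it lacks, then send only those in batches. `FakeRemote` is a hypothetical in-memory stand-in so the example runs without a network. Its two methods only mirror the request and response shapes of CouchDB's `_revs_diff` and `_bulk_docs` endpoints, and the names `push` and `batches` are likewise assumptions.

```python
def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


class FakeRemote:
    """In-memory stand-in for the cloud CouchDB (hypothetical, no HTTP involved)."""

    def __init__(self):
        self.docs = {}  # (id, rev) -> document

    def revs_diff(self, revs_by_id):
        # Mirrors the _revs_diff response shape: {id: {"missing": [revs]}}
        missing = {}
        for doc_id, revs in revs_by_id.items():
            absent = [r for r in revs if (doc_id, r) not in self.docs]
            if absent:
                missing[doc_id] = {"missing": absent}
        return missing

    def bulk_docs(self, docs):
        # Stands in for a _bulk_docs request sent with "new_edits": false
        for doc in docs:
            self.docs[(doc["_id"], doc["_rev"])] = doc


def push(local_docs, remote, batch_size=100):
    """Send docs the remote lacks, in write order; return how many were sent."""
    sent = 0
    for batch in batches(local_docs, batch_size):
        revs_by_id = {}
        for doc in batch:
            revs_by_id.setdefault(doc["_id"], []).append(doc["_rev"])
        wanted = remote.revs_diff(revs_by_id)
        to_send = [
            doc for doc in batch
            if doc["_rev"] in wanted.get(doc["_id"], {}).get("missing", [])
        ]
        if to_send:
            remote.bulk_docs(to_send)
            sent += len(to_send)
    return sent
```

Calling `push` a second time with the same documents sends nothing, because the diff step reports no missing revisions. A real implementation would replace `FakeRemote` with HTTP calls and record a checkpoint so it does not re-diff documents it has already pushed.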
>>
>> Voilà, a low-level, embeddable library that allows you to sync
>> stuff to CouchDB.
>>
>> This is a scope that a single developer could make a prototype
>> of, even in a language that you are just starting out with.
>>
>> With this in hand then, the next step is to talk to the folks
>> who build IoT platforms and applications to see if they want to
>> use something like that.
>>
>> And once we have this, we can talk about changes to the replication
>> protocol.
>>
>> * * *
>>
>> If you want to take this further, and make a library that also
>> supports interactive querying, for say native applications on
>> phones and watches and whatnot, you already have a decent
>> foundation, but you’ll have a little more work to do.
>>
>> * * *
>>
>> But none of this requires changing CouchDB itself, or a 10 year
>> effort of porting something, while solving all the needs you
>> have.
>>
>> * * *
>>
>> Finally, I’d like to caution against being flippant about the
>> current project direction with FoundationDB. This is something
>> the team that has been doing this for over 10 years looked at
>> “in-depth” and decided it is the right thing to do.
>>
>> The alternative would be to build a FoundationDB-like thing
>> ourselves, which is a multi-million dollar investment that
>> I haven’t seen anyone commit to at the moment.
>>
>> In particular, I’m one of the champions for smaller CouchDB
>> installations in this project, and moving forward is always
>> a give-and-take. We are not in a position yet to gauge what
>> the problems are with an FDB-Couch for a single-node instance
>> but I’m sure going to work hard on making it easy for our
>> downstream users.
>>
>> I’m the maintainer of the Mac binaries which are extremely
>> popular. Any database that can’t be set up with a download,
>> unzip and double click to start to get a dev environment is
>> going to have trouble attracting new developers. So I’ll make
>> sure we can retain this experience as much as possible.
>>
>> * * *
>>
>> Let’s be pragmatic and consider incremental change or small-scope
>> side projects to move this forward. Grand visions,
>> in my experience, almost never work out. The only reason I have
>> trust in an FDB transition is because someone with authority
>> and budget said “the team that ostensibly built CouchDB 2.x
>> is going to do this”. That’s the only way it could possibly work.
>>
>> And don’t mistake my RustyCouch suggestion as being
>> dismissive or sidelining. I’ve wanted something like this since
>> about 2008, and many people have tried with various attempts,
>> so my suggestion above is very serious *and* fed by the
>> experience of all these failed attempts.
>>
>> CouchDB’s strength is its replication protocol. We didn’t
>> rewrite CouchDB in JavaScript because we suddenly realised
>> there are a billion browsers, but PouchDB came along with
>> a compatible data model and replication engine so that the
>> two projects complement each other perfectly, and anyone on
>> the CouchDB team will tell you that PouchDB is one of the biggest
>> drivers of CouchDB adoption.
>>
>> How about we just re-run this strategy for IoT: build a
>> small thing that is useful for one use-case and make it work,
>> then make it more complicated to be useful for more use-cases.
>> At each point, make sure replication with CouchDB works.
>> That’s a winning strategy. We already know it.
>>
>> Best
>> Jan
>> —
>>
>>
>>>>   It is in direct competition with the proposal being worked on
>>>> this list for the FoundationDB backend swap. With the addition of MQTT,
>>>> it sounds like the entire replication protocol and methodology would
>>>> need to be revisited, as the semantic changes you're proposing would
>>>> break existing client replication.
>>> The HTTP replication protocol more or less remains the same in the 
>>> foreseeable future. A new MQTT replication strategy will be built 
>>> upon the existing method. The two will not work in parallel. Either 
>>> one of these will work per database.
>>>> Finally, the proposal to push into
>>>> the mobile space would directly compete with our sister project PouchDB,
>>>> who have put in tens of thousands of development hours as well.
>>> The community will evolve at some point. And bringing people from 
>>> the sister project onto CouchDB will allow faster development. The 
>>> diagram in the proposal missed a part for Web Browser based CouchDB. 
>>> This missing part is an interface for JavaScript and CouchDB-Web 
>>> Browser. So, we will need some JavaScript developers too. And they 
>>> can help improve Fauxton.
>>>>   This all
>>>> adds up to a much bigger scoped project than CouchDB is today, and I
>>>> daresay may be bigger than I think even you realize.
>>> I do realize that I want CouchDB to be in a billion mobile and 
>>> embedded devices by 2025. I understand this is a challenging scale. I 
>>> brought this here because I see how much we need a DB for a "Cluster 
>>> Of Unreliable Commodity Hardware". I assume the proposed path will take 
>>> somewhere between 18 and 21 months to come to fruition for a team of 15 
>>> people working 40 hours/week.
>>>
>>>> With my PMC hat on, I have to ask:
>>>>
>>>> * Do you already have developers versed in these skills you can bring to
>>>>    the project (beyond yourself)? Are they ready to commit the 40+ hours
>>>>    a week each to making it a reality?
>>> No, I do not have a team in place for this.
>>>> * Do you have experience in building a distributed system of this scale,
>>>>    using the specific technologies you propose?
>>> I have been reading about distributed systems. I want to take up an 
>>> Open Source project which solves the replication problem for devices 
>>> coming up with emerging technologies. CouchDB is the best fit as it 
>>> already solves the problem of replication across remote devices.
>>>> * How do you plan to convince other developers of your approach
>>>>    specifically?
>>> What got us(you) here, won't get us(you) there! -- Marshall Goldsmith
>>>
>>> CouchDB led the way by being years ahead. This is just the same 
>>> thing happening again in a newer market. CouchDB is already great at 
>>> replication. What I am proposing is taking this simple-but-powerful 
>>> methodology a step further and building it for planetary-scale 
>>> use-cases (idea derived from Lasp-Lang). Here are some ways in which 
>>> we can drive more developers, users, and eyes.
>>>
>>> * Helping users realize that CouchDB lets them relax while building
>>>    applications for devices with any form factor.
>>> * Reaching out to the developers who have built their own solution for
>>>    replicating stuff from their device of any form factor to CouchDB
>>> * On-boarding developers who will become early adopters and test it out
>>>    on their IoT devices. Thus, proving an unmet market need.
>>> * Promoting offline-first strategy among mobile and embedded
>>>    developers will drive contributors from these communities.
>>> * Documenting comparisons with existing mobile and embedded
>>>    solutions that provide replication, such as Realm and Couchbase.
>>>
>>>> * How do you intend to train up our existing developers on the new
>>>>    languages and technologies involved?
>>> If people are excited about the future they are building, then this 
>>> is a smaller problem to tackle. If and when people in this community 
>>> come to a consensus about the proposal, this can be tackled with the 
>>> 'Each one, teach one' approach followed by Yamaha Motors. This is a buddy 
>>> system where people get new partners to tackle a problem/PR. They 
>>> share issues, their understanding of the codebase and language, etc. 
>>> with each other. As buddies rotate, everyone gets on the same page 
>>> after a few cycles. I have found that a 3-pair buddy system works best in 
>>> software. But this may differ based on culture, language, timezone, 
>>> and availability.
>>>> * How do you perceive the advantages and disadvantages of your approach
>>>>    *specifically* vs. the FDB approach already outlined?
>>> Proposal: FoundationDB
>>>
>>> Pros:
>>> * Improving what works for majority of existing users
>>> * Iterates CouchDB to a better form
>>> * Prospect of immediate consistency for ACID transactions
>>>
>>> Cons:
>>> * Losing some small and mid-sized developers
>>> * Fragments community
>>>
>>> Proposal: Polyglot-unification
>>>
>>> Pros:
>>> * Growth by tapping newer prospects
>>> * Reduces fragmentation of user community and codebase
>>> * Reimagines CouchDB as if it was built in 2019
>>>
>>> Cons:
>>> * Tons of work
>>> * Uses RocksDB, overlooks FoundationDB migration
>>>
>>>
>>> Email with subject "CouchDb Rewrite/Fork" by 'Reddy B. 
>>> <reddy.b@live.fr>' has mentioned some other concerns. This proposal 
>>> introduces a new story for CouchDB. This proposal would require 
>>> using RocksDB instead of FoundationDB.
>>>
>>>> -Joan
>>>>
>>>> On 2019-07-09 10:28, Chintan Mishra wrote:
>>>>> Hello team!!
>>>>>
>>>>> Years of time and effort help move a product to the heights that CouchDB
>>>>> has reached. And as a non-contributor, rather a very new CouchDB
>>>>> user (1.5 years) who failed to find some relevant emails, I came up with
>>>>> a version of the future for CouchDB that I thought would help us grow.
>>>>> But Jan and Robert helped me realize that it takes a village to raise a
>>>>> child (CouchDB). So this is a proposal to find a middle ground between
>>>>> where we are headed and where the market is going next. The proposal I
>>>>> wrote was solely driven by what I have read over the years about the
>>>>> growth of the product and the community. I have attached the file or, if
>>>>> you prefer reading in a browser, then click
>>>>> here<https://gitlab.com/snippets/1873543>.
>>>>>
>>>>> It will take roughly 4-5 minutes of your time. A proposed direction is
>>>>> to start an entirely new project. That is not what I desire. I want to
>>>>> join the community behind CouchDB, not build a new one using it. My goal
>>>>> from this proposal is to generate leverage by creating early mover
>>>>> advantage and help grow the community.
>>>>>
>>>>> Thanking you.
>>>>>
>>>>> -- 
>>>>> Chintan Mishra
>>>>> Rebhu Computing
>>>>> Founder and CEO
