couchdb-dev mailing list archives

From Ilya Khlopotov <iil...@ca.ibm.com>
Subject Re: On Plugins and Extensibility
Date Mon, 25 May 2015 12:57:36 GMT

Hi Paul,

>> Then any function exported by appname_funcs_mod (that wasn't a builtin
>> function, though maybe even if so?) could be invoked by an API like
>> such:
>>
>>    couch_epi:invoke(my_function_name, Arg1, Arg2, Arg3).
>>    couch_epi:apply(my_function_name, Args).
How would you implement ddoc_validator?
Validators generally live in their own apps. They look at design
documents, and when they see a doc they understand, they try to validate
it.
They throw an exception if there are any problems with the doc.
Currently CouchDB calls `couch_index_server:validate`. The indexer could
export a `validate` function.
This mechanism allows index-specific validation of the ddoc.
However, this approach doesn't work if there is a need to validate something
in the ddoc (the presence of a special key and the correctness of its value)
across all indexers.
There is a need to chain validators, i.e. call all registered ddoc
validators and, if there are no exceptions, then call the index-specific one.
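
To make the chaining idea concrete, here is a minimal sketch. The module
name, function name and argument order are hypothetical and not part of
any existing CouchDB or couch_epi API. It assumes every validator is a
fun that returns ok or throws:

```erlang
%% Hypothetical sketch: all names here are illustrative only.
-module(ddoc_validator_chain).
-export([validate/3]).

%% Validators: the generic ddoc validators registered by other apps.
%% IndexValidator: the index-specific validator, called last.
%% Each validator is assumed to return ok or throw on an invalid doc.
-spec validate([fun((term()) -> ok)], fun((term()) -> ok), term()) -> ok.
validate(Validators, IndexValidator, DDoc) ->
    %% The first exception aborts the loop and propagates to the
    %% caller, so the index-specific validator is never reached.
    lists:foreach(fun(Validate) -> Validate(DDoc) end, Validators),
    IndexValidator(DDoc),
    ok.
```

The index-specific validator only runs once every generic validator has
accepted the doc.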

Another example where this functionality is desirable is before_doc_update.
Imagine you want to add an extra field to a user's profile. The only way to
do it currently is to modify couch_users_db:before_doc_update.
If you replace that with `couch_epi:invoke`, you still wouldn't be able to
achieve what you want, since you would have to somehow call
couch_users_db:before_doc_update from your service provider. This situation
gets worse if multiple apps want to inject fields into the user's profile.
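
One way to picture what is needed is a fold that threads the doc through
every registered hook. This is only a sketch under assumptions: the
module name is made up, each hook is assumed to take the doc and return
a possibly modified doc, and the hook signature is simplified compared
to the real before_doc_update callback:

```erlang
%% Hypothetical sketch: all names here are illustrative only.
-module(doc_update_chain).
-export([before_doc_update/2]).

%% Hooks: list of fun((Doc) -> Doc1), applied left to right.
%% The output of one hook is the input of the next, so several apps
%% can each inject their own field without knowing about each other.
before_doc_update(Hooks, Doc) ->
    lists:foldl(fun(Hook, Acc) -> Hook(Acc) end, Doc, Hooks).
```

With this shape the original couch_users_db logic would simply be one
hook in the list rather than something every provider has to call.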

BR,
ILYA





From:	Paul Davis <paul.joseph.davis@gmail.com>
To:	"dev@couchdb.apache.org" <dev@couchdb.apache.org>
Date:	2015/05/21 03:30 PM
Subject:	On Plugins and Extensibility



Hey everyone,

So I've been meaning to write this email for some time but have been
kept busy with lots of super fun things that are super fun. Anyway, I
just wanted to get this out there to start getting feedback from
everyone involved.

Also, while this is called "Plugin Proposal" it shouldn't be confused
with the original couch_plugins use case. This is a lot lower level
(and may be used by something like couch_plugins if we go there again
in the future). Generally speaking this is just for cleaning up
internals and for people that want to run either minimal installs of
CouchDB or include CouchDB in a larger Erlang application.

Plugin Proposal
===============

Background
----------

As we've grown the code base to include more and more applications
we're getting to the point where we've started adding various points
of extension in various ways. The best existing example is the
couch_stats application which loads stat/metric definitions from
applications. Henning Diedrich has some unmerged work which looks to
follow a similar path for HTTP URL handlers. And Ilya Khlopotov has
some work for providing vendor specific hooks.

While each of these has some overlap in its intended use case,
they also share the fact that they've all implemented their own idea
of extensibility in slightly different ways. That's not necessarily
bad, but I think that we could reduce a lot of complexity if we take a
step back and write a utility application that could then be used to
support each of these features so that we can have both the
extensibility as well as simplify the implementation of each
individual feature.

I'll start with a bit of background and then describe a general
approach as well as show some hopefully explicit example snippets of
how such a system might be used. Granted I haven't written out an
entire implementation of this so I may be off the mark in some places.

Bikeshed First
--------------

I have no idea what we'd call this. We could repurpose the
couch_plugins app conceivably or make something new. For the
purposes of this document I'll call it couch_epi (for extensible
plugin interface) and hopefully that's terrible enough someone will
think of a better name for the actual application.

Requirements
------------

The three major requirements I've thought of are:

  # Automatically discoverable
  # Minimize apps that need to be started for tests
  # Support release upgrades

=== Automatically Discoverable ===

The biggest thing here is that I don't want to require a change to a
default.ini or similar to enable or disable specific functionality
when we can already signify that by having the application present or
not. This is both for groups that may want to add new Erlang
applications to a release as well as anyone that wants to run a
minimal/embedded Couch. These are both obviously advanced uses but I
think are important given the number of ways that CouchDB is being
used.

=== Minimize the apps that need to be started for tests ===

This one I think should be obvious to anyone that's been writing unit
tests lately. There are some often silly places where we require
applications be started just to run some tests. For example, places
where we may want to call a function that's been instrumented and
requires couch_stats to have knowledge about the stat.

=== Support release upgrades ===

This one is obviously fairly advanced and limited in its audience but
it's something I'd like to at least consider in the design. This comes
into effect for things like couch_stats that use a text file for its
extension method. The issue is that the release upgrade mechanics
don't provide any sort of signal that is easily usable to indicate
when this file has changed during an upgrade so we're left polling the
file system which is less than optimal.

=== Other Things to Consider ===

A couple of other things I'd like to keep in mind while discussing this
are that I'd also like to minimize the amount of boilerplate code and
coupling needed to support this system. It should hopefully be a matter
of a few lines of code to enable the extensibility on either side of
the interface.

General Design
==============

The general theme that I see across all of our current extensions is
that they're all basically just bags of random bits of data that
each feature then uses to define some behavior.

For instance, couch_stats is just a list of tuples with some names,
metric types, and descriptions. The dynamic chttpd handlers are just a
list of URL endpoints to an MFA. And the vendor specific plugins are
just a collection of functions that we'd like to invoke at specific
points.

Given that, the easiest approach I see is to implement a module that
can be placed into the supervision tree that connects the data to a
central repository (hosted by couch_epi) that can then be queried by
each feature.


Data Centric Examples
---------------------

For a concrete example, let's consider couch_stats. Any application
that wants to record metrics through the standard couch_stats app
could add an entry in its supervision tree with something like:

    {
        appname_stats,
        {couch_epi_data_source, start_link, [
            appname,
            {epi_key, {couch_stats, definitions}},
            {priv_file, "couch_stats.cfg"}
        ]},
        permanent,
        5000,
        worker,
        dynamic
    }

Then we'd just implement couch_epi_data_source once that would read
data from the specified file from the application's priv directory and
track it in an ets table.

When couch_stats wants to learn about all the installed data for its
stat definitions it would then just do something like:

    couch_epi:get({couch_stats, definitions})

Which would return a list of {appname, Data} tuples or something
similar. To ensure that couch_stats can react to changes in these
values, we would also provide an API like such:

     couch_epi:listen({couch_stats, definitions})

And any process that called that function would get a message whenever
the data for that key changed which it could use for its own nefarious
purposes.
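
As a rough illustration of the get/listen pair, here is a toy registry.
Everything in it is an assumption made for the sketch: the module name,
the publish/3 function, the {epi_changed, Key} message shape, and the
use of a map in a gen_server instead of an ets table. The getter is
called lookup/1 here only to avoid shadowing the auto-imported get/1
BIF inside the sketch module:

```erlang
%% Hypothetical sketch: not the actual couch_epi implementation.
-module(epi_registry_sketch).
-behaviour(gen_server).

-export([start_link/0, publish/3, lookup/1, listen/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Called by a data source to store (or replace) App's data for Key.
publish(Key, App, Data) ->
    gen_server:call(?MODULE, {publish, Key, App, Data}).

%% Returns a list of {App, Data} tuples for Key.
lookup(Key) ->
    gen_server:call(?MODULE, {lookup, Key}).

%% Subscribes the calling process to changes of Key.
listen(Key) ->
    gen_server:call(?MODULE, {listen, Key, self()}).

init([]) ->
    {ok, #{data => #{}, listeners => #{}}}.

handle_call({publish, Key, App, Data}, _From,
            #{data := D, listeners := L} = St) ->
    Entries = lists:keystore(App, 1, maps:get(Key, D, []), {App, Data}),
    %% Notify every listener of this key; dead listeners are not
    %% cleaned up in this sketch (a real server would monitor them).
    [Pid ! {epi_changed, Key} || Pid <- maps:get(Key, L, [])],
    {reply, ok, St#{data := D#{Key => Entries}}};
handle_call({lookup, Key}, _From, #{data := D} = St) ->
    {reply, maps:get(Key, D, []), St};
handle_call({listen, Key, Pid}, _From, #{listeners := L} = St) ->
    Pids = [Pid | maps:get(Key, L, [])],
    {reply, ok, St#{listeners := L#{Key => Pids}}}.

handle_cast(_Msg, St) ->
    {noreply, St}.
```

A consumer such as couch_stats would call listen/1 once at startup and
re-run lookup/1 whenever it receives the change message.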

For upgrades, instead of specifying {priv_file, FileName} we could
specify {mfa, {Mod, Fun, Args}} which would be invoked. Then we could
add a code_change function to that module that would allow us to call
something like couch_epi:reload() which would re-run the load for that
process's data source.

Function Centric Examples
-------------------------

Hopefully it's obvious that given the data centric approach we could do
something quite similar for functions (given that an MFA is just a
small bit of data that we can use to invoke any function).

Though obviously we'd like to be able to have a bit more of a useful
API for clients so that we don't require all function based extensions
to have to reimplement that function invocation code.

The first thing that would change would be to provide a different type
of supervision tree entry to indicate this. Off the top of my head
this would look something like such:

    {
        appname_funcs,
        {couch_epi_functions, start_link, [
            appname,
            {module, appname_funcs_mod}
        ]},
        permanent,
        5000,
        worker,
        dynamic
    }

Then any function exported by appname_funcs_mod (that wasn't a builtin
function, though maybe even if so?) could be invoked by an API like
such:

    couch_epi:invoke(my_function_name, Arg1, Arg2, Arg3).
    couch_epi:apply(my_function_name, Args).

We could also add various helper utilities or an Options parameter
that would handle things like ignoring all exceptions, letting
exceptions bubble or other such things that any invocation point might
desire.
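
A sketch of what such an Options parameter could look like follows. The
module name, the invoke_all/4 function, the explicit Providers list and
the ignore_errors option are all hypothetical; in the proposal the
providers would come from the couch_epi registry rather than being
passed in by the caller:

```erlang
%% Hypothetical sketch: all names and options are illustrative only.
-module(epi_invoke_sketch).
-export([invoke_all/4]).

%% Calls Fun with Args in every provider module that exports it and
%% returns [{Module, Result}] in provider order.
%% Opts may contain ignore_errors to drop providers that raise.
invoke_all(Providers, Fun, Args, Opts) ->
    Arity = length(Args),
    IgnoreErrors = lists:member(ignore_errors, Opts),
    lists:flatmap(fun(Mod) ->
        case exports(Mod, Fun, Arity) of
            true -> call(Mod, Fun, Args, IgnoreErrors);
            false -> []
        end
    end, Providers).

exports(Mod, Fun, Arity) ->
    %% function_exported/3 only sees loaded modules, so load first.
    _ = code:ensure_loaded(Mod),
    erlang:function_exported(Mod, Fun, Arity).

%% Default: let exceptions bubble up to the invocation point.
call(Mod, Fun, Args, false) ->
    [{Mod, erlang:apply(Mod, Fun, Args)}];
%% ignore_errors: swallow any exception and skip that provider.
call(Mod, Fun, Args, true) ->
    try
        [{Mod, erlang:apply(Mod, Fun, Args)}]
    catch
        _:_ -> []
    end.
```

Providers that do not export the function are skipped silently, which
keeps an optional hook cheap to leave unimplemented.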


More Details
------------

The final app would have something like such:

    couch_epi.app.src
    couch_epi.erl - API for features accessing extension data
    couch_epi_data_source.erl - Module that is inserted into
        application supervision trees to provide data sources for
        extension points
    couch_epi_functions.erl - Module that is inserted into application
        supervision trees to provide function invocations
    couch_epi_server.erl - Handles registration requests from
        couch_epi_data_source and couch_epi_functions and stores that
        information in an ets table. Also has the list of pids
        registered to listen for updates and notifies them.
    couch_epi_sup.erl - probably a single child for couch_epi_server.erl
    couch_epi_util.erl - The usual collection of functions that don't
        quite fit anywhere else (if needed).

This should be a rather simple application in general. One side wants
to publish some data, and the other wants to use it and possibly be
notified when a particular bit of data is changed. And then we'll also
provide some API sugar around invoking functions.

Conclusion
==========

Hopefully that all makes sense to at least some people in parts. I've
been thinking about this on and off over a few weeks so my thoughts
are a bit jumbled as I try and remember the salient points. I figured
I'd just try and start getting them out there so that other people can
comment on things and/or let me know that I've forgotten something
obvious that cripples this entire approach.


