couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: On Plugins and Extensibility
Date Fri, 22 May 2015 18:50:54 GMT
priv/*.cfg files, if used, aren't intended to be configured by users.
They're not part of the external surface area a normal user/sysadmin
should care about. For instance, a recent request for dynamic URL
handlers was to be able to disable specific URL endpoints. The chttpd
app would be responsible for looking at the config to change its use
of the data registered for this extension. For instance, it could
provide a blacklist config setting where users could tell it to
disable some specific handlers or even request specific overrides.
Regardless, "this is the data" is separate from "this is how I'll use
the data".

Also for the updates, I kinda hedged on that. I
think if we want to update during a release upgrade then they should
use a module based approach which fits into the existing reload
machinery rather than trying to do polling. But I also wanted to show
that the proposed outline is extensible enough that we could support a
number of different approaches rather easily.
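
To make the blacklist idea concrete, here is a rough sketch of how the consuming side might filter what was registered. The module and function names are hypothetical (nothing like this exists in chttpd or couch_epi today), and I'm assuming handlers are registered as {Path, MFA} tuples and that the blacklist has already been read from config into a list of paths:

```erlang
-module(handler_filter_sketch).
-export([enabled_handlers/2]).

%% Hypothetical helper, not an existing chttpd or couch_epi function.
%% Registered: [{Path, MFA}] as published by applications (assumed shape).
%% Blacklist:  list of paths the admin disabled via config (assumed shape).
%% Returns only the handlers whose path is not blacklisted.
enabled_handlers(Registered, Blacklist) ->
    [H || {Path, _MFA} = H <- Registered,
          not lists:member(Path, Blacklist)].
```

The registered data is never modified here. The feature only decides how to use it, which is the separation described above.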

For ets vs. mochiglobal it's hard to say. I was initially thinking that
we'd have each feature do that itself. I.e., chttpd handlers do
that. For that feature the update to couch_epi would just be where it
gets its data rather than scanning the file system. couch_stats would
be the same (though I think it's an ets thing internally at the
moment). That said, it might be useful to provide some sort of utility
API that provides that ability to any feature that needs it. And I
could see doing something around it for function invocations as well.
Also, even if we did provide this I'd expect it to be mostly
maintained by the feature, as it seems like one of those things where
we'd end up having to over-engineer things in couch_epi to support all
the slightly different use cases.
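
As a sketch of what "each feature does it itself" could look like on the ets side, assuming the feature keeps its own read-optimized table and refreshes it whenever it is told its data changed. The module name and the {Key, Value} entry shape are made up for illustration. The compiled-module (mochiglobal style) variant would keep the same lookup/1 interface and only change what happens inside refresh/1:

```erlang
-module(feature_cache_sketch).
-export([init/0, refresh/1, lookup/1]).

%% Hypothetical per-feature cache, not part of couch_epi.
%% The calling process owns the table, so init/0 should be called from a
%% long-lived process (e.g. the feature's own server).
init() ->
    ?MODULE = ets:new(?MODULE, [named_table, set, public,
                                {read_concurrency, true}]),
    ok.

%% Replace the cached entries with Entries, a list of {Key, Value} tuples.
%% New values are written before stale keys are removed so that readers
%% never observe an empty table during a refresh.
refresh(Entries) ->
    true = ets:insert(?MODULE, Entries),
    Keep = [K || {K, _} <- Entries],
    Stale = [K || {K, _} <- ets:tab2list(?MODULE),
                  not lists:member(K, Keep)],
    lists:foreach(fun(K) -> ets:delete(?MODULE, K) end, Stale),
    ok.

%% Read path used by request handlers.
lookup(Key) ->
    case ets:lookup(?MODULE, Key) of
        [{Key, Value}] -> {ok, Value};
        [] -> not_found
    end.
```

Whether the read_concurrency option is enough to avoid the contention Russell mentions is exactly the thing that would need benchmarking.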

On Thu, May 21, 2015 at 6:18 PM, Russell Branca <chewbranca@apache.org> wrote:
> Hey Paul,
>
>
> Thanks for the great writeup!
>
> Couple of questions:
>
> How do priv/*.cfg files work with dynamic config updates? Seems like we'll
> lose the ability to write changes back to the config file, like we have
> with default.ini. Although I've been wondering for a while if allowing
> config:set/* to persist data is an anti-pattern compared to forcing
> persistent updates to be made directly in config files with the hopes they
> are done so in a VCS. Also, if we switch to priv/*.cfg how will users
> update those files? Seems like we'll have to poll the file system for every
> one of those config files, or are you thinking that those files would only
> be updated as part of releases?
>
> Have you thought at all about using the mochiglobal.erl approach to
> regenerate a module with all the plugin dispatch compiled as functions?
> Seems like using shared ETS tables could become a source of contention for
> lots of concurrent requests. Proper benchmarks would provide better
> understanding as to whether this is actually an issue.
>
>
> -Russell
>
> On Thu, May 21, 2015 at 3:29 PM, Paul Davis <paul.joseph.davis@gmail.com>
> wrote:
>
>> Hey everyone,
>>
>> So I've been meaning to write this email for some time but have been
>> kept busy with lots of super fun things that are super fun. Anyway, I
>> just wanted to get this out there to start getting feedback from
>> everyone involved.
>>
>> Also, while this is called "Plugin Proposal" it shouldn't be confused
>> with the original couch_plugins use case. This is a lot lower level
>> (and may be used by something like couch_plugins if we go there again
>> in the future). Generally speaking this is just for cleaning up
>> internals and for people that want to run either minimal installs of
>> CouchDB or include CouchDB in a larger Erlang application.
>>
>> Plugin Proposal
>> ===============
>>
>> Background
>> ----------
>>
>> As we've grown the code base to include more and more applications
>> we're getting to the point where we've started adding various points
>> of extension in various ways. The best existing example is the
>> couch_stats application which loads stat/metric definitions from
>> applications. Henning Diedrich has some unmerged work which looks to
>> follow a similar path for HTTP URL handlers. And Ilya Khlopotov has
>> some work for providing vendor specific hooks.
>>
>> While each of these have some overlaps in their intended use case,
>> they also share the fact that they've all implemented their own idea
>> of extensibility in slightly different ways. That's not necessarily
>> bad, but I think that we could reduce a lot of complexity if we take a
>> step back and write a utility application that could then be used to
>> support each of these features so that we can have both the
>> extensibility as well as simplify the implementation of each
>> individual feature.
>>
>> I'll start with a bit of background and then describe a general
>> approach as well as show some hopefully explicit example snippets of
>> how such a system might be used. Granted I haven't written out an
>> entire implementation of this so I may be off the mark in some places.
>>
>> Bikeshed First
>> --------------
>>
>> I have no idea what we'd call this. We could repurpose the
>> couch_plugins app conceivably or make something new. For the
>> purposes of this document I'll call it couch_epi (for extensible
>> plugin interface) and hopefully that's terrible enough someone will
>> think of a better name for the actual application.
>>
>> Requirements
>> ------------
>>
>> The three major requirements I've thought of are:
>>
>>   # Automatically discoverable
>>   # Minimize apps that need to be started for tests
>>   # Support release upgrades
>>
>> === Automatically Discoverable ===
>>
>> The biggest thing here is that I don't want to require a change to a
>> default.ini or similar to enable or disable specific functionality
>> when we can already signify that by having the application present or
>> not. This is both for groups that may want to add new Erlang
>> applications to a release as well as anyone that wants to run a
>> minimal/embedded Couch. These are both obviously advanced uses but I
>> think are important given the number of ways that CouchDB is being
>> used.
>>
>> === Minimize the apps that need to be started for tests ===
>>
>> This one I think should be obvious to anyone that's been writing unit
>> tests lately. There are some often silly places where we require
>> applications be started just to run some tests. For example, places
>> where we may want to call a function that's been instrumented and
>> requires couch_stats to have knowledge about the stat.
>>
>> === Support release upgrades ===
>>
>> This one is obviously fairly advanced and limited in its audience but
>> it's something I'd like to at least consider in the design. This comes
>> into effect for things like couch_stats that use a text file for its
>> extension method. The issue is that the release upgrade mechanics
>> don't provide any sort of signal that is easily usable to indicate
>> when this file has changed during an upgrade so we're left polling the
>> file system which is less than optimal.
>>
>> === Other Things to Consider ===
>>
>> Another thing I'd like to keep in mind while discussing this
>> is that I'd also like to minimize the amount of boilerplate code and
>> coupling needed to support this system. It should hopefully be a matter
>> of a few lines of code to enable the extensibility on either side of
>> the interface.
>>
>> General Design
>> ==============
>>
>> The general theme that I see across all of our current extensions is
>> that they're all basically just bags of random bits of data that
>> each feature then uses to define some behavior.
>>
>> For instance, couch_stats is just a list of tuples with some names,
>> metric types, and descriptions. The dynamic chttpd handlers are just a
>> list of URL endpoints to an MFA. And the vendor specific plugins are
>> just a collection of functions that we'd like to invoke at specific
>> points.
>>
>> Given that, the easiest approach I see is to implement a module that
>> can be placed into the supervision tree that connects the data to a
>> central repository (hosted by couch_epi) that can then be queried by
>> each feature.
>>
>>
>> Data Centric Examples
>> ---------------------
>>
>> For a concrete example, let's consider couch_stats. Any application
>> that wants to record metrics through the standard couch_stats app
>> could add an entry in its supervision tree with something like:
>>
>>     {
>>         appname_stats,
>>         {couch_epi_data_source, start_link, [
>>             appname,
>>             {epi_key, {couch_stats, definitions}},
>>             {priv_file, "couch_stats.cfg"}
>>         ]},
>>         permanent,
>>         5000,
>>         worker,
>>         dynamic
>>     }
>>
>> Then we'd just implement couch_epi_data_source once that would read
>> data from the specified file from the application's priv directory and
>> track it in an ets table.
>>
>> When couch_stats wants to learn about all the installed data for its
>> stat definitions it would then just do something like:
>>
>>     couch_epi:get({couch_stats, definitions})
>>
>> Which would return a list of {appname, Data} tuples or something
>> similar. To ensure that couch_stats can react to changes in these
>> values, we would also provide an API like such:
>>
>>      couch_epi:listen({couch_stats, definitions})
>>
>> And any process that called that function would get a message whenever
>> the data for that key changed which it could use for its own nefarious
>> purposes.
>>
>> For upgrades, instead of specifying {priv_file, FileName} we could
>> specify {mfa, {Mod, Fun, Args}} which would be invoked. Then we could
>> add a code_change function to that module that would allow us to call
>> something like couch_epi:reload() which would re-run the load for that
>> process's data source.
>>
>> Function Centric Examples
>> -------------------------
>>
>> Hopefully it's obvious that given the data centric approach we could do
>> something quite similar for functions (given that an MFA is just a
>> small bit of data that we can use to invoke any function).
>>
>> Though obviously we'd like to be able to have a bit more of a useful
>> API for clients so that we don't require all function based extensions
>> to have to reimplement that function invocation code.
>>
>> The first thing that would change would be to provide a different type
>> of supervision tree entry to indicate this. Off the top of my head
>> this would look something like such:
>>
>>     {
>>         appname_funcs,
>>         {couch_epi_functions, start_link, [
>>             appname,
>>             {module, appname_funcs_mod}
>>         ]},
>>         permanent,
>>         5000,
>>         worker,
>>         dynamic
>>     }
>>
>> Then any function exported by appname_funcs_mod (that wasn't a builtin
>> function, though maybe even if so?) could be invoked by an API like
>> such:
>>
>>     couch_epi:invoke(my_function_name, Arg1, Arg2, Arg3).
>>     couch_epi:apply(my_function_name, Args).
>>
>> We could also add various helper utilities or an Options parameter
>> that would handle things like ignoring all exceptions, letting
>> exceptions bubble or other such things that any invocation point might
>> desire.
>>
>>
>> More Details
>> ------------
>>
>> The final app would have something like such:
>>
>>     couch_epi.app.src
>>     couch_epi.erl - API for features accessing extension data
>>     couch_epi_data_source.erl - Module that is inserted into
>>       application supervision trees to provide data sources for
>>       extension points
>>     couch_epi_functions.erl - Module that is inserted into application
>>       supervision trees to provide function invocations
>>     couch_epi_server.erl - Handles registration requests from
>>       couch_epi_data_source and couch_epi_functions and stores that
>>       information in an ets table. Also has the list of pids registered
>>       to listen for updates and notifies them.
>>     couch_epi_sup.erl - probably a single child for couch_epi_server.erl
>>     couch_epi_util.erl - The usual collection of functions that don't
>>       quite fit anywhere else (if needed).
>>
>> This should be a rather simple application in general. One side wants
>> to publish some data, and the other wants to use it and possibly be
>> notified when a particular bit of data is changed. And then we'll also
>> provide some API sugar around invoking functions.
>>
>> Conclusion
>> ==========
>>
>> Hopefully that all makes sense to at least some people in parts. I've
>> been thinking about this on and off over a few weeks so my thoughts
>> are a bit jumbled as I try and remember the salient points. I figured
>> I'd just try and start getting them out there so that other people can
>> comment on things and/or let me know that I've forgotten something
>> obvious that cripples this entire approach.
>>
