couchdb-dev mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Input validation and limits
Date Mon, 25 Mar 2013 09:48:45 GMT
I'll quibble a little over the notion that 'middleware' can occur
midway through request processing at the backend but, in general, yes.
My point was to take Jason's suggestion head on and attempt to achieve
consensus on what CouchDB should include versus exclude.

I should have started a new thread for that rather than immediately
forking this one. To answer the specific 'should we add X?' question
it seemed prudent to ask the general 'what features are appropriate
for CouchDB?' question.

To cover your list, in brief, I'd say vhosting, rewriting, throttling,
and IP checking are out, and authentication and captcha are in, but
that's just my list.

This thread should either be renamed, if we think the discussion is
about the general question, or we should all stay on topic (myself
included, of course) and discuss the rate-limiting and captcha question.

I think rate-limiting is out of scope but that captcha is in scope
(because authentication in general is in scope). Is captcha technology
something that evolves quite quickly? Would support today be something
that our new quarterly updates could usefully keep pace with?

B.


On 25 March 2013 09:29, Benoit Chesneau <bchesneau@gmail.com> wrote:
> On Mon, Mar 25, 2013 at 9:11 AM, Robert Newson <rnewson@apache.org> wrote:
>> This is a great topic and one that goes to the heart of CouchDB's twin
>> roles as database and web server.
>>
>> Does CouchDB need to directly support every feature that a web server
>> ought to support? Or does CouchDB, by virtue of speaking HTTP, get to
>> stay lean, providing only what must be provided by an origin server in
>> the modern Web, and rely on other, hopefully solid and focused tools,
>> for everything else? Supporting CAPTCHA, in whatever form, seems quite
>> reasonable. It's an extension of our auth model in many respects and
>> something that can't easily be externalized.
>>
>> CouchDB's strength is that it's a database that speaks HTTP. In my
>> mind, it does that for one reason - to integrate with other things
>> that also speak HTTP. That obviously includes browsers but it also
>> includes load balancers, caching proxies, and so on.
>>
>> To the topic at hand, I feel that rate limiting and IP blocking are
>> best done externally, just as I feel about virtual hosting
>> and URL rewriting. Are our log files rich enough to power fail2ban
>> itself? Could they be enhanced if not? Would an iptables approach to
>> rate limiting be preferable? Can we, as the CouchDB developer
>> community, really support and maintain all the extra features if we
>> decided CouchDB-as-a-web-server means it ought to do all these things?
>> Will we work to make a clustered CouchDB work without external load
>> balancers or DNS failover services, to pick just two examples? Will we
>> add an http caching layer?
>>
>> I sound opinionated and entrenched when I ask too many questions in a
>> row, but they are sincere questions; it's not my intention to bludgeon
>> the proposal into the ground with them. I do want to explicitly reject
>> an accusation of "stop energy" before it's made, though. That phrase
>> is easily invoked though I do see that it's often been true in the
>> past, from myself and other developers.
>>
>> Adding this kind of statefulness seems inappropriate to me but it's
>> hard to argue the case when we have the URL rewriting and virtual
>> hosting built in. A separate conversation is looming about virtual
>> hosting because the Nebraska merge that brings clustering will not
>> bring virtual hosting with it; BigCouch has never supported native
>> virtual hosting, it's provided by HAProxy instead.
>>
>> I would love a broader discussion about where CouchDB ends and other
>> software begins. Is there a crisp line? I'd argue there could be,
>> though it's not crisp today. For me, as I've said, CouchDB is a
>> database that you talk to over HTTP. I'm for keeping that as lean as
>> possible; that's a big enough task already.
>>
>> B.
>>
>>
>
> I'm on your side, imo. For a long time I've been thinking we should
> rewrite the way CouchDB handles authentication, vhosting & other
> HTTP-related stuff. What about refactoring the HTTP level to use some
> kind of middleware system to validate or transform the request and
> response:
>
> 1. Accept req
> 2. for middleware in request middleware: do something with req. if
> needed return a response
> 3. return response
> 4. for middleware in response middleware: do something with response
>
> So authentication, vhosting, rewriting and possibly other
> middleware like throttling, IP checking, ... could be added easily or
> even removed when not needed. Something equivalent to mod_* in
> Apache, so CouchDB could offer some by default and let other vendors
> ship the ones they built for a specific case.
>
> Thoughts?
>
> - benoit
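
For readers following the thread, the request/response pipeline Benoit outlines could be sketched roughly as below. This is only an illustration in Python, not CouchDB code (CouchDB itself is written in Erlang). Every name here (`Request`, `Response`, `handle`, `ip_check`, `require_auth`, `add_server_header`, `backend`) is hypothetical and invented for the example. It also assumes the response middleware runs before the response is actually sent back, which is one reading of steps 3 and 4.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional


@dataclass
class Request:
    method: str
    path: str
    headers: dict = field(default_factory=dict)
    remote_ip: str = "127.0.0.1"


@dataclass
class Response:
    status: int
    body: str = ""
    headers: dict = field(default_factory=dict)


# A request middleware inspects the request and may short-circuit by
# returning a Response; returning None passes the request along.
RequestMiddleware = Callable[[Request], Optional[Response]]
# A response middleware may transform the outgoing response.
ResponseMiddleware = Callable[[Request, Response], Response]


def ip_check(blocked: set) -> RequestMiddleware:
    """Hypothetical IP-checking middleware: reject listed addresses."""
    def middleware(req: Request) -> Optional[Response]:
        if req.remote_ip in blocked:
            return Response(403, "forbidden")
        return None
    return middleware


def require_auth(req: Request) -> Optional[Response]:
    """Hypothetical auth middleware: only checks that a header is present."""
    if "Authorization" not in req.headers:
        return Response(401, "unauthorized")
    return None


def add_server_header(req: Request, resp: Response) -> Response:
    """Hypothetical response middleware: stamp a header on every response."""
    resp.headers["Server"] = "sketch"
    return resp


def handle(
    req: Request,
    request_chain: Iterable[RequestMiddleware],
    response_chain: Iterable[ResponseMiddleware],
    backend: Callable[[Request], Response],
) -> Response:
    resp = None
    # Step 2: each request middleware may answer the request itself.
    for mw in request_chain:
        resp = mw(req)
        if resp is not None:
            break
    # No middleware answered, so the backend produces the response.
    if resp is None:
        resp = backend(req)
    # Step 4: each response middleware may transform the response.
    for mw in response_chain:
        resp = mw(req, resp)
    return resp


def backend(req: Request) -> Response:
    """Stand-in for the database handling the request."""
    return Response(200, "ok")


# Middleware is added or removed simply by editing these lists.
request_chain = [ip_check({"10.0.0.9"}), require_auth]
response_chain = [add_server_header]
```

With this wiring, a request from a blocked address never reaches `require_auth` or `backend`, and dropping `ip_check` from `request_chain` removes that behaviour without touching anything else, which is the "added easily or even removed" property described above.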
