couchdb-user mailing list archives

From <jus...@lisol.co.uk>
Subject RE: Allow user-defined views
Date Fri, 28 Nov 2014 06:50:10 GMT
Hi Peter,

An interesting concept ...

This may sound simplistic, but is it viable for your application to initially have a process
where newly written queries are vetted by a human before they are run? The advantage of this
is two-fold:
	i) you'll be able to move on and prove your concept quickly
	ii) while doing this, you may learn enough (and things may change enough) for you to automate
the vetting process
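
A minimal sketch of such a vetting gate, assuming user submissions are kept as plain records and only human-approved ones are copied into the design document. The record fields (status, reviewedBy) and the function name buildDesignDoc are hypothetical; only the design document layout (_id, language, views with map/reduce source strings) follows CouchDB's usual format.

```javascript
// Hypothetical review gate: only views that a human has approved are
// copied into the design document that would be uploaded to CouchDB.
function buildDesignDoc(name, submissions) {
  const views = {};
  for (const s of submissions) {
    // Skip anything still pending, rejected, or lacking a reviewer.
    if (s.status !== 'approved' || !s.reviewedBy) continue;
    views[s.name] = s.reduce ? { map: s.map, reduce: s.reduce } : { map: s.map };
  }
  return { _id: '_design/' + name, language: 'javascript', views: views };
}

// Illustrative submissions; the view sources are stored as strings.
const submissions = [
  {
    name: 'errors_by_host',
    map: 'function (doc) { if (doc.level === "error") emit(doc.host, 1); }',
    reduce: '_sum',
    status: 'approved',
    reviewedBy: 'justin',
  },
  {
    name: 'suspicious',
    map: 'function (doc) { while (true) {} }',
    status: 'pending',
    reviewedBy: null,
  },
];

const ddoc = buildDesignDoc('user_views', submissions);
console.log(Object.keys(ddoc.views));
```

Nothing a user writes reaches the view server until someone has set the status by hand, which is the whole point of the manual first phase.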

Thanks,
Justin

-----Original Message-----
From: Peter Grman [mailto:peter.grman@gmail.com] 
Sent: 28 November 2014 02:12
To: user@couchdb.apache.org
Subject: Re: Allow user-defined views

No, I don't. The program is meant for analysing logs (collected by fluentd). It should be
open source and on GitHub, though there isn't much done yet: https://github.com/logTank/

Index rebuilding shouldn't be a problem, as CouchDB will only be used for general stats
and the user won't actually see up-to-date data, only data with a delay. That is another
advantage of CouchDB: I can read the query results without bothering the system, and once
the data is outdated, I can update the index. That is the theory so far, at least; once I
have an MVP I'll need to run some performance tests to see whether it actually works. The
other option is to use MongoDB for ad-hoc queries, but I was thinking that CouchDB would
be more efficient, as storage is so cheap.

Every time I look up information about CouchDB I learn something new and something becomes
clearer, so I'd also be glad to get feedback on the idea in general and on how I want to
use CouchDB.

However, I'd also be very happy if I could somehow solve the problem of possible DoS
attacks :). Maybe there is something in CouchDB or evalcx which I can configure, such as a
maximum runtime for a map/reduce function (it shouldn't be more than 1 ms)? Or does CouchDB
log any data about the resources required by views (CPU time + HDD space)?
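
One way to approximate a runtime and emit limit before a view is deployed is to dry-run the user's map function on a few sample documents. The sketch below assumes Node.js and its vm module (whose timeout option aborts a script after a given number of milliseconds); dryRunMap, the limits object and the sample documents are hypothetical names, not part of CouchDB. Node's documentation says vm is not a security mechanism, so a pre-check like this would itself need to run in an isolated process. As far as I know, CouchDB also has an os_process_timeout setting (in the [couchdb] config section, in milliseconds) that limits how long it waits for the view server to respond.

```javascript
const vm = require('vm');

// Hypothetical pre-check: run a user-supplied map function over sample
// documents with a time budget and a cap on the number of emit() calls.
function dryRunMap(mapSource, sampleDocs, limits) {
  let emits = 0;
  const sandbox = {
    // Deep copy so user code cannot mutate the caller's documents.
    docs: JSON.parse(JSON.stringify(sampleDocs)),
    emit: function (key, value) {
      if (emits >= limits.maxEmits) throw new Error('too many emits');
      emits += 1;
    },
  };
  const code =
    'var fn = (' + mapSource + ');\n' +
    'for (var i = 0; i < docs.length; i++) { fn(docs[i]); }';
  try {
    // The timeout covers the whole batch of sample documents, not each one.
    vm.runInNewContext(code, sandbox, { timeout: limits.timeoutMs });
    return { ok: true, emits: emits };
  } catch (err) {
    return { ok: false, emits: emits, reason: String(err && err.message) };
  }
}

const samples = [{ level: 'error', host: 'a' }, { level: 'info', host: 'b' }];
const limits = { timeoutMs: 50, maxEmits: 100 };

const good = dryRunMap(
  'function (doc) { if (doc.level === "error") emit(doc.host, 1); }', samples, limits);
const loop = dryRunMap('function (doc) { while (true) {} }', samples, limits);
const flood = dryRunMap(
  'function (doc) { for (var i = 0; i < 1e6; i++) emit(i, doc); }', samples, limits);

console.log(good.ok, loop.ok, flood.reason);
```

A check like this only catches functions that misbehave on the sample data; a function that is slow only for rare documents would still get through, so it complements rather than replaces monitoring of deployed views.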

Cheers
Peter

On Fri Nov 28 2014 at 2:54:51 AM Alexander Gabriel <alex@barbalex.ch> wrote:

> sorry for being off-topic
> Alex
>
>
> 2014-11-28 2:52 GMT+01:00 Alexander Gabriel <alex@barbalex.ch>:
>
> > sounds like a very interesting application
> >
> > seems like you don't care if the user has to wait for an index to be
> > built when the user creates a query
> >
> > Alex
> >
> >
> > 2014-11-28 2:23 GMT+01:00 Peter Grman <peter.grman@gmail.com>:
> >
> >> Hi Alex,
> >>
> >> Yes, the users would be able to import different sets of data, which
> >> aren't relational, and use the platform to analyse them. The analysed
> >> data would in 99% of cases be append-only (+ removing old data) and the
> >> data can be defined by the user, as well as be hierarchical.
> >>
> >> When I thought about the system in the beginning, CouchDB seemed like an
> >> awesome choice: as there would be only a couple of well-defined queries
> >> and storage is generally cheap, I thought that CouchDB views and their
> >> caching are what I'm looking for.
> >>
> >> The problem is again only with people who want to trick the system. I
> >> would also be happy with a solution which would detect bad views once
> >> they have been deployed (using too much space, taking too long to
> >> compute) and deactivate and mark them for me to check. This way I could
> >> check those few people who try a DoS attack and ban them from the service.
> >>
> >> The other main question was whether it is really impossible to get data
> >> from a different database inside the view and whether the user won't be
> >> able to access the underlying system, ..., or whether it is just very
> >> difficult => possible, so that if someone wants to do it they'll find a
> >> way. But after reading more and understanding more about how the views
> >> are executed using evalcx, I think the other problems aren't a big
> >> concern for me anymore. Is that correct?
> >>
> >> I have found this in the code, though: "if possible, use evalcx (not
> >> always available)". How can I check that evalcx is available on my
> >> system? Or is it just a note for older distributions, nothing to be
> >> concerned about anymore?
> >>
> >> Thank you
> >>
> >> Cheers
> >> Peter
> >>
> >> On Fri Nov 28 2014 at 1:37:57 AM Alexander Gabriel <alex@barbalex.ch> wrote:
> >>
> >> > Hi Peter
> >> >
> >> > Will the users create their own data structures too?
> >> > If not, this sounds like SQL on relational tables might be a better
> >> > tool for the problem.
> >> > It seems to me you're hitting exactly the weak point of most NoSQL
> >> > solutions.
> >> >
> >> > Alex
> >> >
> >> >
> >> > 2014-11-28 0:49 GMT+01:00 Peter Grman <peter.grman@gmail.com>:
> >> >
> >> > > Hi,
> >> > >
> >> > > this might sound like a terrible idea to someone who knows CouchDB,
> >> > > and if that's the case, please just take a minute or two to explain
> >> > > why. Otherwise, if the idea isn't so crazy after all, I hope I'll
> >> > > get some solutions to my problem:
> >> > >
> >> > > I'm thinking of creating a platform based on CouchDB, where each set
> >> > > of users (group, customer, ...) would get their own CouchDB database
> >> > > to store and query data. I heard in a podcast, roughly a year ago,
> >> > > that this is how CouchDB was meant to be used: many smaller
> >> > > databases.
> >> > >
> >> > > To query the data, I want to allow them to define their own custom
> >> > > queries. Now I could (and want to) create a form which allows them
> >> > > to build a query and translates it to a JS view, but I was thinking
> >> > > about additionally, on top of that, allowing them to define their
> >> > > custom views directly in JS. They would basically be allowed to
> >> > > define their custom Map/Reduce functions.
> >> > >
> >> > > There is a lot which can go wrong with this. The worst ones I came
> >> > > up with:
> >> > > - DoS attack with endless loops inside the function
> >> > > - DoS attack by emitting too much data (potentially in a loop again)
> >> > >
> >> > > As far as I've understood, it's not possible to access other
> >> > > databases from within the view. Is this understanding of mine
> >> > > correct?
> >> > >
> >> > > Is it possible to access the filesystem or network services in any
> >> > > way from the CouchDB view, or is the JavaScript engine which is
> >> > > running the code limiting enough?
> >> > >
> >> > > Are there any other things which could go wrong? Or did somebody
> >> > > actually already use CouchDB like this, and it's perfectly normal?
> >> > >
> >> > > Is there any way I could prevent the problem with endless loops and
> >> > > data emitting from happening? I can run JSLint, which maybe will
> >> > > detect an endless loop, but that won't help against a loop with a
> >> > > million iterations, which will be called for every item inside
> >> > > CouchDB and is still quite endless.
> >> > >
> >> > > Thank you for your help!
> >> > >
> >> > > Cheers,
> >> > > Peter
> >> > >
> >> >
> >>
> >
> >
>

