couchdb-user mailing list archives

From: Peter Grman <peter.gr...@gmail.com>
Subject: Re: Allow user-defined views
Date: Sat, 29 Nov 2014 11:50:58 GMT

Hi Justin,

For the early stage I'm only going to use it myself and invite some friends
who have companies, so manual vetting shouldn't be a problem. Later on, it
probably won't be feasible, because of the delay between a user submitting
a new query and that query being approved.

I'd rather have a system which allows some queries, which are most probably
safe, and only tells me to check those which look suspicious.

For instance, I could check the amount of space the query is consuming, as
one factor. This could even help the customers improve their queries. Just
last week, for example, I found out about the "include_docs" feature,
which would also help to keep the views small.
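
To illustrate why include_docs keeps a view small, here is a minimal
sketch that is not taken from the thread. It compares a map function that
emits the whole document with one that emits null and leaves fetching the
document to ?include_docs=true at query time. The emit() stub, the sample
document and all function names are assumptions made up for the
illustration; in CouchDB the view server provides emit() itself.

```javascript
// Stand-in for CouchDB's emit(); the real one is supplied by the view server.
let rows = [];
function emit(key, value) {
  rows.push({ key: key, value: value });
}

// Variant A: copies the whole document into the view index.
function mapFat(doc) {
  emit(doc.host, doc);
}

// Variant B: stores only the key; the document itself is fetched at
// query time by adding ?include_docs=true to the view request.
function mapLean(doc) {
  emit(doc.host, null);
}

// Rough per-document cost: bytes of JSON emitted by a map function.
function emittedBytes(mapFn, doc) {
  rows = [];
  mapFn(doc);
  return Buffer.byteLength(JSON.stringify(rows), "utf8");
}

// Hypothetical log document with a 500-character message field.
const sampleDoc = { _id: "log-1", host: "web01", message: "x".repeat(500) };
const fatBytes = emittedBytes(mapFat, sampleDoc);
const leanBytes = emittedBytes(mapLean, sampleDoc);
console.log({ fatBytes: fatBytes, leanBytes: leanBytes });
```

The byte counts only approximate the index size, since CouchDB's on-disk
format adds its own overhead, but the ratio between the two variants gives
a feel for the savings.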

Unfortunately, so far I haven't found anything which would tell me when a
view was last executed, or for how long it ran. Is there such a property?
Otherwise I could try to run the view functions manually to see how long
they take to execute.
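
Running the view functions manually, as suggested above, could look
roughly like the sketch below. It evaluates a map function given as source
text over a few sample documents inside Node.js's vm module, counts the
emit() calls, and uses the vm timeout option to cut off endless loops.
profileMap, its parameters and the sample documents are hypothetical names
for this illustration. The harness is not CouchDB's view server, so the
timings are only indicative, and Node's vm module is not a security
sandbox, so this is only suitable for a controlled test machine.

```javascript
const vm = require("vm");

// Run a user-supplied map function (source text) over sample documents.
// Returns the elapsed time, the number of emit() calls, and whether the
// run was aborted by the timeout.
function profileMap(mapSource, docs, timeoutMs) {
  let rows = 0;
  const sandbox = {
    docs: docs,
    // Counting stub; CouchDB's real emit() writes into the view index.
    emit: function () {
      rows += 1;
    },
  };
  const code =
    "var map = (" + mapSource + "\n);" +
    "for (var i = 0; i < docs.length; i++) { map(docs[i]); }";

  let timedOut = false;
  const start = process.hrtime.bigint();
  try {
    // The timeout option interrupts synchronous code such as while(true).
    vm.runInNewContext(code, sandbox, { timeout: timeoutMs });
  } catch (err) {
    const isTimeout =
      err &&
      (err.code === "ERR_SCRIPT_EXECUTION_TIMEOUT" ||
        /timed out/i.test(String(err.message)));
    if (!isTimeout) {
      throw err; // errors raised by the map function itself are passed on
    }
    timedOut = true;
  }
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { elapsedMs: elapsedMs, rows: rows, timedOut: timedOut };
}

// Hypothetical sample documents and two candidate views.
const sampleDocs = [{ host: "web01" }, { host: "web02" }];
console.log(
  profileMap("function (doc) { emit(doc.host, null); }", sampleDocs, 1000)
);
console.log(
  profileMap("function (doc) { while (true) {} }", sampleDocs, 50)
);
```

A view that times out, or whose row count is far larger than the number of
sample documents, would be a candidate for a manual check before it is
deployed.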

Thanks, Cheers
Peter

On Fri Nov 28 2014 at 7:50:38 AM <justin@lisol.co.uk> wrote:

> Hi Peter,
>
> An interesting concept ...
>
> This may sound simplistic, but is it viable for your application to
> initially have a process where new queries written are vetted by a human
> before they are run? The advantage of this is two-fold, namely:
>         i) you'll be able to move on and prove your concept quickly
>         ii) while doing this, you may learn enough (and things may change
>             enough) for you to automate the vetting process
>
> Thanks,
> Justin
>
> -----Original Message-----
> From: Peter Grman [mailto:peter.grman@gmail.com]
> Sent: 28 November 2014 02:12
> To: user@couchdb.apache.org
> Subject: Re: Allow user-defined views
>
> No, I don't. The program is meant for analysing logs (collected by
> fluentd). It should be open source and on GitHub; however, there isn't
> much done yet: https://github.com/logTank/
>
> The index rebuilding shouldn't be a problem, as CouchDB will only be used
> for general stats and the user won't actually see up-to-date data, but
> always data with a delay. That is another advantage of CouchDB: I can read
> the query results without bothering the system, and once the data is
> outdated, I can update the index. At least that is the theory so far; I'll
> need to run some performance tests to see if that actually works once I
> have an MVP. The other option is to use MongoDB for ad-hoc queries, but I
> was thinking that CouchDB will be more efficient as storage is so cheap.
>
> As I learn something new every time I look up info about CouchDB, and
> things become clearer, I'd also be glad about feedback on the idea in
> general and on how I want to use CouchDB.
>
> However, I'd also be very happy if I could somehow solve the problem with
> the possible DoS attacks :). Maybe there is something in CouchDB or evalcx
> which I can configure, such as a maximum runtime for a map/reduce function
> (it shouldn't be more than 1ms)? Or does CouchDB log some data about the
> resources required by views (CPU time + HDD space)?
>
> Cheers
> Peter
>
> On Fri Nov 28 2014 at 2:54:51 AM Alexander Gabriel <alex@barbalex.ch>
> wrote:
>
> > sorry for being off-topic
> > Alex
> >
> >
> > 2014-11-28 2:52 GMT+01:00 Alexander Gabriel <alex@barbalex.ch>:
> >
> > > sounds like a very interesting application
> > >
> > > seems like you don't care if the user has to wait for an index to be
> > > built when the user creates a query
> > >
> > > Alex
> > >
> > >
> > > 2014-11-28 2:23 GMT+01:00 Peter Grman <peter.grman@gmail.com>:
> > >
> > >> Hi Alex,
> > >>
> > >> Yes, the users would be able to import different sets of data, which
> > >> isn't relational, and use the platform to analyse it. The analysed
> > >> data would be in 99% of the cases append only (+ removing old data)
> > >> and the data can be defined by the user, as well as be hierarchical.
> > >>
> > >> When I thought about the system in the beginning, CouchDB seemed like
> > >> an awesome choice: as there would be only a couple of well-defined
> > >> queries and storage is generally cheap, I thought that CouchDB views
> > >> and their caching were what I was looking for.
> > >>
> > >> The problem is again only with people who want to trick the system. I
> > >> would also be happy with a solution which would detect bad views once
> > >> they have been deployed (use too much space, take too long to
> > >> compute) and deactivate and mark them for me to check. This way I
> > >> could check those few people who try a DoS attack and ban them from
> > >> the service.
> > >>
> > >> The other main question was whether it is really impossible to get
> > >> data from a different database inside the view, and whether the user
> > >> really can't access the underlying system, ..., or whether it is just
> > >> very difficult (=> possible: if someone wants to do it, they'll find
> > >> a way). But after reading more and understanding better how the views
> > >> are executed using evalcx, I think the other problems aren't a big
> > >> concern for me anymore. Is that correct?
> > >>
> > >> Although I've found in the code "if possible, use evalcx (not always
> > >> available)" - how can I check that evalcx is available on my system?
> > >> Or is it just a note for older distributions, nothing to be concerned
> > >> about anymore?
> > >>
> > >> Thank you
> > >>
> > >> Cheers
> > >> Peter
> > >>
> > >> On Fri Nov 28 2014 at 1:37:57 AM Alexander Gabriel
> > >> <alex@barbalex.ch>
> > >> wrote:
> > >>
> > >> > Hi Peter
> > >> >
> > >> > Will the users create their own data structures too?
> > >> > If not, this sounds like SQL on relational tables might be a better
> > >> > tool for the problem.
> > >> > It seems to me you're hitting exactly the weak point of most NoSQL
> > >> > solutions.
> > >> >
> > >> > Alex
> > >> >
> > >> >
> > >> > 2014-11-28 0:49 GMT+01:00 Peter Grman <peter.grman@gmail.com>:
> > >> >
> > >> > > Hi,
> > >> > >
> > >> > > this might sound like a terrible idea to someone who knows
> > >> > > CouchDB, and if that's the case, please just take a minute or two
> > >> > > to explain why. Otherwise, if the idea isn't so crazy after all,
> > >> > > I hope I'll get some solutions to my problem:
> > >> > >
> > >> > > I'm thinking of creating a platform based on CouchDB, where each
> > >> > > set of users (group, customer, ...) would get their own CouchDB
> > >> > > Database, to store and query data. I've heard in a podcast,
> > >> > > roughly a year ago, that this is how CouchDB was meant to be -
> > >> > > many smaller databases.
> > >> > >
> > >> > > To query the data, I want to allow them to define their own
> > >> > > custom queries. Now I could (and want to) create a form which
> > >> > > allows them to build a query and translates it to a JS view, but
> > >> > > I was thinking about additionally, on top of that, allowing them
> > >> > > to define their custom views directly in JS. They would basically
> > >> > > be allowed to define their custom Map/Reduce functions.
> > >> > >
> > >> > > There is a lot which can go wrong with this. The worst cases I
> > >> > > came up with:
> > >> > > - DoS attack with endless loops inside the function
> > >> > > - DoS attack by emitting too much data (potentially in a loop
> > >> > >   again)
> > >> > >
> > >> > > As far as I've understood, it's not possible to access other
> > >> > > Databases from within the view, is this understanding of mine
> > >> > > correct?
> > >> > >
> > >> > > Is it possible to access the filesystem or network services in
> > >> > > any way from the CouchDB view or is the JavaScript engine, which
> > >> > > is running the code, limiting enough?
> > >> > >
> > >> > > Are there any other things which could go wrong? - or did
> > >> > > actually somebody already use CouchDB like this, and it's
> > >> > > perfectly normal?
> > >> > >
> > >> > > Is there any way I could prevent the problem with endless loops
> > >> > > and data emitting from happening? - I can run JSLint, which maybe
> > >> > > will detect an endless loop, but that won't help against a loop
> > >> > > with a million iterations, which will be called for every item
> > >> > > inside CouchDB - still quite endless.
> > >> > >
> > >> > > Thank you for your help!
> > >> > >
> > >> > > Cheers,
> > >> > > Peter
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
>
