couchdb-user mailing list archives

From Brian McQueen <mcqueenor...@gmail.com>
Subject Re: [ANN] jqouch, a jq-based view server
Date Mon, 30 Mar 2015 16:44:16 GMT
Nice! jq is a great tool, and plugging it in like that is quite nice.

On Sun, Mar 29, 2015 at 6:14 AM, Matthieu Rakotojaona <
matthieu.rakotojaona@gmail.com> wrote:

> Hey Alexander,
>
> I don't think I'll re-implement jq in pure Golang. It could be an
> interesting exercise in lexing/parsing, but I'm not sure I'd make it to
> the end.
>
> Using the C API, though, is the next step!
>
> Excerpts from Alexander Shorin's message of 2015-03-29 02:17:43 +0300:
> > I knew that someone would make a jq query server, and here it is. Nice
> > work, Matthieu!
> >
> > Do you plan to implement jq in Golang? That would significantly improve
> > your query server and would allow others to embed jq into their apps.
> > --
> > ,,,^..^,,,
> >
> >
> > On Sat, Mar 28, 2015 at 6:12 PM, Matthieu Rakotojaona
> > <matthieu.rakotojaona@gmail.com> wrote:
> > > Hello guys,
> > >
> > > I'd like to announce a jq-based view server for couchdb. It's extremely
> > > rudimentary, but works as a proof of concept of what can be achieved:
> > >
> > > https://github.com/rakoo/jqouch
> > >
> > > A bit of background: jq is a CLI tool to extract and render information
> > > from any JSON you give it, with a custom but powerful syntax:
> > >
> > > $ curl localhost:5984 | jq '.vendor .version'
> > > "1.6.1"
> > >
> > > $ curl localhost:5984/mydb | jq '.disk_size - .data_size'
> > > 80892224
> > >
> > > Looks like I'd better compact!
> > >
> > > If you're dabbling with JSON and not using jq already, I encourage you
> > > to check it out.
> > >
> > > Basically jq is invoked with a filter (that's the '.vendor .version'
> > > from the example above); you then feed jq a JSON document on stdin,
> > > and it writes all matches and transformations to stdout. jqouch works
> > > by taking the function given in "add_fun", spawning an external jq
> > > process with this function as its filter, and forwarding the documents
> > > received in "map_doc" to it. All output from jq is then sent back to
> > > CouchDB through jqouch. (The jq processes are not killed after each
> > > doc; they stay alive as long as their stdin is not closed, which jqouch
> > > never does until it dies.)
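The flow described above can be sketched as a small dispatcher for the three commands jqouch handles. This is only a sketch in JavaScript, not jqouch's actual code (jqouch is a separate program that spawns real jq processes). The names `makeServer`, `runFilter` and `fakeJq` are hypothetical, `runFilter` stands in for writing a doc to a long-lived jq process and reading back its output, and the reply for unsupported commands is illustrative.

```javascript
// Minimal sketch of a CouchDB view-server command dispatcher.
// Assumption: runFilter(filterSource, doc) stands in for piping the doc to a
// long-lived jq process started with filterSource; it must return the list
// of [key, value] pairs that jq would print for that doc.
function makeServer(runFilter) {
  let filters = []; // one entry per map function registered via add_fun

  // Handle one decoded command, e.g. ["map_doc", {...}], and return the reply.
  function handle(command) {
    const [name, arg] = command;
    switch (name) {
      case "reset":
        filters = []; // forget all previously registered functions
        return true;
      case "add_fun":
        filters.push(arg); // arg is the filter source text
        return true;
      case "map_doc":
        // One list of [key, value] pairs per registered function, in order.
        return filters.map((source) => runFilter(source, arg));
      default:
        // Illustrative reply shape for anything this sketch does not support.
        return ["error", "unknown_command", String(name)];
    }
  }

  // Commands and replies travel as one JSON value per line.
  return (line) => JSON.stringify(handle(JSON.parse(line)));
}

// Stand-in for jq that behaves like the filter "[ [.title, null] ]".
const fakeJq = (source, doc) => [[doc.title, null]];

const server = makeServer(fakeJq);
const replies = [
  '["reset"]',
  '["add_fun", "[ [.title, null] ]"]',
  '["map_doc", {"_id": "a", "title": "Hello"}]',
].map(server);
console.log(replies.join("\n"));
```

Keeping the transport (one JSON value per line) separate from the command handling makes it easy to swap `fakeJq` for something that talks to a real jq process.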
> > >
> > > I have included some examples in the repo; here they are. I'm using
> > > documents from a dump of... I don't know exactly what, but a sample is
> > > here:
> > >
> > > https://github.com/rakoo/jqouch/blob/master/sample.json
> > >
> > > taken from http://parltrack.euwiki.org/dumps/eurlex.json.xz. That's
> > > 22925 documents. I made some benchmarks on CouchDB 1.6:
> > >
> > > Here's a really simple view in js:
> > >
> > >     function(doc) {
> > >       emit(doc.title, null)
> > >     }
> > >
> > > it maps all docs in ~ 35s
> > >
> > > And the equivalent in jq:
> > >
> > >     [ [.title, null] ]
> > >
> > > it maps all docs in ~ 19s
> > >
> > > Each map function returns a list of key/value pairs; there's no emit()
> > > anymore. This is actually the format a query server has to return for
> > > each mapping function. It may not be ideal, but it works.
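To make the difference concrete: with couchjs the rows come from calls to emit(), while with jq the filter's output is the list of rows itself. Here is a small sketch in JavaScript showing that both end up as the same list of [key, value] pairs. `collectEmits` is a hypothetical helper (not part of couchjs or jqouch), and the sample document is made up.

```javascript
// Hypothetical helper: run a couchjs-style map function and gather what it
// emits into the list-of-pairs shape that a query server returns.
function collectEmits(mapFn, doc) {
  const rows = [];
  const emit = (key, value) => rows.push([key, value]);
  mapFn(doc, emit); // emit is passed explicitly here; couchjs provides it as a global
  return rows;
}

// The JS view from above, with emit taken as a parameter for this sketch.
const jsView = (doc, emit) => {
  emit(doc.title, null);
};

// What the jq filter [ [.title, null] ] builds directly: the list itself.
const jqStyleView = (doc) => [[doc.title, null]];

const doc = { _id: "x", title: "Some title" }; // made-up sample document
const fromEmit = collectEmits(jsView, doc);
const fromList = jqStyleView(doc);
console.log(JSON.stringify(fromEmit)); // [["Some title",null]]
console.log(JSON.stringify(fromEmit) === JSON.stringify(fromList)); // true
```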
> > >
> > > Here's another, more "useful" pair of views:
> > >
> > >   function(doc) {
> > >     for (var i = 0; i < doc.dates.length; i++) {
> > >       emit([doc.dates[i].type, doc.dates[i].date], null)
> > >     }
> > >   }
> > >
> > > runs in ~ 32s
> > >
> > >     [ .dates[] | [[.type, .date], null] ]
> > >
> > > runs in ~ 19s
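For readers new to jq, the filter above can be read roughly as the following JavaScript. This is only an illustration of what the filter builds, not how jq evaluates it; `datesView` is a hypothetical name and the sample document is made up to match the fields the view uses.

```javascript
// Rough JavaScript reading of the jq filter
//   [ .dates[] | [[.type, .date], null] ]
// ".dates[]" iterates over the array, the pipe builds one row per element,
// and the outer brackets collect the rows into a single list.
function datesView(doc) {
  return doc.dates.map((d) => [[d.type, d.date], null]);
}

// Made-up document with the same shape as the fields used by the view.
const sampleDoc = {
  dates: [
    { type: "adopted", date: "2001-01-01" },
    { type: "published", date: "2001-02-01" },
  ],
};

const dateRows = datesView(sampleDoc);
console.log(JSON.stringify(dateRows));
// [[["adopted","2001-01-01"],null],[["published","2001-02-01"],null]]
```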
> > >
> > >
> > >
> > >
> > > There are a few things we can say:
> > >
> > > * For all 4 pairs of example views (see repo), jq is consistently
> > >   almost twice as fast as the equivalent js. Moreover, the couchjs
> > >   process always eats a large part of my CPU when running, whereas the
> > >   jq process never goes over 30%. This indicates that some overhead is
> > >   spent on passing documents between processes, which I'm going to
> > >   investigate with the jq C API.
> > >
> > > * jq views can be hard to understand and write, but they can be tested
> > >   through the cli jq tool directly, or even online with jqplay
> > >   (https://jqplay.org/)
> > >
> > > * by default, jq doesn't (AFAIK) allow a view to output
> > >   non-deterministic values
> > >
> > > * jq is "sandboxed" in that it can't do anything other than transform
> > >   documents, contrary to standard languages
> > >
> > > * jq filters are, in my opinion, very clear about what they do, such
> > >   that a one-line filter can be enough in most cases
> > >
> > > Of course, it's not all rainbows and unicorns:
> > >
> > > * there are still some quirks in the jq views: they can output
> > >   something like [null, null] when they should not return anything
> > >   because the view doesn't apply to the doc.
> > >
> > > * jqouch currently doesn't understand anything other than "reset",
> > >   "add_fun" and "map_doc"
> > >
> > > * I don't see the jq language as being enough for more generic
> > >   functions such as show and list, but who knows
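On the [null, null] quirk in the list above, one possible workaround is to drop such rows before replying to CouchDB. The sketch below is hypothetical and not something jqouch is known to do; it assumes the spurious output is a row whose key and value are both null, and it would also drop any row a view emits that way on purpose.

```javascript
// Hypothetical post-processing step: drop rows whose key and value are both
// null, such as the [null, null] a filter yields for a doc it does not apply to.
function dropEmptyRows(rows) {
  return rows.filter(([key, value]) => !(key === null && value === null));
}

const rawRows = [[null, null], ["Some title", null]];
const cleanedRows = dropEmptyRows(rawRows);
console.log(JSON.stringify(cleanedRows)); // [["Some title",null]]
```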
> > >
> > > Anyway, there may be some value in using jq to define basic views, the
> > > ones that just index a document on some value and don't do much more.
> > > As a non-serious CouchDB user I've never had to use really fancy views.
> > >
> > > Thoughts?
>



-- 
the news wire of the 21st century - twitchy.com
