couchdb-dev mailing list archives

From Nick Vatamaniuc <vatam...@gmail.com>
Subject Re: Deno Query Server Demo
Date Fri, 15 May 2020 15:27:51 GMT
That's really cool, Jan!  Thanks for sharing.

Deno is based on v8 so it is interesting to see how it compares in
performance. Though, you're probably right that it could be just stdio
limited here.

Definitely like the simplicity and 500LoC part, too.

Cheers,
-Nick

On Fri, May 15, 2020 at 6:39 AM Jan Lehnardt <jan@apache.org> wrote:
>
> This is not a scientific benchmark, but to get a feel for things,
I ran timing tests with three different document size classes:
>
> - 100 bytes
> - 512 bytes
> - 1024 bytes
>
> I’m using our trusty benchbulk.sh[1] script, so the majority of the data in each doc is a single
long value field with loads of `0`s in it. In no way representative, but quick to produce
1M docs.
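
As a rough illustration of that kind of test data, here is a minimal JavaScript sketch that pads a single field with `0`s until the serialized document reaches a target size. It is not a port of benchbulk.sh: the function names and the `value` field name are assumptions made for this example only.

```javascript
// Hypothetical helpers, not taken from benchbulk.sh.

// Build a document whose JSON form is `targetBytes` long (ASCII only, so
// characters and bytes coincide) by padding one field with "0" characters.
function makePaddedDoc(targetBytes) {
  const overhead = JSON.stringify({ value: "" }).length; // size of the empty shell
  const padding = Math.max(0, targetBytes - overhead);
  return { value: "0".repeat(padding) };
}

// Wrap `count` such documents in a body shaped for CouchDB's _bulk_docs endpoint.
function makeBulkBody(count, targetBytes) {
  const docs = [];
  for (let i = 0; i < count; i++) docs.push(makePaddedDoc(targetBytes));
  return { docs };
}
```

CouchDB adds `_id` and `_rev` on insert, so the stored documents end up slightly larger than the target size.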
>
> I’m running this on an 8 core Mac Mini with a very fast SSD, three runs per version.
This is on my regular work machine, so other things are going on, but the timings are surprisingly
stable to the second.
>
> I measured how long it takes to build an index over 1M documents in a q=2 database, to
leave enough cores for CouchDB and whatever else is going on on the box. At no point are CPU
or RAM maxed out, neither is the disk IO capacity.
>
> Repeated and interleaved runs should shake out any file system caching variability (which
I couldn’t observe anyway).
>
> 100 byte docs:
>
> couchjs is ~10% faster than deno, while using ~60% less CPU (40% vs. 70%). couchjs RAM usage
is rather erratic: it jumps from 20MB to 180MB and back periodically. deno RAM grows very slowly,
maxing out at 110MB at the end of the run, so I presume whatever long-generational GC isn’t
even kicking in yet.
>
> 512 byte docs:
>
> couchjs is ~5% faster than deno, same CPU and RAM profiles.
>
> 1024 byte docs:
>
> couchjs is 20% slower(!) than deno, same CPU and RAM profiles.
>
> At 512 and 1024 byte docs, deno makes beam.smp work a little harder, about 5% CPU usage.
>
> All of the runs take between 1 and 2 minutes, so longer-running impacts aren’t showing
here.
>
> As you can see, this is very unscientific, but gives us an interesting direction.
>
> Depending on the workload, the deno query server *might* lead to faster indexing, on
larger docs, while making potentially better use of available CPU resources (or less euphemistically:
at the expense of using more CPU time), and with a lot more stable RAM profile.
>
> Given that I was able to put this together relatively quickly, and deno is very new,
I find this rather promising.
>
> In addition, since it is rather easy distributing this query server (install deno, download
the .js file, set an env var, done), this might be a nice community alternative to couchjs
for folks who see benefits.
>
> I’d also like to see us taking this to the deno folks to see if they have anything
up their sleeve in terms of speeding up stdio, or if there are tricks we can pull on the JS
side.
>
> * * *
>
> One more interesting point I didn’t mention last night: this is entirely in JS based
on an existing runtime. As opposed to couchjs, where we currently maintain a C and a C++ integration
layer that nobody likes touching.
>
> A pure-JS implementation, and my cleaned up (albeit less feature-full) ~500LoC of relatively
modern JS might lead to renewed innovation in the space. Who doesn’t like a well-defined
performance game :)
>
> Plus, it’d be interesting to see if the TypeScript compiler could add more optimisations
once the query server implementation is translated and type-annotated to be proper TypeScript.
>
> Best
> Jan
> —
> [1]: https://github.com/apache/couchdb/blob/master/test/bench/benchbulk.sh
>
> > On 14. May 2020, at 22:01, Jan Lehnardt <jan@apache.org> wrote:
> >
> > Hey all,
> >
> > I got nerd sniped by Joan this morning:
> >
> >    <+Wohali> hmmmmm. https://github.com/denoland/deno
> >    <+Wohali> i know i know another runtime but it's focused on security
> >
> > I wondered what it would take to make a couchjs variant based on deno. Turns out:
about a day if you cut some corners ;)
> >
> > One of the interesting aspects, as Joan notes, is that it is more secure by default, so
I have some hopes that this might work out better than our ill-fated nodejs query server experiment
from a few years back.
> >
> > I started by hacking up a readily generated main.js, then ran `make` again, and
did it all again. Overall, it is ~30 LOC changes. Since there is no synchronous `readline()`
available and JS code can either be sync or async, we can’t make it so that one source runs
in both our couchjs and deno.
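
To make the sync/async split concrete, here is a minimal sketch of reading newline-delimited input asynchronously. It is not code from main-deno.js: all names are hypothetical, and the input is modelled as a generic iterable of text chunks so the sketch does not depend on any particular runtime's stdin API.

```javascript
// Hypothetical helper: buffers text chunks and returns each complete line.
function createLineSplitter() {
  let buffer = "";
  return function push(chunk) {
    buffer += chunk;
    const parts = buffer.split("\n");
    buffer = parts.pop(); // keep the trailing partial line for the next chunk
    return parts;
  };
}

// Async wrapper: `chunks` is any async (or sync) iterable of strings.
// Input that does not end in a newline is left in the buffer and never
// yielded, which is acceptable for a newline-terminated protocol.
async function* readLines(chunks) {
  const push = createLineSplitter();
  for await (const chunk of chunks) {
    yield* push(chunk);
  }
}

// Every consumer has to be async as well, whereas a loop around a blocking
// readline() is plain synchronous code.
async function collectLines(chunks) {
  const lines = [];
  for await (const line of readLines(chunks)) lines.push(line);
  return lines;
}
```

Because `await` propagates up through every caller, the main loop has a different shape from a synchronous one, which is why the two variants are hard to share in one file.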
> >
> > So I went ahead and ripped all the basics out of our main.js and modernised things
a little bit along the way. The result is a main-deno.js that can run map/reduce/rereduce/filter/view_filter/validate_doc_update
functions (as validated by the query server spec).
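
For readers unfamiliar with what such a query server does, here is a heavily simplified sketch of a dispatcher for the `reset`, `add_fun`, `map_doc`, `reduce` and `rereduce` commands. It follows my understanding of the documented query server protocol and is not taken from the linked gist; every name is hypothetical, and sandboxing, logging, error handling and the filter and validation commands are all left out.

```javascript
// Hypothetical sketch of a query server command dispatcher.
function createQueryServer() {
  let mapFuns = [];
  let emitted = [];
  const emit = (key, value) => emitted.push([key, value]);

  // Compile function source with `emit` in scope. No sandboxing here; a real
  // server must treat design document code as untrusted.
  const compile = (source) => new Function("emit", `return (${source});`)(emit);

  return function handle([command, ...args]) {
    switch (command) {
      case "reset":
        mapFuns = [];
        return true;
      case "add_fun":
        mapFuns.push(compile(args[0]));
        return true;
      case "map_doc":
        // One list of [key, value] pairs per registered map function.
        return mapFuns.map((fun) => {
          emitted = [];
          fun(args[0]);
          return emitted;
        });
      case "reduce": {
        // Rows arrive as [[key, docId], value].
        const [sources, rows] = args;
        const keys = rows.map(([key]) => key);
        const values = rows.map(([, value]) => value);
        return [true, sources.map((src) => compile(src)(keys, values, false))];
      }
      case "rereduce": {
        const [sources, values] = args;
        return [true, sources.map((src) => compile(src)(null, values, true))];
      }
      default:
        // Error shape is an assumption for this sketch.
        return ["error", "unknown_command", String(command)];
    }
  };
}
```

A session would feed each parsed JSON line from stdin to `handle` and write the JSON-encoded return value back to stdout, one line per response.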
> >
> >    https://gist.github.com/janl/c3139bc72efe663e35005d8864c4201f
> >
> > I intentionally left out the couchappy functions, as at least lists with the `getRow()`
function won’t be implementable without an API break. I also left out legacy compat with
esprima/escodegen to keep things more manageable. Oh, and no lib/modules: given today’s JS
packaging tooling, it’s an easy choice to leave out.
> >
> > I haven’t done any sort of benchmarking, but I’d love for someone here to give
this a try. Here’s how to hack up `./dev/run` to add support for `deno` design docs:
> >
> >   https://gist.github.com/janl/01559f8617ef44afd5ceec39ec8389e8
> >
> > If you want to run this on a regular CouchDB setup, set up this env var before launching
CouchDB:
> >
> >    COUCHDB_QUERY_SERVER_DENO="deno run --allow-write /path/to/main-deno.js"
> >
> >    `--allow-write` is only required for the debug log (/tmp/deno-qs.log), but won’t
be required during operation, adding to the sandboxed nature of it all.
> >
> > And some proof of operation:
> >
> >   https://gist.github.com/janl/8636d469420a1fd2de481ae8f5780854
> >
> > It’d be nice to see how stable this is in practice and if there are any meaningful
performance / resource-usage differences. Any takers? I’ll answer any and all setup questions.
> >
> > Now I’m passing the nerd-snipe torch to Paul:
> >
> >    <+jan____> uh, and it is embeddable https://deno.land/manual/embedding_deno
> >
> > Best
> > Jan
> > —
>
