couchdb-dev mailing list archives

From Jan Lehnardt <...@apache.org>
Subject Re: Deno Query Server Demo
Date Fri, 15 May 2020 10:39:08 GMT
This isn’t a scientific benchmark, but to get a feel for things, I ran timing tests with
three different document size classes:

- 100 bytes
- 512 bytes
- 1024 bytes

I’m using our trusty benchbulk.sh[1] script, so the majority of the data is a single long
value field with loads of `0`s in it. In no way representative, but a quick way to produce
1M docs.

I’m running this on an 8 core Mac Mini with a very fast SSD, three runs per version. This
is on my regular work machine, so other things are going on, but the timings are surprisingly
stable to the second.

I measured how long it takes to build an index over 1M documents in a q=2 database, to leave
enough cores for CouchDB and whatever else is going on on the box. At no point were CPU,
RAM, or disk IO capacity maxed out.

Repeated and interleaved runs should shake out any file system caching variability (which
I couldn’t observe anyway).
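
A minimal sketch of how such interleaved timing runs could be scripted. This is not the script used for the numbers below; `timeInterleaved`, `median`, and the shape of `variants` are made up for illustration, and each variant is assumed to be an async function that triggers one full index build and resolves when it is done.

```javascript
// Hypothetical harness: time several variants in interleaved order.
// `variants` maps a label to an async function that runs one full index build.
// `now` is injectable so the harness can be exercised without a real clock.
async function timeInterleaved(variants, runs = 3, now = () => Date.now()) {
  const timings = {};
  for (const label of Object.keys(variants)) timings[label] = [];
  for (let i = 0; i < runs; i++) {
    // One run of each variant per round, so caching effects spread evenly.
    for (const [label, build] of Object.entries(variants)) {
      const start = now();
      await build();
      timings[label].push(now() - start);
    }
  }
  return timings;
}

// The median is less sensitive than the mean to one noisy run on a busy machine.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

Running the variants round-robin rather than back to back means any warm-up or file system caching effect hits each variant equally.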

100 byte docs:

couchjs is ~10% faster than deno, while using roughly 40% less CPU (40% vs. 70%). couchjs RAM
usage is rather erratic, jumping from 20MB to 180MB and back periodically. deno RAM grows very
slowly, maxing out at 110MB at the end of the run, so I presume whatever long-generation GC it
has isn’t even kicking in yet.

512 byte docs:

couchjs is ~5% faster than deno, same CPU and RAM profiles.

1024 byte docs:

couchjs is 20% slower(!) than deno, same CPU and RAM profiles.

At 512 and 1024 byte docs, deno makes beam.smp work a little harder, about 5% CPU usage.

All of the runs take between 1 and 2 minutes, so longer-running impacts aren’t showing here.

As you can see, this is very unscientific, but gives us an interesting direction.

Depending on the workload, the deno query server *might* lead to faster indexing on larger
docs, while making potentially better use of available CPU resources (or less euphemistically:
at the expense of using more CPU time), and with a much more stable RAM profile.

Given that I was able to put this together relatively quickly, and deno is very new, I find
this rather promising.

In addition, since this query server is rather easy to distribute (install deno, download
the .js file, set an env var, done), this might be a nice community alternative to couchjs
for folks who see benefits.

I’d also like to see us taking this to the deno folks to see if they have anything up their
sleeve in terms of speeding up stdio, or if there are tricks we can pull on the JS side.
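
For context, the stdio path in question is line-oriented: CouchDB writes one JSON command per line and reads one JSON reply per line. Below is a minimal, runtime-agnostic sketch of the reading half. It is not the code from the gist; `readLines` is a made-up name, and it assumes stdin is available as an async iterable of `Uint8Array` chunks. How those chunks are obtained from deno is left out on purpose.

```javascript
// Turn any async iterable of Uint8Array chunks (such as a stdin stream)
// into an async iterable of complete lines, without the trailing newline.
async function* readLines(chunks) {
  const decoder = new TextDecoder();
  let buffered = "";
  for await (const chunk of chunks) {
    // stream: true keeps multi-byte characters intact across chunk borders.
    buffered += decoder.decode(chunk, { stream: true });
    let newline;
    while ((newline = buffered.indexOf("\n")) !== -1) {
      yield buffered.slice(0, newline);
      buffered = buffered.slice(newline + 1);
    }
  }
  buffered += decoder.decode(); // flush any bytes still pending in the decoder
  if (buffered.length > 0) yield buffered;
}
```

Chunk size, string concatenation, and per-line JSON parsing are the obvious places to look for JS-side tricks in a loop like this.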

* * *

One more interesting point I didn’t mention last night: this is written entirely in JS on
top of an existing runtime, as opposed to couchjs, where we currently maintain a C and a C++
integration layer that nobody likes touching.

A pure-JS implementation, and my cleaned up (albeit less feature-full) ~500LoC of relatively
modern JS might lead to renewed innovation in the space. Who doesn’t like a well-defined
performance game :)
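
To give a flavour of how small the core of a pure-JS query server can be, here is a minimal sketch covering only the `reset`, `add_fun` and `map_doc` commands. It is not the code from the gist: `createQueryServer` and `handleLine` are made-up names, the error reply shape is only illustrative, and `new Function` is used for brevity without any sandboxing of the user code.

```javascript
// Hypothetical core of a query server: one JSON command in, one JSON reply out.
function createQueryServer() {
  let mapFuns = []; // compiled map functions for the current view group
  let emitted = []; // rows collected by emit() for the function being run

  const emit = (key, value) => emitted.push([key, value]);

  const commands = {
    // Forget all state before a new view group is indexed.
    reset() {
      mapFuns = [];
      return true;
    },
    // Compile the source text of a map function, with emit() in scope.
    add_fun(source) {
      mapFuns.push(new Function("emit", `return (${source});`)(emit));
      return true;
    },
    // Run every registered map function against one document and
    // return one list of [key, value] rows per function.
    map_doc(doc) {
      return mapFuns.map((fun) => {
        emitted = [];
        fun(doc);
        return emitted;
      });
    },
  };

  return function handleLine(line) {
    const [name, ...args] = JSON.parse(line);
    const command = commands[name];
    if (!command) {
      // Illustrative error shape only.
      return JSON.stringify({ error: "unknown_command", reason: name });
    }
    return JSON.stringify(command(...args));
  };
}
```

Feeding `handleLine` the lines `["reset"]`, `["add_fun", "function (doc) { emit(doc._id, 1); }"]` and then `["map_doc", {"_id": "a"}]` exercises the whole map path without CouchDB being involved.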

Plus, it’d be interesting to see if the TypeScript compiler could add more optimisations
once the query server implementation is translated and type-annotated to be proper TypeScript.

Best
Jan
—
[1]: https://github.com/apache/couchdb/blob/master/test/bench/benchbulk.sh

> On 14. May 2020, at 22:01, Jan Lehnardt <jan@apache.org> wrote:
> 
> Hey all,
> 
> I got nerd sniped by Joan this morning:
> 
>    <+Wohali> hmmmmm. https://github.com/denoland/deno
>    <+Wohali> i know i know another runtime but it's focused on security
> 
> I wondered what it would take to make a couchjs variant based on deno. Turns out: about
> a day if you cut some corners ;)
> 
> One of the interesting aspects, as Joan notes, is that it is more secure by default, so I
> have some hopes that this might work out better than our ill-fated nodejs query server
> experiment from a few years back.
> 
> I started by hacking up a readily generated main.js, then ran `make` again, and did it
> all again. Overall, it is ~30 LOC of changes. Since there is no synchronous `readline()`
> available and JS code can either be sync or async, we can’t make it so one source could
> run in our couchjs or deno.
> 
> So I went ahead and ripped all the basics out of our main.js and modernised things a
> little bit along the way. The result is a main-deno.js that can run
> map/reduce/rereduce/filter/view_filter/validate_doc_update functions (as validated by the
> query server spec).
> 
>    https://gist.github.com/janl/c3139bc72efe663e35005d8864c4201f
> 
> I intentionally left out the couchappy functions, as at least lists with the `getRow()`
> function won’t be implementable without an API break. I also left out legacy compat with
> esprima/escodegen to keep things more manageable. Oh and no lib/modules: given today’s JS
> packaging tooling, it’s an easy choice to leave out.
> 
> I haven’t done any sort of benchmarking, but I’d love for someone here to give this
> a try. Here’s how to hack up `./dev/run` to add support for `deno` design docs:
> 
>   https://gist.github.com/janl/01559f8617ef44afd5ceec39ec8389e8
> 
> If you want to run this on a regular CouchDB setup, set up this env var before launching
> CouchDB:
> 
>    COUCHDB_QUERY_SERVER_DENO="deno run --allow-write /path/to/main-deno.js"
> 
>    `--allow-write` is only required for the debug log (/tmp/deno-qs.log), but won’t
>    be required during operation, adding to the sandboxed nature of it all.
> 
> And some proof of operation:
> 
>   https://gist.github.com/janl/8636d469420a1fd2de481ae8f5780854
> 
> It’d be nice to see how stable this is in practice and if there are any meaningful
> performance / resource-usage differences. Any takers? I’ll answer any and all setup
> questions.
> 
> Now I’m passing the nerd-snipe torch to Paul:
> 
>    <+jan____> uh, and it is embeddable https://deno.land/manual/embedding_deno
> 
> Best
> Jan
> —

