incubator-couchdb-dev mailing list archives

From: Brian Candler <B.Cand...@pobox.com>
Subject: Re: View Performance (was Re: The 1.0 Thread)
Date: Thu, 02 Jul 2009 13:20:47 GMT
My very simple benchmarking script is attached. I have run it on a 1.2GHz
Thinkpad X30 laptop running Ubuntu Jaunty, stock Erlang R12B-5, stock
ruby-1.8.7, and CouchDB 0.10.0a787397. Both the benchmark client and CouchDB
are running on the same machine.

A quick look at the results:

  *** Testing view performance as function of doc padding size
  --- doc padded with 1 bytes
  Insert 3000 simple documents...                                1.396 secs
  Retrieve documents...                                          2.092 secs
  Full view...                                                   3.252 secs
  View with ?limit=1...                                          2.877 secs
  --- doc padded with 1000 bytes
  Insert 3000 simple documents...                                3.232 secs
  Retrieve documents...                                          3.475 secs
  Full view...                                                   8.497 secs
  View with ?limit=1...                                          8.161 secs
  --- doc padded with 2000 bytes
  Insert 3000 simple documents...                                4.985 secs
  Retrieve documents...                                          4.777 secs
  Full view...                                                  10.412 secs
  View with ?limit=1...                                          9.963 secs

In this test each document emits only an integer key and a null value. The
padding exists in the source document but not in the emitted key/value, so
this is an attempt to measure only the overhead of JSON-encoding the docs and
sending them to the view server, since the emits are identical in each run.
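
For reference, the map function here is of the following shape (the field
name is illustrative, not necessarily what my script uses; the point is that
the padding field is never emitted):

  // emit one integer key and a null value per doc; the padding is ignored
  function(doc) {
    emit(doc.serial, null);
  }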

The difference in time between the full view and ?limit=1 is very small,
showing that the JSON-encoding and HTTP-transfer overhead of returning the
results is small. The figure with ?limit=1 is therefore essentially the time
to index the documents, without sending them back to the client.
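
Concretely, the two requests differ only in the limit parameter (database
and view names here are illustrative):

  GET /bench/_design/test/_view/by_serial           (index + return all 3000 rows)
  GET /bench/_design/test/_view/by_serial?limit=1   (index, but return only one row)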

Adding 1000 and then 2000 bytes of padding to each document increases the
insert time linearly (~1.8s per 3000*1000 bytes) and the retrieval time
linearly (~1.4s per 3000*1000 bytes).

Strangely, there is a jump in the indexing time: 5.3s for the first
additional 3000*1000 bytes, but only 1.8s for the next. But even taking the
lower figure, this suggests a transfer rate of only about 1.7MB/s between
Erlang and the JavaScript view server, including of course the JSON
serialisation and deserialisation overhead. I wonder which end is the
bottleneck? Comparing different view servers, and an integrated Erlang view
server, would be very interesting.
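
For the record, the arithmetic behind that 1.7MB/s figure:

  3000 docs * 1000 bytes ≈ 3.0 MB of extra padding per run
  3.0 MB / 1.8 s ≈ 1.7 MB/s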

  *** Performance as function of K/V pairs emitted (simple keys)
  Insert 3000 simple documents...                                1.450 secs
  1 K/V pairs per doc...                                         2.899 secs
  2 K/V pairs per doc...                                         4.136 secs
  3 K/V pairs per doc...                                         5.324 secs
  4 K/V pairs per doc...                                         6.459 secs
  5 K/V pairs per doc...                                         7.693 secs
  10 K/V pairs per doc...                                       13.522 secs

  *** Performance as function of K/V pairs emitted (compound keys)
  Insert 3000 compound documents...                              1.486 secs
  1 K/V pairs per doc...                                         3.575 secs
  2 K/V pairs per doc...                                         5.308 secs
  3 K/V pairs per doc...                                         6.926 secs
  4 K/V pairs per doc...                                         8.706 secs
  5 K/V pairs per doc...                                        10.471 secs
  10 K/V pairs per doc...                                       19.049 secs

This test attempts to measure the speed of emitting K/V pairs and inserting
them into the view. The views are all queried with ?limit=1 so that only the
indexing time is measured.

In the first run, simple integer keys are emitted. In the second, keys of
the form [x,y,z] are emitted, mixing strings, null and integers. I then vary
the number of emits per document in the design document. In both cases null
values are emitted.
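
The map functions are of this shape (shown here for 2 emits per doc; field
names illustrative):

  // simple keys: N integer keys per doc, null values
  function(doc) {
    emit(doc.serial, null);
    emit(doc.serial + 1, null);
  }

  // compound keys: [string, null, integer] per emit, null values
  function(doc) {
    emit([doc.name, null, doc.serial], null);
    emit([doc.name, null, doc.serial + 1], null);
  }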

Indexing time grows linearly: an additional 1.18s for each 3000 simple keys,
and an additional 1.72s for each 3000 compound keys. This suggests a peak
emit-and-insert rate of about 2540 simple keys or 1740 compound keys per
second.
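
That is:

  3000 keys / 1.18 s ≈ 2540 simple keys/s
  3000 keys / 1.72 s ≈ 1740 compound keys/s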

  *** Testing view performance as function of number of views
  Insert 3000 simple documents...                                1.391 secs
  Temp view...                                                   3.521 secs
  Temp view...                                                   0.553 secs
  Temp view...                                                   0.629 secs
  1 real view...                                                 3.267 secs
  2 identical real views...                                      3.248 secs
  3 identical real views...                                      3.961 secs
  4 identical real views,limit=1...                              3.077 secs
  2 different real views...                                      5.162 secs
  3 different real views...                                      6.872 secs
    second view in same ddoc...                                  0.375 secs
    third view in same ddoc...                                   0.372 secs
  1 real + 1 dummy views...                                      3.799 secs
  1 real + 2 dummy views...                                      4.027 secs
  1 real + 3 dummy views...                                      4.418 secs
    dummy view in same ddoc...                                   0.024 secs

Here we can see that CouchDB is quite clever with view handling. Firstly, if
you submit exactly the same request repeatedly to a temp view, the existing
view is reused. Secondly, if you create a design document in which multiple
views have exactly the same map code, only a single index is created.
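
For example, a design document like this (names illustrative) builds only
one index, since both views have identical map source:

  {
    "_id": "_design/bench",
    "views": {
      "v1": { "map": "function(doc) { emit(doc.serial, null); }" },
      "v2": { "map": "function(doc) { emit(doc.serial, null); }" }
    }
  }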

However, views which differ even slightly are each processed separately.
Each different real view (one which emits K/V pairs) adds about 1.8 secs to
the overall time. In this case I didn't add limit=1 except where noted, but
since only one view is downloaded, the 1.8 secs is the overhead of building
the additional index.

Dummy views are "function(doc){}". Each one adds a further 0.4 secs to the
overall time, linearly. This is the time for the JavaScript engine to pass
all the docs through an additional view function which does nothing.

  *** Testing view performance with reduce functions (simple keys)
  Insert 3000 simple documents...                                1.392 secs
  no reduce...                                                   2.897 secs
  null reduce...                                                 3.606 secs
  counter reduce...                                              3.659 secs
  min/max reduce...                                              3.951 secs
  banded reduce...                                               3.861 secs

  *** Testing view performance with reduce functions (compound keys)
  Insert 3000 compound documents...                              1.565 secs
  no reduce...                                                   4.883 secs
  null reduce...                                                 5.831 secs
  counter reduce...                                              5.881 secs
  min/max reduce...                                              6.578 secs
  banded reduce...                                               6.290 secs
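
For reference, the reduce variants are of roughly this shape (my
reconstructions, not the exact script; sum() is a helper provided to reduce
functions by the JavaScript view server):

  // null reduce: does no real work, measures the call overhead only
  function(keys, values, rereduce) {
    return null;
  }

  // counter reduce: counts the rows under each key group
  function(keys, values, rereduce) {
    if (rereduce) return sum(values);
    return values.length;
  }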

Reduce functions seem to behave well: the overhead of a reduce is
significantly less than that of the map, adding only 0.7-0.9s to reduce 3000
documents, versus 2.9-4.9s to map them. This is interesting. It may reflect:

(a) the overhead of inserting the results into the Btree (one per document
in the case of map, but only one per group of documents in the case of
reduce);

and/or

(b) the fact that the K/V pairs are sent in batches to the reduce function,
for a single context switch, versus sending documents individually to the
map function (the "map_doc" call in main.js). It may be worth experimenting
with a "map_docs" call to send groups of documents for mapping.

In real life, the mapped items are likely to be smaller than the original
docs. However, in this test the original docs themselves are also very
small, so I doubt there is much difference in this regard.

So, to look at the storage and transfer overhead, I added a dummy value to
each emit (the value is a 32-byte string instead of null).
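
The emit thus becomes something like this (pad string illustrative):

  function(doc) {
    // value is now a 32-byte string rather than null
    emit([doc.name, null, doc.serial], "0123456789abcdef0123456789abcdef");
  }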

  *** Testing view performance with reduce (compound key, emit pad value)
  Insert 3000 compound documents...                              1.497 secs
  no reduce...                                                   5.150 secs
  null reduce...                                                 6.327 secs
  counter reduce...                                              6.407 secs
  min/max reduce...                                              7.309 secs
  banded reduce...                                               6.914 secs

The mapping time has gone up by ~0.3s, and the reduce time by ~0.6s. I would
need to send a larger pad value or iterate more times to make this an
accurate measurement. But it seems reasonable: for map, the larger data is
transferred once (from the JS engine to Couch) and then inserted into the
Btree. For reduce, the larger data is transferred twice: from the JS engine
to Couch (map output), and from Couch back to JS (reduce input), where in
this case it is discarded. This suggests that the system is reasonably well
balanced in terms of encoding/decoding in both directions.

OK, well, I'm not sure how much this adds to the debate :-) But perhaps
someone can read more into these numbers than I can.

Regards,

Brian.
