couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alexander Shorin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COUCHDB-1986) 04-replication-large-atts.t times out
Date Wed, 26 Feb 2014 12:56:27 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13912841#comment-13912841
] 

Alexander Shorin commented on COUCHDB-1986:
-------------------------------------------

Ok, here I got some numbers for the update. Long post.

- FreeBSD 9.1 with Erlang R15B02, clean install (vbox, 512 MB, 2 CPU)
- FreeBSD 10 with Erlang R16B02, clean install (vbox, 512 MB, 2 CPU)

I ran 04-replication-large-atts.t test on all of them:

{code}
$ ./test/etap/run src/couch_replicator/test/04-replication-large-atts.t
{code}

And there is the results for the [master's HEAD|https://github.com/apache/couchdb/commit/5d17c204]:

- FreeBSD 9.1: {{Files=1, Tests=1192, 47 wallclock secs ( 0.19 usr  0.20 sys + 17.39 cusr
18.25 csys = 36.03 CPU)}}
- FreeBSD 10: Bailout called.  Further testing stopped:  Timeout waiting for replication to
finish


Not the news at all.

Then, I applied small patch to measure the response time:

{code}
diff --git a/src/couch_replicator/src/couch_replicator_httpc.erl b/src/couch_replicator/src/couch_replicator_httpc.erl
index b8fb31b..ff90456 100644
--- a/src/couch_replicator/src/couch_replicator_httpc.erl
+++ b/src/couch_replicator/src/couch_replicator_httpc.erl
@@ -61,8 +61,11 @@ send_ibrowse_req(#httpdb{headers = BaseHeaders} = HttpDb, Params) ->
         lists:ukeymerge(1, get_value(ibrowse_options, Params, []),
             HttpDb#httpdb.ibrowse_options)
     ],
-     Response = ibrowse:send_req_direct(
-         Worker, Url, Headers2, Method, Body, IbrowseOptions, infinity),
+    {_, E, _} = erlang:now(),
+    ?LOG_INFO("Made request ~p ~p in ~p secs ~n", [Method, Url, E - S]),
     {Worker, Response}.

{code}

Since the most heavies requests are attachments uploading, let's filter the output:

{code}
$ ./test/etap/run src/couch_replicator/test/04-replication-large-atts.t | egrep "Made request
(post|put)"
{code}

FreeBSD 9.1:

{code}
[info] [<0.345.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_a/_ensure_full_commit"
in 0.101148 secs 
[info] [<0.208.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_a/_local/ed1e1b8b6cc2aea9eaeb935affe17fdd"
in 0.194595 secs 
[info] [<0.351.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_a/_ensure_full_commit"
in 0.115996 secs 
[info] [<0.208.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_a/_local/ed1e1b8b6cc2aea9eaeb935affe17fdd"
in 0.19801 secs 
[info] [<0.376.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.198322 secs 
[info] [<0.379.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.196964 secs 
[info] [<0.378.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.197843 secs 
[info] [<0.377.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.198358 secs 
[info] [<0.377.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc8?new_edits=false"
in 0.995802 secs 
[info] [<0.378.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc7?new_edits=false"
in 0.996697 secs 
[info] [<0.376.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc9?new_edits=false"
in 0.998658 secs 
[info] [<0.379.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc6?new_edits=false"
in 0.998032 secs 
[info] [<0.377.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.198479 secs 
[info] [<0.377.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc5?new_edits=false"
in 0.328846 secs 
[info] [<0.377.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc4?new_edits=false"
in 0.329292 secs 
[info] [<0.377.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc3?new_edits=false"
in 0.339167 secs 
[info] [<0.377.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc2?new_edits=false"
in 0.329193 secs 
[info] [<0.377.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc11?new_edits=false"
in 0.329162 secs 
[info] [<0.377.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc10?new_edits=false"
in 0.329075 secs 
[info] [<0.377.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc1?new_edits=false"
in 0.329311 secs 
[info] [<0.368.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_ensure_full_commit"
in 0.099283 secs 
[info] [<0.368.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/_local/76054f0aaccade46a4806b08a211bc16"
in 0.198862 secs 
[info] [<0.462.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.199692 secs 
[info] [<0.466.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.198908 secs 
[info] [<0.464.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.199588 secs 
[info] [<0.468.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.199018 secs 
[info] [<0.500.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc9?new_edits=false"
in 2.522225 secs 
[info] [<0.510.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc6?new_edits=false"
in 2.514634 secs 
[info] [<0.506.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc7?new_edits=false"
in 2.51673 secs 
[info] [<0.505.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc8?new_edits=false"
in 2.517766 secs 
[info] [<0.462.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_revs_diff"
in 0.197848 secs 
[info] [<0.616.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_a/_ensure_full_commit"
in 0.108526 secs 
[info] [<0.450.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_ensure_full_commit"
in 0.107361 secs 
[info] [<0.450.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_a/_local/e8009d39fc94c9267bf51e9863c46895"
in 0.202483 secs 
[info] [<0.450.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/_local/e8009d39fc94c9267bf51e9863c46895"
in 0.197966 secs 
[info] [<0.570.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc4?new_edits=false"
in 4.332465 secs 
[info] [<0.567.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc5?new_edits=false"
in 4.33544 secs 
[info] [<0.578.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc2?new_edits=false"
in 4.335835 secs 
[info] [<0.574.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc3?new_edits=false"
in 4.33976 secs 
[info] [<0.588.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc1?new_edits=false"
in 4.328189 secs 
[info] [<0.585.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc10?new_edits=false"
in 4.329808 secs 
[info] [<0.582.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/doc11?new_edits=false"
in 4.331474 secs 
[info] [<0.625.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_a/_ensure_full_commit"
in 0.099479 secs 
[info] [<0.450.0>] Made request post "http://127.0.0.1:64967/couch_test_rep_db_b/_ensure_full_commit"
in 0.099653 secs 
[info] [<0.450.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_a/_local/e8009d39fc94c9267bf51e9863c46895"
in 0.198995 secs 
[info] [<0.450.0>] Made request put "http://127.0.0.1:64967/couch_test_rep_db_b/_local/e8009d39fc94c9267bf51e9863c46895"
in 0.199502 secs 

{code}

Nothing looks criminal. 

FreeBSD 10:

{code}
# Creating target database
# Triggering replication
[info] [<0.309.0>] 127.0.0.1 - - HEAD /couch_test_rep_db_b/ 200
[info] [<0.370.0>] Made request head "http://127.0.0.1:11166/couch_test_rep_db_b/" in
0.004577 secs 
[info] [<0.309.0>] 127.0.0.1 - - GET /couch_test_rep_db_b/ 200
[info] [<0.370.0>] Made request get "http://127.0.0.1:11166/couch_test_rep_db_b/" in
0.100691 secs 
[info] [<0.309.0>] 127.0.0.1 - - GET /couch_test_rep_db_b/_local/2eaa8e3e950d010d82b20eff81e9f651
404
[info] [<0.370.0>] Made request get "http://127.0.0.1:11166/couch_test_rep_db_b/_local/2eaa8e3e950d010d82b20eff81e9f651"
in 0.098708 secs 
[info] [<0.309.0>] 127.0.0.1 - - GET /couch_test_rep_db_b/_local/38bbd51d509064584b986ecfc71e443a
404
[info] [<0.370.0>] Made request get "http://127.0.0.1:11166/couch_test_rep_db_b/_local/38bbd51d509064584b986ecfc71e443a"
in 0.099318 secs 
[info] [<0.309.0>] 127.0.0.1 - - GET /couch_test_rep_db_b/_local/eaecd74f7be532d2b82aa4c137cc0b11
404
[info] [<0.370.0>] Made request get "http://127.0.0.1:11166/couch_test_rep_db_b/_local/eaecd74f7be532d2b82aa4c137cc0b11"
in 0.099245 secs 
[info] [<0.370.0>] Replication `"2eaa8e3e950d010d82b20eff81e9f651"` is using:
	4 worker processes
	a worker batch size of 500
	20 HTTP connections
	a connection timeout of 30000 milliseconds
	10 retries per request
	socket options are: [{keepalive,true},{nodelay,false}]
[info] [<0.2.0>] starting new replication `2eaa8e3e950d010d82b20eff81e9f651` at <0.370.0>
(`couch_test_rep_db_a` -> `http://127.0.0.1:11166/couch_test_rep_db_b/`)
[info] [<0.309.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.307.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.310.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.302.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.378.0>] Made request post "http://127.0.0.1:11166/couch_test_rep_db_b/_revs_diff"
in 0.198793 secs 
[info] [<0.380.0>] Made request post "http://127.0.0.1:11166/couch_test_rep_db_b/_revs_diff"
in 0.200831 secs 
[info] [<0.379.0>] Made request post "http://127.0.0.1:11166/couch_test_rep_db_b/_revs_diff"
in 0.20114 secs 
[info] [<0.381.0>] Made request post "http://127.0.0.1:11166/couch_test_rep_db_b/_revs_diff"
in 0.20127 secs 
[info] [<0.310.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc8?new_edits=false 201
[info] [<0.309.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc9?new_edits=false 201
[info] [<0.302.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc6?new_edits=false 201
[info] [<0.307.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc7?new_edits=false 201
[info] [<0.378.0>] Made request put "http://127.0.0.1:11166/couch_test_rep_db_b/doc9?new_edits=false"
in 116.229091 secs 
[info] [<0.379.0>] Made request put "http://127.0.0.1:11166/couch_test_rep_db_b/doc8?new_edits=false"
in 116.223251 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:11166/couch_test_rep_db_b/doc6?new_edits=false"
in 116.226021 secs 
[info] [<0.380.0>] Made request put "http://127.0.0.1:11166/couch_test_rep_db_b/doc7?new_edits=false"
in 116.227487 secs 
[info] [<0.310.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.378.0>] Made request post "http://127.0.0.1:11166/couch_test_rep_db_b/_revs_diff"
in 0.198398 secs 
[info] [<0.307.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_ensure_full_commit 201
[info] [<0.370.0>] Made request post "http://127.0.0.1:11166/couch_test_rep_db_b/_ensure_full_commit"
in 0.099793 secs 
[info] [<0.370.0>] recording a checkpoint for `couch_test_rep_db_a` -> `http://127.0.0.1:11166/couch_test_rep_db_b/`
at source update_seq 3
[info] [<0.307.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/_local/2eaa8e3e950d010d82b20eff81e9f651
201
[info] [<0.370.0>] Made request put "http://127.0.0.1:11166/couch_test_rep_db_b/_local/2eaa8e3e950d010d82b20eff81e9f651"
in 0.20571 secs 
[info] [<0.310.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc5?new_edits=false 201
[info] [<0.378.0>] Made request put "http://127.0.0.1:11166/couch_test_rep_db_b/doc5?new_edits=false"
in 115.208404 secs 
Bailout called.  Further testing stopped:  Timeout waiting for replication to finish
Bail out! Timeout waiting for replication to finish
FAILED--Further testing stopped: Timeout waiting for replication to finish
{code}


Hold on a second: 115 seconds for pushing single doc to localhost. Btw, on 1.6.x branch this
time equals to 120 secs so Nicks patch really works, but doesn't actually solves the issue.

Ok, to make testing more fair let's try use Erlang R15B03 on FreeBSD 10:

{code}
# Creating target database
# Triggering replication
[info] [<0.307.0>] 127.0.0.1 - - HEAD /couch_test_rep_db_b/ 200
[info] [<0.370.0>] Made request head "http://127.0.0.1:51750/couch_test_rep_db_b/" in
0.003978 secs 
[info] [<0.307.0>] 127.0.0.1 - - GET /couch_test_rep_db_b/ 200
[info] [<0.370.0>] Made request get "http://127.0.0.1:51750/couch_test_rep_db_b/" in
0.098668 secs 
[info] [<0.307.0>] 127.0.0.1 - - GET /couch_test_rep_db_b/_local/1fef0ca5eb7ea55b37727c12cbb474b6
404
[info] [<0.370.0>] Made request get "http://127.0.0.1:51750/couch_test_rep_db_b/_local/1fef0ca5eb7ea55b37727c12cbb474b6"
in 0.107729 secs 
[info] [<0.307.0>] 127.0.0.1 - - GET /couch_test_rep_db_b/_local/0344c3dab39f52994346ee2f286a109c
404
[info] [<0.370.0>] Made request get "http://127.0.0.1:51750/couch_test_rep_db_b/_local/0344c3dab39f52994346ee2f286a109c"
in 0.09909 secs 
[info] [<0.307.0>] 127.0.0.1 - - GET /couch_test_rep_db_b/_local/2c46e037f893bea49f746b1c89fb073b
404
[info] [<0.370.0>] Made request get "http://127.0.0.1:51750/couch_test_rep_db_b/_local/2c46e037f893bea49f746b1c89fb073b"
in 0.099404 secs 
[info] [<0.370.0>] Replication `"1fef0ca5eb7ea55b37727c12cbb474b6"` is using:
	4 worker processes
	a worker batch size of 500
	20 HTTP connections
	a connection timeout of 30000 milliseconds
	10 retries per request
	socket options are: [{keepalive,true},{nodelay,false}]
[info] [<0.2.0>] starting new replication `1fef0ca5eb7ea55b37727c12cbb474b6` at <0.370.0>
(`couch_test_rep_db_a` -> `http://127.0.0.1:51750/couch_test_rep_db_b/`)
[info] [<0.307.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.306.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.303.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.311.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.382.0>] Made request post "http://127.0.0.1:51750/couch_test_rep_db_b/_revs_diff"
in 0.19943 secs 
[info] [<0.379.0>] Made request post "http://127.0.0.1:51750/couch_test_rep_db_b/_revs_diff"
in 0.201037 secs 
[info] [<0.380.0>] Made request post "http://127.0.0.1:51750/couch_test_rep_db_b/_revs_diff"
in 0.200908 secs 

[info] [<0.381.0>] Made request post "http://127.0.0.1:51750/couch_test_rep_db_b/_revs_diff"
in 0.200642 secs 
[info] [<0.311.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc9?new_edits=false 201
[info] [<0.303.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc7?new_edits=false 201
[info] [<0.307.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc8?new_edits=false 201
[info] [<0.306.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc6?new_edits=false 201
[info] [<0.379.0>] Made request put "http://127.0.0.1:51750/couch_test_rep_db_b/doc9?new_edits=false"
in 115.210925 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:51750/couch_test_rep_db_b/doc7?new_edits=false"
in 115.212048 secs 
[info] [<0.382.0>] Made request put "http://127.0.0.1:51750/couch_test_rep_db_b/doc6?new_edits=false"
in 115.21568 secs 
[info] [<0.380.0>] Made request put "http://127.0.0.1:51750/couch_test_rep_db_b/doc8?new_edits=false"
in 115.216761 secs 
[info] [<0.311.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_revs_diff 200
[info] [<0.381.0>] Made request post "http://127.0.0.1:51750/couch_test_rep_db_b/_revs_diff"
in 0.198427 secs 
[info] [<0.307.0>] 127.0.0.1 - - POST /couch_test_rep_db_b/_ensure_full_commit 201
[info] [<0.370.0>] Made request post "http://127.0.0.1:51750/couch_test_rep_db_b/_ensure_full_commit"
in 0.107235 secs 
[info] [<0.370.0>] recording a checkpoint for `couch_test_rep_db_a` -> `http://127.0.0.1:51750/couch_test_rep_db_b/`
at source update_seq 2
[info] [<0.307.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/_local/1fef0ca5eb7ea55b37727c12cbb474b6
201
[info] [<0.370.0>] Made request put "http://127.0.0.1:51750/couch_test_rep_db_b/_local/1fef0ca5eb7ea55b37727c12cbb474b6"
in 0.206531 secs 
[info] [<0.311.0>] 127.0.0.1 - - PUT /couch_test_rep_db_b/doc5?new_edits=false 201
[info] [<0.381.0>] Made request put "http://127.0.0.1:51750/couch_test_rep_db_b/doc5?new_edits=false"
in 115.208751 secs 
Bailout called.  Further testing stopped:  Timeout waiting for replication to finish
Bail out! Timeout waiting for replication to finish
FAILED--Further testing stopped: Timeout waiting for replication to finish
{code}

Still the same picture. Ok, so the problem could be either in FreeBSD 10 default configuration
or in our code. Let's assume that it's the latter.

First, it looks like client (ibrowser) works too slow. On previous debug I found that [gen_tcp:send()|https://github.com/apache/couchdb/blob/master/src/ibrowse/ibrowse_http_client.erl#L560]
takes about 200 milliseconds to send chunk with 8192 bytes to server. Not really sure what
I tried to find by searching 8192 string in the source code, but I found that exactly the
same numbers are used by mochiweb to define read buffer chunk size. So that's the idea: what
will happens if I increase it?

{code}
diff --git a/src/mochiweb/internal.hrl b/src/mochiweb/internal.hrl
index 6db899a..338edca 100644
--- a/src/mochiweb/internal.hrl
+++ b/src/mochiweb/internal.hrl
@@ -1,3 +1,3 @@
 
--define(RECBUF_SIZE, 8192).
+-define(RECBUF_SIZE, 8192 * 4).

{code}


After the patch:

FreeBSD 9.1

{code}
[info] [<0.343.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_a/_ensure_full_commit"
in 0.098174 secs 
[info] [<0.206.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_a/_local/5852f11ad174ce93069e8d2f2da026fc"
in 0.197739 secs 
[info] [<0.347.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_a/_ensure_full_commit"
in 0.106469 secs 
[info] [<0.206.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_a/_local/5852f11ad174ce93069e8d2f2da026fc"
in 0.199129 secs 
[info] [<0.372.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.198386 secs 
[info] [<0.375.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.196966 secs 
[info] [<0.374.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.197865 secs 
[info] [<0.373.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.198386 secs 
[info] [<0.373.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc8?new_edits=false"
in 1.035908 secs 
[info] [<0.374.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc7?new_edits=false"
in 1.036799 secs 
[info] [<0.372.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc9?new_edits=false"
in 1.038673 secs 
[info] [<0.375.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc6?new_edits=false"
in 1.038161 secs 
[info] [<0.373.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.198448 secs 
[info] [<0.373.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc5?new_edits=false"
in 0.339051 secs 
[info] [<0.373.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc4?new_edits=false"
in 0.329205 secs 
[info] [<0.373.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc3?new_edits=false"
in 0.339112 secs 
[info] [<0.373.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc2?new_edits=false"
in 0.329233 secs 
[info] [<0.373.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc11?new_edits=false"
in 0.368982 secs 
[info] [<0.373.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc10?new_edits=false"
in 0.329167 secs 
[info] [<0.373.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc1?new_edits=false"
in 0.389066 secs 
[info] [<0.364.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_ensure_full_commit"
in 0.1194 secs 
[info] [<0.364.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/_local/85777ae3396849e65c872e08a7b42144"
in 0.198753 secs 
[info] [<0.459.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.199771 secs 
[info] [<0.463.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.198931 secs 
[info] [<0.461.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.199617 secs 
[info] [<0.465.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.198968 secs 
[info] [<0.496.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc9?new_edits=false"
in 2.842012 secs 
[info] [<0.505.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc6?new_edits=false"
in 2.835165 secs 
[info] [<0.504.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc7?new_edits=false"
in 2.836418 secs 
[info] [<0.500.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc8?new_edits=false"
in 2.83799 secs 
[info] [<0.459.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_revs_diff"
in 0.197864 secs 
[info] [<0.446.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_ensure_full_commit"
in 0.106626 secs 
[info] [<0.613.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_a/_ensure_full_commit"
in 0.108089 secs 
[info] [<0.446.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_a/_local/91cabaf2492f2bd2a4ca92a53f02ca2f"
in 0.203431 secs 
[info] [<0.446.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/_local/91cabaf2492f2bd2a4ca92a53f02ca2f"
in 0.19678 secs 
[info] [<0.574.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc2?new_edits=false"
in 4.516424 secs 
[info] [<0.570.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc3?new_edits=false"
in 4.520756 secs 
[info] [<0.566.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc4?new_edits=false"
in 4.523575 secs 
[info] [<0.563.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc5?new_edits=false"
in 4.527971 secs 
[info] [<0.584.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc1?new_edits=false"
in 4.510487 secs 
[info] [<0.581.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc10?new_edits=false"
in 4.512293 secs 
[info] [<0.578.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/doc11?new_edits=false"
in 4.522901 secs 
[info] [<0.622.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_a/_ensure_full_commit"
in 0.099445 secs 
[info] [<0.446.0>] Made request post "http://127.0.0.1:14927/couch_test_rep_db_b/_ensure_full_commit"
in 0.099696 secs 
[info] [<0.446.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_a/_local/91cabaf2492f2bd2a4ca92a53f02ca2f"
in 0.198469 secs 
[info] [<0.446.0>] Made request put "http://127.0.0.1:14927/couch_test_rep_db_b/_local/91cabaf2492f2bd2a4ca92a53f02ca2f"
in 0.199494 secs 
{code}

Nothing changed at all.

FreeBSD 10:

{code}
$ ./test/etap/run src/couch_replicator/test/04-replication-large-atts.t
src/couch_replicator/test/04-replication-large-atts.t .. 
...
All tests successful.
Files=1, Tests=1192, 47 wallclock secs ( 0.43 usr  0.09 sys + 33.72 cusr 13.70 csys = 47.95
CPU)
Result: PASS
{code}
    
*Whoa!* 

Part of output with POST/PUT requests:

{code}
info] [<0.349.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_a/_ensure_full_commit"
in 0.094218 secs 
[info] [<0.205.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_a/_local/05e6f7426223b3f0da8371efda4fa1a6"
in 0.198086 secs 
[info] [<0.352.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_a/_ensure_full_commit"
in 0.109214 secs 
[info] [<0.205.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_a/_local/05e6f7426223b3f0da8371efda4fa1a6"
in 0.198422 secs 
[info] [<0.378.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.195682 secs 
[info] [<0.379.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.195524 secs 
[info] [<0.381.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.195412 secs 
[info] [<0.380.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.195646 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc6?new_edits=false"
in 0.706663 secs 
[info] [<0.380.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc7?new_edits=false"
in 0.754916 secs 
[info] [<0.379.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc8?new_edits=false"
in 0.755129 secs 
[info] [<0.378.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc9?new_edits=false"
in 0.871189 secs 
[info] [<0.381.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.203772 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc5?new_edits=false"
in 0.378326 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc4?new_edits=false"
in 0.409232 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc3?new_edits=false"
in 0.418802 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc2?new_edits=false"
in 0.399084 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc11?new_edits=false"
in 0.404278 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc10?new_edits=false"
in 0.323305 secs 
[info] [<0.381.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc1?new_edits=false"
in 0.359326 secs 
[info] [<0.369.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_ensure_full_commit"
in 0.108647 secs 
[info] [<0.369.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/_local/76d88aed4242930e5cc84e6f22fb5cc7"
in 0.206168 secs 
[info] [<0.465.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.199665 secs 
[info] [<0.467.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.199372 secs 
[info] [<0.471.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.199052 secs 
[info] [<0.469.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.199648 secs 
[info] [<0.502.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc9?new_edits=false"
in 1.746967 secs 
[info] [<0.507.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc8?new_edits=false"
in 1.879406 secs 
[info] [<0.513.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc7?new_edits=false"
in 1.890523 secs 
[info] [<0.465.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_revs_diff"
in 0.203363 secs 
[info] [<0.506.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc6?new_edits=false"
in 1.993508 secs 
[info] [<0.623.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_a/_ensure_full_commit"
in 0.100955 secs 
[info] [<0.452.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_ensure_full_commit"
in 0.101679 secs 
[info] [<0.452.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_a/_local/f3b6cede324c784647b987424bb0aa7c"
in 0.23693 secs 
[info] [<0.452.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/_local/f3b6cede324c784647b987424bb0aa7c"
in 0.191251 secs 
[info] [<0.579.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc1?new_edits=false"
in 3.288234 secs 
[info] [<0.569.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc5?new_edits=false"
in 3.306066 secs 
[info] [<0.590.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc11?new_edits=false"
in 3.319111 secs 
[info] [<0.572.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc4?new_edits=false"
in 3.343141 secs 
[info] [<0.576.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc2?new_edits=false"
in 3.352749 secs 
[info] [<0.584.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc10?new_edits=false"
in 3.480288 secs 
[info] [<0.594.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/doc3?new_edits=false"
in 3.478705 secs 
[info] [<0.452.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_b/_ensure_full_commit"
in 0.109566 secs 
[info] [<0.628.0>] Made request post "http://127.0.0.1:27440/couch_test_rep_db_a/_ensure_full_commit"
in 0.109595 secs 
[info] [<0.452.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_a/_local/f3b6cede324c784647b987424bb0aa7c"
in 0.198072 secs 
[info] [<0.452.0>] Made request put "http://127.0.0.1:27440/couch_test_rep_db_b/_local/f3b6cede324c784647b987424bb0aa7c"
in 0.199271 secs 
{code}

Looks quite like as expected. But why?

[~paul.joseph.davis] suggested me to track down the place where {{?RECVBUF_SIZE}} matters
- that's actually socket base options [definition|https://github.com/apache/couchdb/blob/master/src/mochiweb/mochiweb_socket_server.erl#L168]
- and wrap with timer {{gen_tcp:recv/3}} calls for mochiweb. The result was pretty interesting.

Patch for logging:

{code}
diff --git a/src/ibrowse/ibrowse_http_client.erl b/src/ibrowse/ibrowse_http_client.erl
index a1cf6eb..6d05f29 100644
--- a/src/ibrowse/ibrowse_http_client.erl
+++ b/src/ibrowse/ibrowse_http_client.erl
@@ -557,7 +557,11 @@ do_send(Req, #state{socket = Sock,
                     use_http_proxy = true,
                     proxy_tunnel_setup = Pts}) when Pts /= done ->  gen_tcp:send(Sock,
Req);
 do_send(Req, #state{socket = Sock, is_ssl = true})  ->  ssl:send(Sock, Req);
-do_send(Req, #state{socket = Sock, is_ssl = false}) ->  gen_tcp:send(Sock, Req).
+do_send(Req, #state{socket = Sock, is_ssl = false}) ->  
+  {Delay, Value} = timer:tc(gen_tcp, send, [Sock, Req]),
+  io:fwrite("Sent ~p bytes in ~p sec ~n", [size(iolist_to_binary(Req)), Delay / 1000000]),
+  Value.
+
 
 %% @spec do_send_body(Sock::socket_descriptor(), Source::source_descriptor(), IsSSL::boolean())
-> ok | error()
 %% source_descriptor() = fun_arity_0           |

diff --git a/src/mochiweb/mochiweb_socket.erl b/src/mochiweb/mochiweb_socket.erl
index ad27204..8917f06 100644
--- a/src/mochiweb/mochiweb_socket.erl
+++ b/src/mochiweb/mochiweb_socket.erl
@@ -44,7 +44,9 @@ after_accept(_Socket) -> ok.
 recv({ssl, Socket}, Length, Timeout) ->
     ssl:recv(Socket, Length, Timeout);
 recv(Socket, Length, Timeout) ->
-    gen_tcp:recv(Socket, Length, Timeout).
+    {Time, Value} = timer:tc(gen_tcp, recv, [Socket, Length, Timeout]),
+    io:fwrite("  mochiweb_socket:recv / gen_tcp:recv(Socket, ~p, ~p) -- ~p sec ~n", [Length,
Timeout, Time / 1000000]),
+    Value.
 
 send({ssl, Socket}, Data) ->
     ssl:send(Socket, Data);
{code}


And here is the output:

- [Factor 1 (RECBUF == 8192)| https://www.friendpaste.com/7IH6C4z31Lu5ZvCpz3svZa]
- [Factor 2 (RECBUF == 16384)| https://www.friendpaste.com/7IH6C4z31Lu5ZvCpz3sug7]
- [Factor 4 (RECBUF == 32768)| https://www.friendpaste.com/7fmjcrJpF3BobxbzNpapEe] - blindly
fast
- [Factor 1/2 (RECBUF == 4096)| https://www.friendpaste.com/7fmjcrJpF3BobxbzNpaniM] - recbuf
== read chunk size

So, actually this problem could be easily solved by changing [socket options|http://docs.couchdb.org/en/latest/config/http.html#httpd/socket_options]
in config file with adding {{\{recbuf,32768\}}} value to the list (or by using one from example
without patching anything.

But this looks as workaround, isn't so? The question why so is remains opened.

> 04-replication-large-atts.t times out
> -------------------------------------
>
>                 Key: COUCHDB-1986
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1986
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 1.5.0
>            Reporter: Jan Lehnardt
>
> 04-replication-large-atts.t gets stuck around 558, sometimes a little earlier or later,
but it times out eventually, regardless of the timeout. I tried doubling and such.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message