couchdb-dev mailing list archives

From Filipe David Manana <fdman...@apache.org>
Subject About possibly reverting COUCHDB-767
Date Sun, 07 Nov 2010 16:35:42 GMT
Hi,

Regarding the change introduced by the ticket:

https://issues.apache.org/jira/browse/COUCHDB-767

(opening the same file in a different process and calling fsync on the
new file descriptor from that process)

I found out that it's not a recommended practice. I posted the
following question to the ext4 development mailing list:

http://www.spinics.net/lists/linux-ext4/msg21388.html

Also, with this patch I verified (on Solaris, with the 'zpool iostat
1' command) that when running a write-only test with relaximation
(200 write processes), disk write activity is not continuous. Without
this patch, there's continuous write activity (every second). This
also makes performance comparison tests with relaximation much harder
to analyse, as the peak variation is much higher and not periodic.

To keep readers from getting blocked by fsync calls (and write calls),
I would propose using a separate couch_file process just for read
operations. I have a branch in my github for this (with COUCHDB-767
reverted). It needs to be polished, but the relaximation tests are
very positive: both reads and writes get better response times and
throughput:

https://github.com/fdmanana/couchdb/tree/2_couch_files_no_batch_reads

http://graphs.mikeal.couchone.com/#/graph/62b286fbb7aa55a4b0c4cc913c00e659
  (relaximation test)

The test does a direct comparison with trunk (also with COUCHDB-767
reverted) and was run like this:

$ node tests/compare_write_and_read.js --wclients 300 --rclients 150 \
    -name1 2_couch_files_no_batch_reads -name2 trunk \
    -url1 http://localhost:5984/ -url2 http://localhost:5985/ \
    --duration 120

This approach, of using a file descriptor just for reads and another
one just for writes (but both referring to the same file), seems to be
safe:

http://www.spinics.net/lists/linux-ext4/msg21429.html

Race conditions shouldn't happen, as each read call needs an offset,
and the offset is only known after the previous write call has finished.
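
To make that reasoning concrete, here is a minimal sketch in Python
(not the couch_file code, which is Erlang). It assumes a POSIX system,
and the names append_block, read_block and demo are hypothetical, made
up for this illustration. One descriptor is opened only for appends,
another only for reads, and a reader can only ask for an offset that a
completed append has returned:

```python
import os
import tempfile


def append_block(write_fd, data):
    """Append data through the write-only descriptor.

    Returns (offset, length). The offset is only handed out after the
    write and fsync have completed, so no reader can learn it earlier.
    Assumes a single writer, as with one couch_file writer process.
    """
    offset = os.lseek(write_fd, 0, os.SEEK_END)
    written = 0
    while written < len(data):  # os.write may write fewer bytes than asked
        written += os.write(write_fd, data[written:])
    os.fsync(write_fd)  # fsync on the same descriptor that did the write
    return offset, len(data)


def read_block(read_fd, offset, length):
    """Positional read through the read-only descriptor."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.pread(read_fd, remaining, offset)
        if not chunk:  # unexpected end of file
            break
        chunks.append(chunk)
        offset += len(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo():
    """Write two blocks, then read them back via the second descriptor."""
    path = os.path.join(tempfile.mkdtemp(), "demo.couch")  # hypothetical file
    write_fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    read_fd = os.open(path, os.O_RDONLY)
    try:
        first = append_block(write_fd, b"doc-1")
        second = append_block(write_fd, b"doc-2-longer")
        blocks = [read_block(read_fd, *first), read_block(read_fd, *second)]
        return first, second, blocks
    finally:
        os.close(write_fd)
        os.close(read_fd)
```

Because reads are positional (pread), the read descriptor keeps no
shared file position that the writer could disturb, and fsync is only
ever called on the descriptor that performed the writes.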


Thoughts on this? Should we revert COUCHDB-767? Integrate the 2
couch_files strategy into trunk? (only after 1.1 has its own branch)

cheers

-- 
Filipe David Manana,
fdmanana@gmail.com, fdmanana@apache.org

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."
