couchdb-dev mailing list archives

From "Randall Leeds (JIRA)" <j...@apache.org>
Subject [jira] Updated: (COUCHDB-767) do a non-blocking file:sync
Date Tue, 08 Jun 2010 06:58:13 GMT

     [ https://issues.apache.org/jira/browse/COUCHDB-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Randall Leeds updated COUCHDB-767:
----------------------------------

    Attachment: async_fsync.patch

Here's my attempt to combine Adam's patch with my patch from COUCHDB-786. In this approach,
couch_file exports a sync_file/1 that takes a path instead of a file descriptor. Unlike Adam's
patch, the From tag received by handle_call(sync, From, File) is not passed to sync_file/1;
instead, the spawned fun calls sync_file/1 and replies to From with its result. The desirable
consequence is that sync_file/1 can be called without going through a gen_server handler, so
couch_db_updater:commit_data/2 can call couch_file:sync_file/1 directly,
bypassing the gen_server:call operations that COUCHDB-786 was trying to avoid.
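
A minimal sketch of the shape described above, not the attached patch itself. The module name
async_sync_sketch, the start_link/1 and sync/1 helpers, and the error-handling branch are
hypothetical and only there to make the example self-contained; the [read, raw] open mode
follows the sketch quoted in the issue description.

```erlang
-module(async_sync_sketch).
-behaviour(gen_server).

%% Hypothetical stand-in for couch_file; only the sync path is shown.
-export([start_link/1, sync/1, sync_file/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(file, {name}).

start_link(Path) ->
    gen_server:start_link(?MODULE, Path, []).

%% Sync through the gen_server; the caller blocks, the server does not.
sync(Pid) ->
    gen_server:call(Pid, sync, infinity).

%% Takes a path, opens its own descriptor, and returns the result of
%% file:sync/1. It needs no From tag, so any process can call it directly.
sync_file(Path) ->
    case file:open(Path, [read, raw]) of
        {ok, Fd} ->
            Result = file:sync(Fd),
            ok = file:close(Fd),
            Result;
        {error, _} = Error ->
            Error
    end.

init(Path) ->
    {ok, #file{name = Path}}.

%% The spawned fun owns the reply, so the server returns immediately
%% and can keep serving reads while the sync is in flight.
handle_call(sync, From, #file{name = Name} = File) ->
    spawn_link(fun() -> gen_server:reply(From, sync_file(Name)) end),
    {noreply, File}.

handle_cast(_Msg, File) -> {noreply, File}.
handle_info(_Msg, File) -> {noreply, File}.
```

A caller that already knows the path, as commit_data/2 would, can use
async_sync_sketch:sync_file(Path) and skip the gen_server round trip entirely, while other
clients still get the same blocking semantics from async_sync_sketch:sync(Pid).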

I think this is a win-win, but I agree with Adam about testing. I would welcome any comprehensive
performance suite that could be run against these changes to get some detailed statistics.

> do a non-blocking file:sync
> ---------------------------
>
>                 Key: COUCHDB-767
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-767
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>    Affects Versions: 0.11
>            Reporter: Adam Kocoloski
>             Fix For: 1.1
>
>         Attachments: 767-async-fsync.patch, async_fsync.patch
>
>
> I've been taking a close look at couch_file performance in our production systems.  One
> of the things I've noticed is that reads are occasionally blocked for a long time by a slow call
> to file:sync.  I think this is unnecessary.  I think we could do something like
> handle_call(sync, From, #file{name=Name}=File) ->
>     spawn_link(fun() -> sync_file(Name, From) end),
>     {noreply, File};
> and then
> sync_file(Name, From) ->
>     {ok, Fd} = file:open(Name, [read, raw]),
>     gen_server:reply(From, file:sync(Fd)),
>     file:close(Fd).
> Does anyone see a downside to this?  Individual clients of couch_file still see exactly
> the same behavior as before, only readers are not blocked by syncs initiated in the db_updater
> process.  When data needs to be flushed file:sync is _much_ slower than spawning a local process
> and opening the file again --  in the neighborhood of 1000x slower even on Linux with its
> less-than-durable use of vanilla fsync.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

