couchdb-dev mailing list archives

From "Geir Magnusson Jr." <>
Subject Re: Faster updates, optional ACID
Date Mon, 05 Jan 2009 23:46:42 GMT

On Jan 5, 2009, at 3:04 PM, Damien Katz wrote:

> On Jan 5, 2009, at 2:51 PM, Geir Magnusson Jr. wrote:
>> On Jan 5, 2009, at 2:32 PM, Damien Katz wrote:
>>> If necessary and possible, we'll patch the Erlang VM.
>> That seems like a bad idea to me - I'd think you'd want to stay out  
>> of the VM business.
> No, I mean send patches to the maintainers of Erlang to fix any  
> problems on their supported platforms.  Just like the F_FULLFSYNC  
> patch.

Ah.  Whew :)

>>> But if a platform doesn't support proper flushing, then it's not a  
>>> platform that can support an ACID database.
>> We're not communicating well here.
>> "proper flushing" depends on what you want to do - if you need your
>> data to be in confirmed permanent storage so that it can survive a
>> crash or power cut, then w/o special configuration (e.g. battery-backed
>> RAID), I don't think that you're going to get assurance on linux.
>> Do you see what I'm saying?
> Yes I see what you are saying. Can you show that Linux doesn't  
> actually safely push the bits to disk in popular distros? If that's  
> the case, then we need to find the APIs that actually work and call  
> them, and if they don't work, we don't support Linux.

It pushes the bits to the disk drive, but that's where its sphere of
effect ends - what the drive does after that is drive specific.
Drives cache writes to aggregate them, or write things out of order
based on head location, etc.

This isn't something that only affects Couch.

So I would say that .... it's time to relax.

Take the approach that you have a few modes

  a) fsync() mode - for people that care about true durability, it's
up to them to get or configure drives to behave right, or whatever
  b) the delayed write mode so that you can do things like aggregate
writes into clocked fsyncs or something  (I'd use this - I'll trade
some durability for the performance)
  c) and for platforms that offer special modes that really do
guarantee the write all the way to the physical media, like OS X's
fcntl(F_FULLFSYNC), make that an option too.
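
A minimal sketch of those three modes, in Python rather than Erlang and purely
for illustration. The names `Durability`, `flush`, and `Committer` are made up
here and are not CouchDB code. `os.fsync` and `fcntl.F_FULLFSYNC` are real, but
the latter is only defined on OS X, so the sketch assumes a fallback to plain
fsync() everywhere else.

```python
import enum
import fcntl
import os
import time


class Durability(enum.Enum):
    FSYNC = "fsync"      # mode (a): fsync() on every write
    DELAYED = "delayed"  # mode (b): aggregate writes, fsync on a clock
    FULL = "full"        # mode (c): platform-specific flush to the media


def flush(fd, mode):
    """Flush fd as strongly as `mode` asks for and the platform allows."""
    full = getattr(fcntl, "F_FULLFSYNC", None)  # only defined on OS X
    if mode is Durability.FULL and full is not None:
        fcntl.fcntl(fd, full)
        return "F_FULLFSYNC"
    os.fsync(fd)  # assumed fallback; the drive may still cache the write
    return "fsync"


class Committer:
    """Hypothetical writer that applies one of the three modes."""

    def __init__(self, fd, mode, interval=1.0):
        self.fd = fd
        self.mode = mode
        self.interval = interval  # seconds between flushes in DELAYED mode
        self.last_flush = time.monotonic()
        self.dirty = False

    def write(self, data):
        """Write data; return True if it was flushed before returning."""
        os.write(self.fd, data)
        self.dirty = True
        delayed = self.mode is Durability.DELAYED
        if delayed and time.monotonic() - self.last_flush < self.interval:
            return False  # left for a later, aggregated flush
        self._flush()
        return True

    def close(self):
        """Flush anything a delayed write left behind."""
        if self.dirty:
            self._flush()

    def _flush(self):
        flush(self.fd, self.mode)
        self.last_flush = time.monotonic()
        self.dirty = False
```

In the delayed mode a write that returns False can be lost in a crash until the
next flush, which is exactly the trade being offered.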

>>>> why not make it a config option, so that the db admin can choose  
>>>> the durability level in general, and let clients that know they  
>>>> are talking to couch override w/ a header?
>>> Definitely, I think commit options should be settable per-database.
>>> But for now I was just wanting to address the slowdown,
>>> especially for replication and the tests, to keep everyone
>>> productive. More commit features and options are lower priority
>>> work for now, I was just addressing the most serious slowdown.
>> That makes sense, but IMO you papered over the root problem.
>> It's good to keep people working, but I think the issue deserves a  
>> look.  I don't know erlang, or I would look myself.
> What issue? Why do you think this is Erlang specific?

Oh - this is a SWAG based on one data point  :)  [it was a rough day -  
I didn't get to try to duplicate the results found yesterday...]

It was reported that w/ the same up-to-date version of erlang, they
found a big performance difference between 0.8 and current trunk.  If
that's true, then it seems to me that something changed in the
filesystem handling in the CouchDB code itself - it could be that
there are multiple flush modes, and the 0.8 code used whatever
corresponds to fsync(), and trunk uses whatever corresponds to
fcntl(F_FULLFSYNC).  I don't know.  It's a guess.  But yesterday's
results are unexplained, and I hate mysteries.

I can't help with the erlang (I don't know it...), but I can at least  
try to reproduce the results...
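
Something along these lines might do as a first check. It is Python, not
Erlang, and it only times the raw flush calls on a temp file, so it says
nothing about what CouchDB or the Erlang VM actually call. The names
`time_flushes`, `plain_fsync`, and `full_fsync` are made up for this sketch,
and `full_fsync` is only attempted where `fcntl.F_FULLFSYNC` exists (OS X).

```python
import fcntl
import os
import tempfile
import time


def plain_fsync(fd):
    """Flush with fsync(): hands the data to the drive."""
    os.fsync(fd)


def full_fsync(fd):
    """Flush with F_FULLFSYNC: asks the drive to write to the media."""
    fcntl.fcntl(fd, fcntl.F_FULLFSYNC)


def time_flushes(flush, count=20, payload=b"x" * 4096):
    """Return the average seconds per write-plus-flush over `count` rounds."""
    with tempfile.NamedTemporaryFile() as f:
        fd = f.fileno()
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
            flush(fd)
        return (time.perf_counter() - start) / count


if __name__ == "__main__":
    print("fsync:       %.6f s per commit" % time_flushes(plain_fsync))
    if hasattr(fcntl, "F_FULLFSYNC"):
        print("F_FULLFSYNC: %.6f s per commit" % time_flushes(full_fsync))
    else:
        print("F_FULLFSYNC: not available on this platform")
```

A large gap between the two numbers on the same machine would at least be
consistent with the guess above; it would not prove it.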

