On Tue, Jan 20, 2009 at 5:07 PM, Emmanuel Lecharny <elecharny@gmail.com> wrote:

What about performance? It is obviously terrible... If you think about a server A on which you have injected one million entries, trying to replicate with a server B on which a single entry has been modified, we will have to:
- revert the million entries on A
- revert the modification on B
- transmit the million entries from A to B
- transmit the entry from B to A
- order the million and one entries on both servers
- inject the million and one entries on A and B.
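The full revert-and-replay scheme above can be sketched in a few lines; this is a minimal illustration, not ApacheDS code, and the event shape (timestamp, DN, entry) is an assumption made here for clarity:

```python
# Hypothetical sketch of the naive "revert everything, merge, replay" step.
# Each server keeps an ordered change log of (timestamp, dn, entry) add events;
# reconciliation merges both full logs and re-injects every event on each side.
# All names are illustrative, not ApacheDS APIs.

def naive_reconcile(log_a, log_b):
    """Merge both complete logs, order by timestamp, and replay from scratch."""
    merged = sorted(log_a + log_b)      # order the million-and-one events
    state = {}
    for _ts, dn, entry in merged:       # re-inject every event on each server
        state[dn] = entry
    return state                        # identical state rebuilt on A and B

# A holds a million adds (truncated to 5 here), B holds a single one.
log_a = [(ts, f"cn=user{ts}", {"cn": f"user{ts}"}) for ts in range(5)]
log_b = [(99, "cn=solo", {"cn": "solo"})]

state = naive_reconcile(log_a, log_b)
```

Every one of A's entries gets reverted and replayed just to fold in B's single event, which is exactly the waste being complained about.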

Obviously, we will have to transmit the million entries to B and apply them, but why should we revert and re-apply them on A? This is just a waste...

How can we improve this?

One idea might be to send batches of replication events instead of one event at a time after the servers connect.
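As a rough sketch of that batching idea, pending events could be grouped so each network exchange carries many events; `BATCH_SIZE` and the event tuples below are assumptions for illustration, not an existing protocol:

```python
# Illustrative sketch of batched replication: instead of shipping one event
# per round trip, pending events are grouped into fixed-size batches.
# BATCH_SIZE and the event shape are assumptions made for this example.

BATCH_SIZE = 100

def batches(events, size=BATCH_SIZE):
    """Yield pending replication events in chunks of `size`."""
    for i in range(0, len(events), size):
        yield events[i:i + size]

def replicate(pending, apply_batch):
    """Ship each batch in a single exchange instead of one event at a time."""
    round_trips = 0
    for batch in batches(pending):
        apply_batch(batch)
        round_trips += 1
    return round_trips

pending = [("add", f"cn=user{i}") for i in range(250)]
applied = []
trips = replicate(pending, applied.extend)
```

With 250 pending events and a batch size of 100, all events travel in 3 exchanges rather than 250.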

1) Some entries are added on A and on B.
If any of those entries are 'colliding' (i.e. the intersection is not empty), we have to manage some conflict. We have two different cases:
- both entries are the same: that's just fine, we keep the older one.

You should keep the newer one, I think, because the second add would have failed, and you want the timestamps and creator information to match the first add.
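Either policy in this exchange (keep the older entry, keep the newer one) can be expressed as the same comparison with a different winner; the sketch below is hypothetical, with invented entry dicts and a `keep` parameter standing in for whichever policy the project settles on:

```python
# Sketch of add/add conflict resolution between two replicas. Each entry
# carries a createTimestamp and creator; the surviving entry is picked by
# comparing timestamps. Names are illustrative, not an ApacheDS API.

def resolve_add_conflict(entry_a, entry_b, keep="first"):
    """Return the surviving entry for two adds of the same DN.

    keep="first"  -> the older createTimestamp wins
    keep="latest" -> the newer createTimestamp wins
    The choice of `keep` is exactly the policy being debated above.
    """
    older, newer = sorted((entry_a, entry_b),
                          key=lambda e: e["createTimestamp"])
    return older if keep == "first" else newer

a = {"dn": "cn=jdoe", "createTimestamp": "20090120170700Z", "creator": "A"}
b = {"dn": "cn=jdoe", "createTimestamp": "20090120170705Z", "creator": "B"}

first_wins = resolve_add_conflict(a, b)
latest_wins = resolve_add_conflict(a, b, keep="latest")
```

Whichever policy is chosen, the key point is that it must be deterministic, so both servers converge on the same entry without further exchanges.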