couchdb-user mailing list archives

From Daniel Gonzalez <gonva...@gonvaled.com>
Subject Re: Faster one-time replication doing file-system copy?
Date Tue, 18 Mar 2014 15:10:19 GMT
On Tue, Mar 18, 2014 at 1:48 PM, Stefan Klein <st.fankl.in@gmail.com> wrote:

> Hi,
>
> from my understanding which might be wrong
>
> 2014-03-18 13:31 GMT+01:00 Daniel Gonzalez <gonvaled@gonvaled.com>:
>
> > Thanks Dave,
> >
> > Yes, I understand that I need to restart it. Actually, I assume that both
> > source and destination must be stopped while copying:
> > - source to avoid data changes while copying, which can lead to
> > inconsistent data being copied
> >
>
> A couchdb datafile is never inconsistent.
> see https://wiki.apache.org/couchdb/How_to_make_filesystem_backups


Hmm. But we are copying several databases, design docs, and view files, any
of which can change while the long copy operation is running. Stopping the
source just seems the easiest way to minimize any risk of inconsistency.
Even if each file is always consistent on its own, how can I guarantee that
the different files are consistent with each other? I mean:
1) I copy a database
2) Just after the copy is finished, a new doc is appended
3) That doc is processed in one of the views
4) Now it is the turn to copy the view file

Suddenly, the destination has a view that indexes a document which is *not*
in the database file. Might couchdb get confused by that when I start it?

> > - destination to avoid that two processes (cp and couchdb) are stepping
> > on each other, writing to the same files.
> >
>
> That's not how filehandles work (at least on Linux)
> open 4 terminals
> terminal 1:
> cd /tmp
> cat > file
> terminal 2:
> tail -f /tmp/file
> terminal 1:
> enter some text
> terminal 3:
> rm /tmp/file
> cat > /tmp/file
> terminal 4:
> tail -f /tmp/file
> terminal 1:
> enter some text
> terminal 3:
> enter some text
>
>
> as long as the filehandle is not closed and re-opened it reads and writes
> to the "old" file, although the "old" file doesn't have a directory entry
> anymore.
>
> So it should be fine to just copy over the file and post to _restart after
> the copy finished.
>
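
The four-terminal experiment above can be condensed into one script. This is
only a minimal sketch, assuming a POSIX system such as Linux (removing an open
file fails on Windows); the function name and the file contents are made up
for illustration, and nothing here touches couchdb itself.

```python
import os
import tempfile


def demo_unlinked_handle():
    """Replace a file by name while older handles to it stay open."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file")

        # "terminal 1": a long-lived writer, standing in for a running server
        old_writer = open(path, "w")
        old_writer.write("old-1\n")
        old_writer.flush()

        # "terminal 2": a reader opened before the file is replaced
        old_reader = open(path, "r")

        # "terminal 3": drop the directory entry, then create a new file
        # under the same name
        os.remove(path)
        with open(path, "w") as new_writer:
            new_writer.write("new-1\n")

        # the old writer still writes to the now-unlinked file
        old_writer.write("old-2\n")
        old_writer.flush()

        # the old reader sees only the old file's content
        old_view = old_reader.read()

        # "terminal 4": a handle opened after the replacement sees the new file
        with open(path) as fresh_reader:
            new_view = fresh_reader.read()

        old_writer.close()
        old_reader.close()
        return old_view, new_view


old_view, new_view = demo_unlinked_handle()
print(repr(old_view))
print(repr(new_view))
```

This covers only the case where the old file is removed and a new one is
created under the same name. A plain `cp` onto an existing path typically
truncates and rewrites the same file in place, which the sketch does not
demonstrate.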

Fair enough, but that assumes that couchdb never reopens files from the
filesystem. How can I be sure of that without being *intimately* familiar
with the couchdb sources? Or does the documentation guarantee that no file
reopening is performed? What about database / design doc delete / create
operations? It just seems *very* difficult to guarantee that, under any
scenario, couchdb will not open a file that the copy operation has just
created, making the whole system misbehave.

Stopping the destination instance (and thus not being able to use it, not
even by mistake) seems like an easy way to guarantee that no problems
occur. This is a test instance anyway, so downtime is not an issue.


> regards,
> Stefan
>

Thanks!
Daniel
