incubator-couchdb-user mailing list archives

From Mark Hahn <m...@hahnca.com>
Subject Re: Faster one-time replication doing file-system copy?
Date Tue, 18 Mar 2014 15:58:35 GMT
I copy live DBs all the time.  Just make sure you don't copy anything along
with the actual .db file.  Let the views rebuild.
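
A minimal sketch of that approach in Python, assuming a flat data directory whose database files end in `.db`. The function name and the directory arguments are hypothetical, not part of CouchDB; view index directories are simply never touched, so the destination rebuilds its views.

```python
import shutil
from pathlib import Path


def copy_db_files(src_dir, dst_dir):
    """Copy only top-level *.db files from src_dir to dst_dir.

    View index directories and everything else in src_dir are skipped,
    so the destination has to rebuild its views.
    Returns the sorted names of the files that were copied.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.glob("*.db")):
        # Skip directories that happen to match the pattern.
        if path.is_file():
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```

Databases whose names contain a slash live in subdirectories and would need a recursive variant of this sketch.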


On Tue, Mar 18, 2014 at 8:10 AM, Daniel Gonzalez <gonvaled@gonvaled.com> wrote:

> On Tue, Mar 18, 2014 at 1:48 PM, Stefan Klein <st.fankl.in@gmail.com>
> wrote:
>
> > Hi,
> >
> > from my understanding which might be wrong
> >
> > 2014-03-18 13:31 GMT+01:00 Daniel Gonzalez <gonvaled@gonvaled.com>:
> >
> > > Thanks Dave,
> > >
> > > Yes, I understand that I need to restart it. Actually, I assume that both
> > > source and destination must be stopped while copying:
> > > - source to avoid data changes while copying, which can lead to
> > > inconsistent data being copied
> > >
> >
> > A couchdb datafile is never inconsistent.
> > see https://wiki.apache.org/couchdb/How_to_make_filesystem_backups
>
>
> Mmm. But we are copying several databases / design_docs / view files, any
> of which can be changing while the long copy operation is running. It just
> seems easier to minimize any inconsistency risk to just stop the source.
> Even if a file is always consistent, how to guarantee that the different
> files are consistent with each other? I mean:
> 1) I copy a database
> 2) Just after the copy is finished, a new doc is appended
> 3) That doc is processed in one of the views
> 4) Now it is the turn to copy the view file
>
> Suddenly, the destination database has one view with one indexed document
> which is *not* in the database file. Maybe couchdb is going to get confused
> about it when I start it?
>
> > > - destination, so that two processes (cp and couchdb) are not stepping on
> > > each other by writing to the same files.
> > >
> >
> > That's not how filehandles work (at least on Linux)
> > open 4 terminals
> > terminal 1:
> > cd /tmp
> > cat > file
> > terminal 2:
> > tail -f /tmp/file
> > terminal 1:
> > enter some text
> > terminal 3:
> > rm /tmp/file
> > cat > /tmp/file
> > terminal 4:
> > tail -f /tmp/file
> > terminal 1:
> > enter some text
> > terminal 3:
> > enter some text
> >
> >
> > As long as the file handle is not closed and re-opened, it reads and writes
> > to the "old" file, although the "old" file doesn't have a directory entry
> > anymore.
> >
> > So it should be fine to just copy over the file and post to _restart after
> > the copy finished.
> >
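
The terminal session above can be replayed in a single script. This is only a sketch under stated assumptions: the function name is made up for illustration, the file lives in a temporary directory rather than /tmp/file, and the behaviour relies on POSIX unlink semantics (on Windows, removing an open file fails).

```python
import os
import tempfile


def unlinked_handle_demo():
    """Keep a handle open, then delete and recreate the path under it.

    Returns (text read via the old handle, text read from the path).
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file")

        # Terminal 1: open the file and write to it.
        old = open(path, "w+")
        old.write("old contents\n")
        old.flush()

        # Terminal 3: remove the directory entry, then create a new
        # file at the same path.
        os.remove(path)
        with open(path, "w") as new:
            new.write("new contents\n")

        # The old handle still refers to the original, now nameless, file.
        old.write("more old contents\n")
        old.flush()
        old.seek(0)
        via_old_handle = old.read()
        old.close()

        # Anything that opens the path now sees only the new file.
        with open(path) as fresh:
            via_path = fresh.read()

    return via_old_handle, via_path
```

Note that this models deleting and recreating the path, as in the session above. A copy that overwrites an existing file in place writes into the same file the running process already has open, so the two cases are not equivalent.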
>
> Fair enough, but that assumes that couchdb is not reopening files from the
> filesystem. How can I be sure of that, without being *intimately* familiar
> with the couchdb sources? Or is there any guarantee in the documentation
> that no file reopening is being performed? What about database /
> design_docs delete / create operations? It just seems *very* difficult to
> guarantee that, under any scenario, couchdb is not going to open a file
> from the filesystem, one which has just been created by the copy operation,
> and which makes the whole system behave badly.
>
> Stopping the destination instance (and thus not being able to use it, not
> even by mistake), seems like an easy way to guarantee that no problems are
> going to occur. We are talking about a test instance anyway, so downtime is
> not an issue.
>
>
> > regards,
> > Stefan
> >
>
> Thanks!
> Daniel
>
