couchdb-user mailing list archives

From Bogdan Andu <bog...@gmail.com>
Subject Re: CouchDB: 2.0 & 1.6.1 database compatibility
Date Mon, 10 Oct 2016 09:54:52 GMT
Hi,

Here is an update:

I compacted db1 (CouchDB 1.6.1) on the source, and it went from 2.5 GB down to 350 MB
with 362849 documents.
I also compacted the views, but that made no difference in size.
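
For reference, database and view compaction in CouchDB 1.6.1 go through the HTTP calls sketched below. This is only a minimal sketch: the base URL and the design document name `myviews` are placeholders (assumptions, not taken from this setup), and the requests are built but not sent. The `disk_size` and `data_size` fields are the ones a 1.6.1 server reports in `GET /{db}`.

```python
import urllib.request

BASE_URL = "http://localhost:5984"  # assumption: default CouchDB port on localhost


def compact_db_request(db, base_url=BASE_URL):
    """Build (but do not send) POST /{db}/_compact."""
    return urllib.request.Request(
        f"{base_url}/{db}/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def compact_views_request(db, ddoc, base_url=BASE_URL):
    """Build POST /{db}/_compact/{ddoc}; ddoc is the name without "_design/"."""
    return urllib.request.Request(
        f"{base_url}/{db}/_compact/{ddoc}",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def wasted_fraction(db_info):
    """Share of the file not holding live data, from a GET /{db} response dict."""
    return 1 - db_info["data_size"] / db_info["disk_size"]
```

Passing one of these requests to `urllib.request.urlopen` would actually trigger the compaction; that needs admin credentials on the server, which the sketch leaves out.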

The database stores documents of the following form:

{
   "_id": "00006df04672a0c0e0da142ad8cd90b9",
   "_rev": "1-a14afd34d5a52e3f6ae515c9adcff2d3",
   "local_id": "110361",
   "email": "schwarzer-tee@tee.schwarz",
   "sent_date": "2007-06-29 12:20:31",
   "regtype": "n"
}

That is a huge difference, 2.5 GB versus 350 MB, and the
documents have only one revision each.

If Couch can shrink a database this much through compaction,
why can't it keep roughly the same size during
normal operation (there are no deletions and no updates, only insertions,
approx. 2000 insertions/day)?

Maybe the b-tree is optimized only during compaction, and not during
repeated insertions.
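
To make the copy-on-write explanation quoted further down more concrete, here is a toy model of an append-only file. It is not CouchDB's actual storage engine: the fanout, node size and document size are made-up parameters, and real CouchDB batches updates, so the numbers are only illustrative. The model assumes every single insert appends the document plus a fresh copy of each b-tree node on the root-to-leaf path, while the old copies stay in the file.

```python
import math


def tree_height(entries, fanout):
    """Levels needed so that a tree with `fanout` children per node holds `entries`."""
    height, capacity = 1, fanout
    while capacity < entries:
        capacity *= fanout
        height += 1
    return height


def append_only_file_size(n_docs, doc_bytes, node_bytes, fanout):
    """Bytes appended after n_docs single inserts (old nodes are never reclaimed)."""
    total = 0
    for count in range(1, n_docs + 1):
        # One new copy of every node on the path from root to leaf.
        total += doc_bytes + tree_height(count, fanout) * node_bytes
    return total


def live_file_size(n_docs, doc_bytes, node_bytes, fanout):
    """Bytes needed if only the documents and the final tree are kept."""
    nodes, level = 0, n_docs
    while level > 1:
        level = math.ceil(level / fanout)  # nodes needed one level up
        nodes += level
    return n_docs * doc_bytes + max(nodes, 1) * node_bytes


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    appended = append_only_file_size(10_000, doc_bytes=200, node_bytes=1000, fanout=50)
    live = live_file_size(10_000, doc_bytes=200, node_bytes=1000, fanout=50)
    print(f"appended: {appended} bytes, live: {live} bytes, ratio: {appended / live:.1f}x")
```

In this model the file grows with every insert even though nothing is updated or deleted, and only a rewrite of the live data (which is what compaction does) brings it back down.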

And for the sake of completeness:

after replication to CouchDB 2.0, the same database
(which, with views generated, took ~20 minutes for 362849 docs) comes to:

69.3 MB / 362849 documents
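
A replication like this one can be started through the `/_replicate` endpoint. The sketch below only builds the request. The URLs are assumptions: it supposes 1.6.1 on port 5984 and 2.0 listening on a different port (5985 here is a placeholder), as suggested earlier in the thread.

```python
import json
import urllib.request


def replicate_request(source_url, target_url, server_url):
    """Build (but do not send) POST /_replicate on `server_url`."""
    body = {"source": source_url, "target": target_url, "create_target": True}
    return urllib.request.Request(
        f"{server_url}/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoints: 1.6.1 as the source, 2.0 as the target.
req = replicate_request(
    "http://localhost:5984/db1",
    "http://localhost:5985/db1",
    "http://localhost:5985",
)
```

Sending `req` with `urllib.request.urlopen` would start a one-off replication; credentials, if the servers require them, are left out here.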

Now the big surprise is the huge difference in
size that resulted from compaction on 1.6.1.


To summarize:

     version   state                  size        docs
(1)  1.6.1     original               2.5 GB      362849
(2)  1.6.1     compacted              350 MB      362849
(3)  2.0       replicated (from (1))  69.3 MB     362849
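
The arithmetic behind these figures, as a small sketch. It assumes the sizes were reported in binary units (1 MB = 1024 * 1024 bytes); with decimal units the results shift slightly. Compaction on 1.6.1 shrinks the file by about 7x, and the 2.0 copy is about 5x smaller again.

```python
MB = 1024 * 1024  # assumption: sizes were reported in binary units
DOCS = 362849

sizes = {
    "1.6.1 original": 2.5 * 1024 * MB,
    "1.6.1 compacted": 350 * MB,
    "2.0 replicated": 69.3 * MB,
}


def bytes_per_doc(size_bytes, docs=DOCS):
    """Average file bytes per stored document."""
    return size_bytes / docs


def ratio(bigger, smaller):
    """How many times larger one reported size is than another."""
    return sizes[bigger] / sizes[smaller]


if __name__ == "__main__":
    for name, size in sizes.items():
        print(f"{name:16s} {bytes_per_doc(size):8.0f} bytes/doc")
    print(f"original / compacted: {ratio('1.6.1 original', '1.6.1 compacted'):.1f}x")
    print(f"compacted / 2.0:      {ratio('1.6.1 compacted', '2.0 replicated'):.1f}x")
```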


/Bogdan






On Fri, Oct 7, 2016 at 4:43 PM, Adam Kocoloski <kocolosk@apache.org> wrote:

> Lots of good questions there.
>
> On the storage size, note that even when you write only one revision of
> each document the database will accumulate some wasted space. Inserts to
> the database cause internal btree structures to be updated, and due to the
> copy-on-write nature of the storage engine the old btree nodes are left
> around in the file.
>
> We did make some changes in the compaction system that produce smaller
> files at the end of the day. You can read more about those changes here -
> https://blog.couchdb.org/2016/08/10/feature-compaction/ - but they don’t
> explain the difference that you reported. Perhaps you didn’t compact the
> source database at all?
>
> You are correct that both design documents and mango will build
> btree-based indexes to answer their queries. I would like to see us add
> functionality to mango over time so that it can cover the large majority of
> use cases where folks need to appeal to views in design documents, but
> we’re not quite there yet. One example where mango cannot help you today is
> reduce functions; if you want to aggregate the values in your index you
> need to drop down and build a view for that.
>
> In terms of performance, mango should be moderately faster at building an
> index because there’s no JavaScript roundtrip. Querying performance should
> be ~identical. Cheers,
>
> Adam
>
> > On Oct 7, 2016, at 7:56 AM, Thanos Vassilakis <thanosv@gmail.com> wrote:
> >
> > Good questions
> >
> > Sent from my iPhone
> >
> >> On Oct 7, 2016, at 5:29 AM, Bogdan Andu <bog495@gmail.com> wrote:
> >>
> >> I see the data management is totally different(and better).
> >> now there is a _dbs.couch for a registry-like database for databases
> >> and actual databases are located in data/shards subdirectories.
> >>
> >> so.. only replication works here..
> >> and one can replicate many databases in parallel.
> >>
> >> another difference I see is the size of databases.
> >>
> >> 2.0 version keep a very small size of databases compared to 1.6.1 version.
> >>
> >> Is there any change in storage engine that makes so big differences in
> >> database sizes?
> >>
> >> all records in db1 in 1.6.1 have only one revision like (1-...) format
> >>
> >> db1 in 1.6.1 is 2.5GB with 362849 records
> >> after replication:
> >> db1 in 2.0 has 69.3 MB with 362849 records
> >>
> >> when is recommended to use design documents and when mango queries.
> >> is mango intended to replace design documents although I assume both
> >> build a view tree for the query in question.
> >>
> >> which one is faster?
> >> what are the use-cases for each one of the query methods?
> >>
> >> Thanks,
> >>
> >> Bogdan
> >>
> >>
> >>
> >>> On Fri, Oct 7, 2016 at 11:20 AM, max <maxima078@gmail.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Install 2.0 version on another server or just make it listen on different
> >>> port than 1.6 then replicate your data ;)
> >>>
> >>> 2016-10-07 9:49 GMT+02:00 Bogdan Andu <bog495@gmail.com>:
> >>>
> >>>> Hello,
> >>>>
> >>>> I configured a single-node CouchDB 2.0 instance and
> >>>> I copied in data directory 1.6.1 couch databases.
> >>>>
> >>>> But the databases does not show up in Fauxton, only the
> >>>> test databases:
> >>>>
> >>>> ["_global_changes","_replicator","_users","verifytestdb"].
> >>>>
> >>>> Is there a way to make CouchDB 2.0 read 1.6.1 couch files
> >>>>
> >>>> without importing?
> >>>>
> >>>> /Bogdan
> >>>
>
>
