couchdb-user mailing list archives

From Stefan Klein <st.fankl...@gmail.com>
Subject Re: Keep the constant size of database
Date Fri, 17 Jun 2016 10:26:33 GMT
Hi,

2016-06-17 11:26 GMT+02:00 Anatoly Smolyaninov <zarkonesmall@gmail.com>:

> Hello!
>
> I’m gathering lots of data with couchdb for statistics. The map-reduce view
> approach gives the ability to quickly get pre-calculated data, and this is
> very handy since the data consumer software wants the data quickly and very
> often, in spite of the very high rate of new incoming stat metrics.
>
> But I’m having trouble keeping the database always the same size: I need
> only data for e.g. the last 3 hours, and everything older should be deleted.
>
> I have a special view which emits the timestamp. I query that view with
> startkey-endkey params set to 3 hours ago and then bulk-update these items
> back with the _deleted field. But what I currently see is that when I
> delete this way & clean up after that operation, the number of documents is
> reduced as I expected, but the size of the db has actually increased.
>
> So, after some time of this constant auto clean-up, I see that size of
> database increased very much, but number of documents remains constant.
>
> It would be kind of tricky to create a few databases and delete them
> completely, because data is constantly coming in at a high rate. In other
> words, I always need that 3 hour history.
>
> Where can I read how this mechanism actually works and how should I delete
> the data properly? Is it actually possible?
>


You have two options, and you probably want to use both.

The first is compaction.
Manual:
http://docs.couchdb.org/en/1.6.1/api/database/compact.html
Automatic:
http://docs.couchdb.org/en/1.6.1/config/compaction.html

Compaction still leaves the deleted documents in the DB as small stubs
(without their bodies), so your DB size will still increase slightly over
time.
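
A minimal sketch of triggering manual compaction from Python, in case it
helps. The server URL and the database name "stats" are made up, and I am
assuming the request is allowed to run admin operations (add credentials
if your server needs them). The endpoint is the POST /{db}/_compact from
the manual link above.

```python
import json
import urllib.request


def build_compact_request(base_url, db):
    """Build the POST request that asks CouchDB to compact one database."""
    url = f"{base_url.rstrip('/')}/{db}/_compact"
    # CouchDB expects a JSON content type on this endpoint, even with no body.
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def trigger_compaction(base_url, db):
    """Send the request and return the decoded JSON reply from the server."""
    with urllib.request.urlopen(build_compact_request(base_url, db)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical server and database name, adjust to your setup.
    print(trigger_compaction("http://localhost:5984", "stats"))
```

Compaction runs in the background, so the file only shrinks once it has
finished.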

To solve this, you would have to create a new DB and replicate the data you
still need into it, without replicating the deleted documents.
You can do that with a filter function that skips deleted documents or, in
your case, one that only passes documents younger than 3 hours.
Be aware that view indexes are not replicated, so the first view query on
the new DB will take longer while the index is rebuilt.
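
Here is a rough sketch of the filtered replication, again in Python. The
names are all assumptions on my side: a "timestamp" field holding epoch
seconds, a design document "_design/cleanup" with a filter called "recent",
and the database names in the comments. The filter itself has to be
JavaScript, so it is kept as a string.

```python
import json
import time

# JavaScript filter stored in a design document on the source database.
# It rejects deleted documents and everything older than the cutoff that
# the replication request passes in. "timestamp" is an assumed field name.
FILTER_JS = """function(doc, req) {
  if (doc._deleted) { return false; }
  return doc.timestamp >= parseInt(req.query.cutoff, 10);
}"""


def build_filter_ddoc():
    """Design document to PUT into the source DB (hypothetical names)."""
    return {"_id": "_design/cleanup", "filters": {"recent": FILTER_JS}}


def build_replication_body(source, target, max_age_seconds=3 * 3600, now=None):
    """JSON body for POST /_replicate using the filter above."""
    now = time.time() if now is None else now
    cutoff = int(now - max_age_seconds)
    return {
        "source": source,
        "target": target,
        "create_target": True,  # create the new DB if it does not exist yet
        "filter": "cleanup/recent",
        "query_params": {"cutoff": str(cutoff)},
    }


if __name__ == "__main__":
    # Hypothetical database names; send the body to /_replicate yourself.
    print(json.dumps(build_filter_ddoc(), indent=2))
    print(json.dumps(build_replication_body("stats", "stats_new"), indent=2))
```

Note that with this filter the design documents are not copied either
(they have no timestamp), so you would have to put them into the new DB
yourself before querying the views there.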

regards,
Stefan
