From: "Benoit Chesneau (JIRA)"
To: dev@couchdb.apache.org
Reply-To: dev@couchdb.apache.org
Date: Tue, 16 Aug 2011 09:20:27 +0000 (UTC)
Message-ID: <273378179.40882.1313486427453.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <695819657.34335.1305020883141.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (COUCHDB-1153) Database and view index compaction daemon

[
https://issues.apache.org/jira/browse/COUCHDB-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085603#comment-13085603 ]

Benoit Chesneau commented on COUCHDB-1153:
------------------------------------------

About the _all_dbs scanning, maybe we could have a database maintaining the list of created dbs, like Cloudant does, or Elasticsearch for that purpose. Rather than scanning _all_dbs, it could react on _changes?

> Database and view index compaction daemon
> -----------------------------------------
>
>                 Key: COUCHDB-1153
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1153
>             Project: CouchDB
>          Issue Type: New Feature
>         Environment: trunk
>            Reporter: Filipe Manana
>            Assignee: Filipe Manana
>            Priority: Minor
>              Labels: compaction
>
> I've recently written an Erlang process to automatically compact databases and their views based on some configurable parameters. These parameters can be global or per database and are: minimum database fragmentation, minimum view fragmentation, allowed period and "strict_window" (whether an ongoing compaction should be canceled if it doesn't finish within the allowed period). These fragmentation values are based on the recently added "data_size" parameter to the database and view group information URIs (COUCHDB-1132).
> I've documented the .ini configuration, as a comment in default.ini, which I paste here:
> [compaction_daemon]
> ; The delay, in seconds, between each check for which database and view indexes
> ; need to be compacted.
> check_interval = 60
> ; If a database or view index file is smaller than this value (in bytes),
> ; compaction will not happen. Very small files always have a very high
> ; fragmentation, therefore it's not worth compacting them.
> min_file_size = 131072
> [compactions]
> ; List of compaction rules for the compaction daemon.
> ; The daemon compacts databases and their respective view groups when all the
> ; condition parameters are satisfied.
> ; Configuration can be per database or
> ; global, and it has the following format:
> ;
> ; database_name = parameter=value [, parameter=value]*
> ; _default = parameter=value [, parameter=value]*
> ;
> ; Possible parameters:
> ;
> ; * db_fragmentation - If the ratio (as an integer percentage) of the amount
> ;                      of old data (and its supporting metadata) over the database
> ;                      file size is equal to or greater than this value, this
> ;                      database compaction condition is satisfied.
> ;                      This value is computed as:
> ;
> ;                          (file_size - data_size) / file_size * 100
> ;
> ;                      The data_size and file_size values can be obtained when
> ;                      querying a database's information URI (GET /dbname/).
> ;
> ; * view_fragmentation - If the ratio (as an integer percentage) of the amount
> ;                        of old data (and its supporting metadata) over the view
> ;                        index (view group) file size is equal to or greater than
> ;                        this value, then this view index compaction condition is
> ;                        satisfied. This value is computed as:
> ;
> ;                            (file_size - data_size) / file_size * 100
> ;
> ;                        The data_size and file_size values can be obtained when
> ;                        querying a view group's information URI
> ;                        (GET /dbname/_design/groupname/_info).
> ;
> ; * period - The period for which a database (and its view groups) compaction
> ;            is allowed. This value must obey the following format:
> ;
> ;                HH:MM - HH:MM (HH in [0..23], MM in [0..59])
> ;
> ; * strict_window - If a compaction is still running after the end of the allowed
> ;                   period, it will be canceled if this parameter is set to "yes".
> ;                   It defaults to "no" and it's meaningful only if the *period*
> ;                   parameter is also specified.
> ;
> ; * parallel_view_compaction - If set to "yes", the database and its views are
> ;                              compacted in parallel. This is only useful on
> ;                              certain setups, for example when the database
> ;                              and view index directories point to different
> ;                              disks. It defaults to "no".
> ;
> ; Before a compaction is triggered, an estimation of how much free disk space is
> ; needed is computed. This estimation corresponds to 2 times the data size of
> ; the database or view index. When there's not enough free disk space to compact
> ; a particular database or view index, a warning message is logged.
> ;
> ; Examples:
> ;
> ; 1) foo = db_fragmentation = 70%, view_fragmentation = 60%
> ;    The `foo` database is compacted if its fragmentation is 70% or more.
> ;    Any view index of this database is compacted only if its fragmentation
> ;    is 60% or more.
> ;
> ; 2) foo = db_fragmentation = 70%, view_fragmentation = 60%, period = 00:00-04:00
> ;    Similar to the preceding example but a compaction (database or view index)
> ;    is only triggered if the current time is between midnight and 4 AM.
> ;
> ; 3) foo = db_fragmentation = 70%, view_fragmentation = 60%, period = 00:00-04:00, strict_window = yes
> ;    Similar to the preceding example - a compaction (database or view index)
> ;    is only triggered if the current time is between midnight and 4 AM. If at
> ;    4 AM the database or one of its views is still compacting, the compaction
> ;    process will be canceled.
> ;
> ;_default = db_fragmentation = 70%, view_fragmentation = 60%, period = 23:00 - 04:00
> (from https://github.com/fdmanana/couchdb/compare/compaction_daemon#L0R195)
> The full patch is mostly a new module, but it also makes some minimal changes and a small refactoring to the view compaction code, without changing the current behaviour.
> Patch is at:
> https://github.com/fdmanana/couchdb/compare/compaction_daemon.patch
> By default the daemon is idle, without any configuration enabled. I'm open to suggestions on additional parameters and a better configuration system.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
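
For readers following the thread, the conditions described in the quoted default.ini comment (the fragmentation formula, the min_file_size cutoff, the allowed period, and the free disk space estimate) can be sketched roughly as below. This is only an illustrative Python sketch, not the daemon's Erlang code. All function names are hypothetical. It also assumes that the integer percentage is truncated rather than rounded, that a period such as 23:00 - 04:00 wraps past midnight, and that the end of the period is exclusive; none of these details are confirmed by the issue text.

```python
from datetime import time
from typing import Optional, Tuple


def fragmentation(file_size: int, data_size: int) -> int:
    """Integer percentage: (file_size - data_size) / file_size * 100.

    Assumption: the result is truncated, not rounded.
    """
    if file_size <= 0:
        return 0
    return (file_size - data_size) * 100 // file_size


def parse_period(period: str) -> Tuple[time, time]:
    """Parse 'HH:MM - HH:MM'; spaces around the dash are optional."""
    start, end = (part.strip() for part in period.split("-"))
    start_h, start_m = map(int, start.split(":"))
    end_h, end_m = map(int, end.split(":"))
    return time(start_h, start_m), time(end_h, end_m)


def in_period(now: time, period: str) -> bool:
    """True if `now` falls inside the allowed compaction period."""
    start, end = parse_period(period)
    if start <= end:
        return start <= now < end
    # Assumption: a period such as 23:00 - 04:00 wraps past midnight.
    return now >= start or now < end


def enough_disk_space(free_bytes: int, data_size: int) -> bool:
    """The estimate from the config comment: 2 times the data size."""
    return free_bytes >= 2 * data_size


def should_compact(file_size: int, data_size: int, threshold: int,
                   now: time, period: Optional[str] = None,
                   min_file_size: int = 131072) -> bool:
    """Combine the conditions; all of them must hold."""
    if file_size < min_file_size:
        return False  # very small files are skipped
    if period is not None and not in_period(now, period):
        return False  # outside the allowed window
    return fragmentation(file_size, data_size) >= threshold
```

With the rule from example 2 (db_fragmentation = 70%, period = 00:00-04:00), a 1,000,000 byte file holding 300,000 bytes of live data has a fragmentation of 70% and would qualify at 01:00 but not at noon.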