couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Robert Newson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-220) Extreme sparseness in couch files
Date Sun, 05 Apr 2009 12:21:12 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695852#action_12695852
] 

Robert Newson commented on COUCHDB-220:
---------------------------------------

// Licensed under the Apache License, Version 2.0 (the "License"); you may not
// use this file except in compliance with the License.  You may obtain a copy
// of the License at
//
//   http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
// WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the
// License for the specific language governing permissions and limitations under
// the License.

couchTests.attachment_sparseness= function(debug) {
  var db = new CouchDB("test_suite_db");
  db.deleteDb();
  db.createDb();
  if (debug) debugger;
  
  for (i = 0; i < 1000; i++) {
      var binAttDoc = {
	  _id: (i).toString(),
	  _attachments:{
	      "foo.txt": {
		  content_type:"text/plain",
		  data: "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
	      }
	  }
      }

      var save_response = db.save(binAttDoc);
      T(save_response.ok);
  }
  
  var before = db.info().disk_size;

  // Compact it.
  T(db.compact().ok);
  T(db.last_req.status == 202);
  // compaction isn't instantaneous, loop until done
  while (db.info().compact_running) {};
  
  var after = db.info().disk_size;

  // Compaction should reduce the database slightly, but not
  // orders of magnitude (unless attachments introduce sparseness)
  T(after > before * 0.1, "database shrunk massively after compaction.");

};

> Extreme sparseness in couch files
> ---------------------------------
>
>                 Key: COUCHDB-220
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-220
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.9
>         Environment: ubuntu 8.10 64-bit, ext3
>            Reporter: Robert Newson
>
> When adding ten thousand documents, each with a small attachment, the discrepancy between
reported file size and actual file size becomes huge;
> ls -lh shard0.couch
> 698M 2009-01-23 13:42 shard0.couch
> du -sh shard0.couch
> 57M	shard0.couch
> On filesystems that do not support write holes, this will cause an order of magnitude
more I/O.
> I think it was introduced by the streaming attachment patch as each attachment is followed
by huge swathes of zeroes when viewed with 'hd -v'.
> Compacting this database reduced it to 7.8mb, indicating other sparseness besides attachments.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message