couchdb-dev mailing list archives

From "Paul Joseph Davis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (COUCHDB-220) Extreme sparseness in couch files
Date Wed, 08 Apr 2009 04:53:13 GMT

    [ https://issues.apache.org/jira/browse/COUCHDB-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696888#action_12696888 ]

Paul Joseph Davis commented on COUCHDB-220:
-------------------------------------------

Chris,

First, I'm pretty certain that this bug is only affecting document writes that include an
attachment. 

You should check whether your Erlang loader is getting the proper attachment information all the
way down into couch_db:doc_flush_binaries. My first haphazard guess is that it's not. My second
random guess is that you could be seeing the same bug from a different code path. Also, there's
another slight tweak to the patch so that it only goes to the 65K allocation when there's a
binary of unknown size.
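
To make the tweak concrete, here is a rough sketch of the decision it describes. This is not the CouchDB code (which is Erlang) and not the contents of 220.patch; the function name, the constant, and the reading of "65K" as 64 KiB are all assumptions for illustration, as is the choice to reserve exactly the known size when it is available.

```python
# Hypothetical illustration only; names and values are not taken from CouchDB.
DEFAULT_ALLOC = 64 * 1024  # assumed meaning of the "65K" allocation


def reserve_bytes(binary_size=None, default_alloc=DEFAULT_ALLOC):
    """Return how many bytes to reserve for an attachment.

    binary_size is None when the size is not known up front (for example,
    a streamed body). Only in that case do we fall back to the large
    default allocation; otherwise we reserve exactly what is needed.
    """
    if binary_size is None:
        return default_alloc
    if binary_size < 0:
        raise ValueError("binary_size must be non-negative")
    return binary_size
```

Under this sketch a small attachment of known size reserves only its own length, so the large allocation no longer follows every attachment.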

Either way, I'm fairly certain that while changing the min_alloc to a single byte shows that
there is a bug, it's not the proper fix for the bug.



> Extreme sparseness in couch files
> ---------------------------------
>
>                 Key: COUCHDB-220
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-220
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.9
>         Environment: ubuntu 8.10 64-bit, ext3
>            Reporter: Robert Newson
>         Attachments: 220.patch, 220.patch, attachment_sparseness.js, stream.diff
>
>
> When adding ten thousand documents, each with a small attachment, the discrepancy between
> reported file size and actual file size becomes huge;
> ls -lh shard0.couch
> 698M 2009-01-23 13:42 shard0.couch
> du -sh shard0.couch
> 57M	shard0.couch
> On filesystems that do not support write holes, this will cause an order of magnitude
> more I/O.
> I think it was introduced by the streaming attachment patch as each attachment is followed
> by huge swathes of zeroes when viewed with 'hd -v'.
> Compacting this database reduced it to 7.8mb, indicating other sparseness besides attachments.
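
The ls/du comparison in the report can also be checked from a script. The following is a minimal sketch, assuming a POSIX system where os.stat exposes st_blocks in 512-byte units; the helper names size_report and sparseness are made up for this example and are not part of CouchDB.

```python
import os


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is what 'ls -l' shows; allocated_bytes approximates
    what 'du' shows. st_blocks is in 512-byte units on POSIX systems
    and is not available on Windows.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def sparseness(apparent, allocated):
    """Fraction of the apparent size not backed by allocated blocks."""
    if apparent == 0:
        return 0.0
    # Allocation can exceed the apparent size (block rounding), so clamp at 0.
    return max(0.0, 1.0 - allocated / apparent)
```

For example, sparseness(*size_report("shard0.couch")) gives the fraction of the file that is holes. Plugging in the figures from the report (698M apparent, 57M on disk) gives roughly 0.92.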


