incubator-couchdb-user mailing list archives

From: Pascal Borghino <pborgh...@yahoo.fr>
Subject: Re: What am I doing wrong?
Date: Thu, 26 Feb 2009 13:13:44 GMT
Hi guys,
just to keep you updated: I ran out of disk space last time while
trying to compact my 83 GB file...
I re-ran the test with half the number of docs, 2.3M:

-rw-r--r--  1 root root  24G Feb 26 01:45 test.couch

{"db_name":"test","doc_count":2219598,"doc_del_count":0,"update_seq":2219598,"purge_seq":0,"compact_running":false,"disk_size":25017692071,"instance_start_time":"1235590552047908"}

curl -X POST http://localhost:5984/test/_compact

-rw-r--r--  1 root root  17G Feb 26 13:00 test.couch

It took 3 hours to do the compaction... but we got 7 GB back, about
30%... quite big.
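
Side note for anyone reproducing this: the _compact call returns
immediately and compaction runs in the background, so you can watch
progress by polling the database info document shown above (a minimal
sketch; adjust host and database name):

while curl -s http://localhost:5984/test | grep -q '"compact_running":true'; do
    sleep 60
done

The loop exits once "compact_running" flips back to false.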
P.


> Also, 0.8.1 compaction has a hard time compacting big dbs. Trunk is 
> better.
>
> -Damien
>
>
> On Feb 20, 2009, at 12:04 PM, Jan Lehnardt wrote:
>
>>
>> On 20 Feb 2009, at 17:42, Pascal Borghino wrote:
>>
>>> Hi there, I do not have attachments...
>>>
>>> $ ls -lh
>>> -rw-r--r--  1 root root  83G Feb 20 02:40 test.couch
>>> -rw-r--r--  1 root root  23G Feb 20 16:33 test.couch.compact
>>>
>>> $ du -sh
>>> 107G    .
>>>
>>> still... from 19 GB to 83 GB... a huge difference.
>>> P.
>>
>> The fact that there is a .compact file means that compaction
>> is still running (or was aborted). When you restart it, you
>> should see it in the "Status" section of Futon and how far
>> along it is. Compaction will continue where it left off. Please
>> let us know what the final database file size is when compaction
>> is finished.
>>
>> If you inserted a lot of single documents, the file can end up
>> quite sparse. On large imports, do use bulk inserts (see the wiki)
>> or, if that is not possible, compact every once in a while during
>> the import.
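>>
>> As a rough sketch (the doc IDs here are just placeholders), a bulk
>> insert via the _bulk_docs endpoint looks something like:
>>
>> curl -X POST http://localhost:5984/test/_bulk_docs \
>>      -H "Content-Type: application/json" \
>>      -d '{"docs": [{"_id": "doc1", "value": 1},
>>                    {"_id": "doc2", "value": 2}]}'
>>
>> One request carrying hundreds of documents rewrites the b-tree
>> nodes once per batch instead of once per document, which keeps the
>> file far smaller than the same import done as single PUTs.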
>>
>> Cheers
>> Jan
>> -- 
>>
>>
>>>
>>>
>>>
>>> Robert Newson wrote:
>>>> I expect the b-tree wastage is minimal (though not zero).
>>>>
>>>> I've wondered what happens on filesystems that don't support sparse
>>>> files; I assume they'd just be slower and use more disk space. Given
>>>> that the holes vanish after compaction, I suspected a bad calculation
>>>> in the code (couch_db.erl, I think), but I've not found it; it seems
>>>> to do the right thing. HFS+ doesn't support holes, but I'm pretty sure
>>>> NTFS does.
>>>>
>>>> Btw, it's mostly around attachments. If you add lots of documents but
>>>> no attachments, ls and df are in close agreement.
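>>>>
>>>> For what it's worth, the holes are easy to see by comparing the
>>>> apparent size against the blocks actually allocated (a sketch,
>>>> assuming GNU coreutils):
>>>>
>>>> ls -lh test.couch    # apparent size, holes included
>>>> du -h test.couch     # blocks actually written to disk
>>>>
>>>> On a sparse file the du figure is the smaller of the two.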
>>>>
>>>> B.
>>>>
>>>> On Fri, Feb 20, 2009 at 4:00 PM, Jens Alfke <jens@mooseyard.com> 
>>>> wrote:
>>>>
>>>>> On Feb 20, 2009, at 6:03 AM, Pascal Borghino wrote:
>>>>>
>>>>>
>>>>>> I am currently compacting it... even if 'Compaction rewrites the
>>>>>> database file, removing outdated document revisions and deleted
>>>>>> documents'... no document should be outdated nor deleted...
>>>>>>
>>>>> In addition to the sparseness of the file, another reason for the 
>>>>> size
>>>>> difference might be obsolete b-tree nodes. The file is 
>>>>> append-only, so any
>>>>> time a b-tree changes, the old nodes remain in the file. If you've 
>>>>> done a
>>>>> large number of individual insertions, that space might be 
>>>>> significant.
>>>>> (Probably not gigabytes, though.)
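>>>>>
>>>>> The append-only behaviour is easy to observe; a small sketch
>>>>> (the doc id "mydoc" is just an example):
>>>>>
>>>>> curl -s http://localhost:5984/test | grep -o '"disk_size":[0-9]*'
>>>>> curl -X PUT http://localhost:5984/test/mydoc -d '{"value": 1}'
>>>>> curl -s http://localhost:5984/test | grep -o '"disk_size":[0-9]*'
>>>>>
>>>>> Every individual write appends fresh copies of the b-tree nodes
>>>>> on its path, so disk_size only grows until the next compaction.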
>>>>>
>>>>>
>>>>> robert.newson@gmail.com wrote:
>>>>>
>>>>>
>>>>>> I find the actual
>>>>>> consumed space is far, far less than 'ls' shows. CouchDB .couch
>>>>>> files
>>>>>> are very sparse, large gaps of unwritten data, ostensibly to keep
>>>>>> btree and document items separate, but these 'holes' vanish after
>>>>>> compaction, even if you have zero updates and deletes.
>>>>>>
>>>>> Hm. But not all filesystems support sparse files. HFS+, the Mac OS
>>>>> filesystem, doesn't. (Does NTFS?) Is there an option to suppress 
>>>>> the gaps?
>>>>>
>>>>> —Jens
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>
>


