incubator-couchdb-user mailing list archives

From muji <mis...@freeformsystems.com>
Subject Tracking file throughput?
Date Fri, 03 Jun 2011 13:43:55 GMT
Hi,

I'm still new to CouchDB and NoSQL, so apologies if the answer to this
is trivial.

I'm trying to track, in a CouchDB document, the throughput of a file
sent via a POST request.

My initial implementation creates a document for the file before the
POST is sent, and then an update handler increments an "uploadbytes"
field for every chunk of data received from the client.

This *nearly* works, except that I get document update conflicts (which
I think is because I can't throttle back the upload while the database
is being updated). The bigger problem is that for large files (~2.4GB)
the number of document revisions is around 40,000-50,000, so a single
document ends up taking between 0.7GB and 1GB. After compaction it
reduces to ~380KB, which of course is much better, but this still seems
excessive and poses problems with compacting a write-heavy database. I
understand the trick there is to replicate, compact, and replicate back
to the source; please correct me if I'm wrong...
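
For reference, the basic calls I had in mind are along these lines
(assuming a local database called "uploads" and a copy called
"uploads_copy"):

    # replicate to a copy
    curl -X POST http://localhost:5984/_replicate \
         -H 'Content-Type: application/json' \
         -d '{"source": "uploads", "target": "uploads_copy", "create_target": true}'

    # compact a database
    curl -X POST http://localhost:5984/uploads/_compact \
         -H 'Content-Type: application/json'

    # replicate back to the source
    curl -X POST http://localhost:5984/_replicate \
         -H 'Content-Type: application/json' \
         -d '{"source": "uploads_copy", "target": "uploads"}'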

So I don't think this approach is viable, which makes me wonder whether
setting _revs_limit would help, although I understand that setting it
per database still requires compaction and only saves space once
compaction has run.
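
If it matters, this is how I understand _revs_limit is set per database
(again assuming a database called "uploads"):

    curl -X PUT http://localhost:5984/uploads/_revs_limit -d '1000'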

I was thinking that tracking the throughput as chunks in individual
documents and then calculating the total with a map/reduce over all the
chunks might be a better approach, although I'm concerned that having
lots of little documents, one per data chunk, will also take up a large
amount of space...
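
Something like this is what I had in mind; the field names (type,
upload_id, bytes) are just illustrative:

    // map: emit the size of each chunk, keyed by the upload it belongs to
    function (doc) {
      if (doc.type === "chunk") {
        emit(doc.upload_id, doc.bytes);
      }
    }

    // reduce: sum the bytes received so far (the built-in _sum would also do)
    function (keys, values, rereduce) {
      return sum(values);
    }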

Any advice and guidance on the best way to tackle this would be much
appreciated.

-- 
muji.
