couchdb-dev mailing list archives

From "Eli Stevens (JIRA)" <j...@apache.org>
Subject [jira] [Created] (COUCHDB-1192) Attachment upload speed varies widely based on how it is uploaded
Date Wed, 08 Jun 2011 20:10:01 GMT
Attachment upload speed varies widely based on how it is uploaded
-----------------------------------------------------------------

                 Key: COUCHDB-1192
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1192
             Project: CouchDB
          Issue Type: Question
          Components: HTTP Interface
    Affects Versions: 1.0.2
         Environment: OSX 10.6.7 MacBook Pro (7200 RPM disk)
                      CouchDBX 1.0.2
                      couchdb-python used as client code
            Reporter: Eli Stevens
            Priority: Minor


Running the following code on a MacBook Pro against CouchDBX 1.0.2 (everything local), we're
seeing the following output when attaching a file containing 10MB of random data:

Code: https://gist.github.com/bc0c36f36be0c85e2a36
Output:

Using put_attachment: 0.309157133102
post time: 2.5557808876
Using multipart: 2.61283898354
Encoding base64: 0.0497629642487
Updating: 5.0550069809

Server log: https://gist.github.com/a80a495fd35049ff871f (there's a HEAD/DELETE/PUT/GET cycle
that's just cleanup)

The calls in question are:

Using put_attachment: 0.309157133102
1> [info] [<0.27809.7>] 127.0.0.1 - - 'PUT' /benchmark_entity/bigfile/smallfile?rev=81-c538b38a8463952f0136143cfa49e9fa 201

Using multipart: 2.61283898354 (post time: 2.5557808876) 
1> [info] [<0.27809.7>] 127.0.0.1 - - 'POST' /benchmark_entity/bigfile 201

Updating: 5.0550069809
1> [info] [<0.27809.7>] 127.0.0.1 - - 'POST' /benchmark_entity/_bulk_docs 201
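
For reference, the three request shapes being compared can be sketched as follows. This is a minimal illustration, not the benchmark code from the gist: the base URL, the helper names and the rev value are assumptions, the form field names `_rev` and `_attachments` are assumed to be what the multipart handler expects, and nothing is sent over the network.

```python
import base64
import json
import os
import uuid

# Assumed local CouchDB URL; the database name is taken from the server log.
BASE = "http://127.0.0.1:5984/benchmark_entity"
CTYPE = "application/octet-stream"


def standalone_put(doc_id, name, rev, data):
    """Bare PUT: the raw bytes are the whole request body."""
    url = f"{BASE}/{doc_id}/{name}?rev={rev}"
    return "PUT", url, {"Content-Type": CTYPE}, data


def multipart_form_post(doc_id, name, rev, data):
    """multipart/form-data POST to the document URL (field names assumed)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="_rev"\r\n\r\n'
        f"{rev}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="_attachments"; filename="{name}"\r\n'
        f"Content-Type: {CTYPE}\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return "POST", f"{BASE}/{doc_id}", headers, head + data + tail


def inline_base64_bulk(doc_id, name, rev, data):
    """Inline attachment: base64 text inside a JSON document sent to _bulk_docs."""
    doc = {
        "_id": doc_id,
        "_rev": rev,
        "_attachments": {
            name: {
                "content_type": CTYPE,
                "data": base64.b64encode(data).decode("ascii"),
            }
        },
    }
    body = json.dumps({"docs": [doc]}).encode("ascii")
    return "POST", f"{BASE}/_bulk_docs", {"Content-Type": "application/json"}, body


if __name__ == "__main__":
    payload = os.urandom(3 * 1024)  # small stand-in for the 10MB file
    for build in (standalone_put, multipart_form_post, inline_base64_bulk):
        # "1-abc" is a placeholder rev, not a real revision id.
        method, url, _headers, body = build("bigfile", "smallfile", "1-abc", payload)
        print(f"{build.__name__:>20}: {method} {url} ({len(body)} body bytes)")
```

The body sizes differ only modestly: the multipart body adds a few hundred bytes of framing, and the base64 body is about a third larger than the raw data. The server also has to parse the form or decode the JSON and base64 before it can store the bytes.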

Profiling our code shows 1.5 sec of CPU usage in the client (which includes setup / cleanup code
that's not covered by the times above) and 11.8 sec of total run time, which roughly matches
up with the PUT/POST times above.  Basically, I feel pretty confident that the bulk of the
time is not spent in our client code, and is instead due to CouchDB's handling time.  We
haven't conclusively ruled out couchdb-python behaving very oddly, though that seems very unlikely.
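
As a rough check on the numbers reported above, the timings can be summed and turned into effective throughput. The sketch below uses only the figures quoted in this report and assumes the payload is exactly 10 MB, since the precise byte count is not stated. The "post time" line is left out because it is already part of the multipart total.

```python
# Timings in seconds, copied from the benchmark output in this report.
timings = {
    "put_attachment": 0.309157133102,
    "multipart": 2.61283898354,
    "base64 encode": 0.0497629642487,
    "bulk update": 5.0550069809,
}

SIZE_MB = 10.0  # assumption: "10MB of random data" taken at face value

# Total wall time covered by the four timers.
timed_total = sum(timings.values())

# Effective upload throughput for the three server round trips.
uploads = ("put_attachment", "multipart", "bulk update")
throughput = {label: SIZE_MB / timings[label] for label in uploads}

# Slowdown of each upload path relative to the bare PUT.
slowdown = {label: timings[label] / timings["put_attachment"] for label in uploads}

if __name__ == "__main__":
    print(f"timed total: {timed_total:.2f} s")
    for label in uploads:
        print(f"{label:>15}: {throughput[label]:6.2f} MB/s, "
              f"{slowdown[label]:5.2f}x the bare PUT")
```

By this arithmetic the four timed steps account for about 8.0 sec of the 11.8 sec run, and the multipart and `_bulk_docs` paths come out roughly 8.5x and 16x slower than the bare PUT.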

Why is the form/multipart handler so much slower than a bare PUT on the attachment?  Why is
the base64 approach even slower?  Is it due to bandwidth issues, CouchDB CPU usage...?  If
needed, we can update to 1.1 and test there.

Note that the curl code doesn't seem to result in the same MD5 when we get the attachment
back out, so I've snipped the output related to that.
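
The kind of round-trip check described here can be sketched as below. The function names are hypothetical and the byte strings stand in for the uploaded file and the attachment as fetched back; the example does not reproduce the curl invocation from the gist. The simulated failure drops newline bytes, which is what curl's `-d @file` does (unlike `--data-binary @file`). Whether that is what happened in this case is not known.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def roundtrip_ok(uploaded: bytes, downloaded: bytes) -> bool:
    """True when the fetched attachment hashes the same as what was sent."""
    return md5_hex(uploaded) == md5_hex(downloaded)


if __name__ == "__main__":
    original = b"line one\nline two\r\nbinary\x00tail"
    # Simulate a transfer that strips CR and LF bytes from the body.
    mangled = original.replace(b"\r", b"").replace(b"\n", b"")
    print("intact copy matches: ", roundtrip_ok(original, original))
    print("mangled copy matches:", roundtrip_ok(original, mangled))
```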

Thanks for any help,
Eli

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
