couchdb-user mailing list archives

From Tillmann Seidel <tsei...@eclipsesource.com>
Subject Re: Data corruption problem with attachments
Date Mon, 12 Sep 2011 09:47:56 GMT
I finally got access to the CouchDB logs, and it seems this is another
bad_utf8_character_code problem.

My document looks like this:

{"_id":"72de8a27-c3d1-3626-b6ff-190925a990e4",
 "_rev":"1-64d806a111cbf19740f76e78bdae097b",
 "_attachments":{"ea_seite03.png":{"content_type":"image/png","revpos":1,"length":84057,"stub":true},
 "ea_seite01.png":{"content_type":"image/png","revpos":1,"length":141866,"stub":true},
 "ea_seite02.png":{"content_type":"image/png","revpos":1,"length":30189,"stub":true},
 "content.xml":{"content_type":"application/xml","revpos":1,"length":1882,"stub":true}}
}


When I try to access the first attachment, HTTP status 200 is returned, but the response
body is empty. In addition, the log shows a UTF-8 encoding problem:


[Mon, 12 Sep 2011 08:34:16 GMT] [info] [<0.25582.1>] 192.168.132.25 - - 'GET' /updateserver/72de8a27-c3d1-3626-b6ff-190925a990e4/ea_seite03.png 200
[Mon, 12 Sep 2011 08:34:16 GMT] [error] [<0.25582.1>] Uncaught error in HTTP request:
{error,
                                 {badmatch,
                                  <<60,134,195,221,170,229,7,35,56,121,75,
                                    219,13,218,29,250>>}}
[Mon, 12 Sep 2011 08:34:16 GMT] [info] [<0.25582.1>] Stacktrace: [{couch_stream,foldl,6},
             {couch_util,md5_final,1},
             {couch_httpd_db,do_db_req,2},
             {couch_httpd,handle_request_int,5},
             {mochiweb_http,headers,5},
             {proc_lib,init_p_do_apply,3}]
[Mon, 12 Sep 2011 08:34:16 GMT] [error] [<0.25582.1>] {error_report,<0.33.0>,
    {<0.25582.1>,crash_report,
     [[{initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
       {pid,<0.25582.1>},
       {registered_name,[]},
       {error_info,
           {exit,
               {ucs,{bad_utf8_character_code}},
               [{xmerl_ucs,from_utf8,1},
                {mochijson2,json_encode_string,2},
                {mochijson2,'-json_encode_proplist/2-fun-0-',3},
                {lists,foldl,3},
                {mochijson2,json_encode_proplist,2},
                {couch_httpd,send_json,4},
                {couch_httpd,handle_request_int,5},
                {mochiweb_http,headers,5}]}},
       {ancestors,
           [couch_httpd,couch_secondary_services,couch_server_sup,<0.34.0>]},
       {messages,[]},
       {links,[<0.105.0>,#Port<0.4839>]},
       {dictionary,
           [{mochiweb_request_qs,[]},
            {jsonp,no_jsonp},
            {mochiweb_request_cookie,[]}]},
       {trap_exit,false},
       {status,running},
       {heap_size,4181},
       {stack_size,24},
       {reductions,5429}],
      []]}}
[Mon, 12 Sep 2011 08:34:16 GMT] [error] [<0.105.0>] {error_report,<0.33.0>,
    {<0.105.0>,std_error,
     {mochiweb_socket_server,235,
         {child_error,{ucs,{bad_utf8_character_code}}}}}}
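
One observation on the log above: the binary in the badmatch is 16 bytes long, which is the length of an MD5 digest, and the stack trace passes through couch_util:md5_final. That suggests, though the log does not prove it, that a checksum comparison on the attachment stream failed. Below is a minimal Python sketch for rendering that binary as hex and comparing it with the MD5 of a local copy of the file. The helper name digest_matches is hypothetical, and whether the logged value is the stored digest or the freshly computed one is an assumption that cannot be settled from the log alone.

```python
import hashlib

# The 16 bytes from the {badmatch, <<...>>} term in the log above.
LOGGED = bytes([60, 134, 195, 221, 170, 229, 7, 35,
                56, 121, 75, 219, 13, 218, 29, 250])


def digest_matches(data: bytes, logged_digest: bytes) -> bool:
    """Return True if the MD5 digest of data equals logged_digest.

    Hypothetical helper: assumes the logged binary is an MD5 digest.
    """
    return hashlib.md5(data).digest() == logged_digest


if __name__ == "__main__":
    # Hex form is easier to compare with the output of tools such as md5sum.
    print(LOGGED.hex())
```

If the original ea_seite03.png is still available on the client side, passing its bytes to digest_matches together with LOGGED would show whether the logged value corresponds to the intact file.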



It does not seem to be one of the known issues with UTF-8 encoding. I'm accessing an attachment
with content type image/png (so encoding should not matter here) with a URL that seems to
contain no problematic characters.
Any ideas?

This problem happened with CouchDB 1.0.2 on Windows.

Thanks,
Tillmann




On Sep 7, 2011, at 4:57 PM, Robert Newson wrote:

> If you supply a Content-MD5 header in your request we will verify it
> (and reject a mismatch) just like Amazon S3 does. That doesn't imply
> that couchdb routinely corrupts attachments (it doesn't).
> 
> Can you paste a full request/response where you regard the result as
> truncated or corrupted? What client software are you using? Can you
> reproduce this with curl?
> 
> B.
> 
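
The Content-MD5 check described in the reply above can be exercised from a client roughly as follows. This is a sketch under stated assumptions, not a tested recipe for CouchDB 1.0.2: the header value is the base64 encoding of the binary MD5 digest (as defined for Content-MD5 in RFC 1864), and the host, port, and URL below are placeholders. The function names content_md5 and build_attachment_put are hypothetical. The request is only constructed here, not sent.

```python
import base64
import hashlib
import urllib.request


def content_md5(data: bytes) -> str:
    """Base64-encoded MD5 digest of data, the format used by Content-MD5."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def build_attachment_put(url: str, data: bytes,
                         content_type: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that carries a Content-MD5 header.

    Hypothetical helper; the URL layout is an assumption for illustration.
    """
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "Content-Type": content_type,
            "Content-MD5": content_md5(data),
        },
    )


if __name__ == "__main__":
    # Placeholder URL and payload, assumed for illustration only.
    req = build_attachment_put(
        "http://localhost:5984/updateserver/some-doc-id/ea_seite03.png",
        b"example attachment bytes",
        "image/png",
    )
    print(req.get_method(), req.full_url)
    print(req.header_items())
```

Sending the request with urllib.request.urlopen(req) should then let the server reject a body whose digest does not match the header, which is the behaviour the reply describes.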
> On 7 September 2011 14:53, Tillmann Seidel <tseidel@eclipsesource.com> wrote:
>> Hi,
>> 
>> I have a problem with data corruption on CouchDB. I'm creating documents with attachments
>> using PUT requests in CouchDB 1.0.2. Once in a while it happens that a stored document is
>> corrupt, i.e. an attachment is truncated or has no data at all. CouchDB does not return an
>> error though when the document is created.
>> 
>> The description of COUCHDB-558 makes me think that this is a problem that's not unheard
>> of:
>> 
>> "We could detect in-flight data corruption if a client sends a Content-MD5 header
>> along with the data and Couch validates the MD5 on arrival."
>> 
>> Now my question is: what might cause such an in-flight data corruption? And what
>> could I do to prevent it? Or if I cannot prevent it, can I at least make CouchDB detect it
>> during creation?
>> 
>> Thanks in advance
>> Tillmann
>> 

-----------------------------------
Tillmann Seidel
Innoopract Informationssysteme GmbH
Email: tseidel@innoopract.com
Tel: +49-721-66-47-33-0
Fax: +49-721-66-47-33-29
http://www.innoopract.com

Innoopract Informationssysteme GmbH
Lammstr. 21, 76133 Karlsruhe, Germany
General Manager: Jochen Krause 
Registered Office: Karlsruhe, Commercial Register Mannheim HRB 107883


