Subject: Re: Data corruption problem with attachments
From: Tillmann Seidel
Date: Mon, 12 Sep 2011 11:47:56 +0200
To: user@couchdb.apache.org

Finally I got access to the CouchDB logs and it seems this is another
bad_utf8_character_code problem. My document looks like this:

{"_id":"72de8a27-c3d1-3626-b6ff-190925a990e4",
 "_rev":"1-64d806a111cbf19740f76e78bdae097b",
 "_attachments":{"ea_seite03.png":{"content_type":"image/png","revpos":1,"length":84057,"stub":true},
                 "ea_seite01.png":{"content_type":"image/png","revpos":1,"length":141866,"stub":true},
                 "ea_seite02.png":{"content_type":"image/png","revpos":1,"length":30189,"stub":true},
                 "content.xml":{"content_type":"application/xml","revpos":1,"length":1882,"stub":true}}
}

When trying to access the first attachment, an HTTP code 200 is
returned, but without content.
Additionally, the log shows a UTF-8 encoding problem:

[Mon, 12 Sep 2011 08:34:16 GMT] [info] [<0.25582.1>] 192.168.132.25 - - 'GET' /updateserver/72de8a27-c3d1-3626-b6ff-190925a990e4/ea_seite03.png 200

[Mon, 12 Sep 2011 08:34:16 GMT] [error] [<0.25582.1>] Uncaught error in HTTP request: {error,
    {badmatch,
        <<60,134,195,221,170,229,7,35,56,121,75,
          219,13,218,29,250>>}}

[Mon, 12 Sep 2011 08:34:16 GMT] [info] [<0.25582.1>] Stacktrace: [{couch_stream,foldl,6},
    {couch_util,md5_final,1},
    {couch_httpd_db,do_db_req,2},
    {couch_httpd,handle_request_int,5},
    {mochiweb_http,headers,5},
    {proc_lib,init_p_do_apply,3}]

[Mon, 12 Sep 2011 08:34:16 GMT] [error] [<0.25582.1>] {error_report,<0.33.0>,
    {<0.25582.1>,crash_report,
     [[{initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
       {pid,<0.25582.1>},
       {registered_name,[]},
       {error_info,
           {exit,
               {ucs,{bad_utf8_character_code}},
               [{xmerl_ucs,from_utf8,1},
                {mochijson2,json_encode_string,2},
                {mochijson2,'-json_encode_proplist/2-fun-0-',3},
                {lists,foldl,3},
                {mochijson2,json_encode_proplist,2},
                {couch_httpd,send_json,4},
                {couch_httpd,handle_request_int,5},
                {mochiweb_http,headers,5}]}},
       {ancestors,
           [couch_httpd,couch_secondary_services,couch_server_sup,<0.34.0>]},
       {messages,[]},
       {links,[<0.105.0>,#Port<0.4839>]},
       {dictionary,
           [{mochiweb_request_qs,[]},
            {jsonp,no_jsonp},
            {mochiweb_request_cookie,[]}]},
       {trap_exit,false},
       {status,running},
       {heap_size,4181},
       {stack_size,24},
       {reductions,5429}],
      []]}}

[Mon, 12 Sep 2011 08:34:16 GMT] [error] [<0.105.0>] {error_report,<0.33.0>,
    {<0.105.0>,std_error,
     {mochiweb_socket_server,235,
      {child_error,{ucs,{bad_utf8_character_code}}}}}}

It does not seem to be one of the known issues with UTF-8 encoding. I'm
accessing an attachment with content type image/png (so encoding should
not matter here) with a URL that seems to have no problematic
characters.

Any ideas? This problem happened with CouchDB 1.0.2 on Windows.

Thanks,
Tillmann

On Sep 7, 2011, at 4:57 PM, Robert Newson wrote:

> If you supply a Content-MD5 header in your request we will verify it
> (and reject a mismatch) just like Amazon S3 does. That doesn't imply
> that couchdb routinely corrupts attachments (it doesn't).
>
> Can you paste a full request/response where you regard the result as
> truncated or corrupted? What client software are you using? Can you
> reproduce this with curl?
>
> B.
>
> On 7 September 2011 14:53, Tillmann Seidel wrote:
>> Hi,
>>
>> I have a problem with data corruption on CouchDB. I'm creating
>> documents with attachments using PUT requests in CouchDB 1.0.2. Once in
>> a while it happens that a stored document is corrupt, i.e. an attachment
>> is truncated or has no data at all. CouchDB does not return an error
>> though when the document is created.
>>
>> The description of COUCHDB-558 makes me think that this is a problem
>> that's not unheard of:
>>
>> "We could detect in-flight data corruption if a client sends a
>> Content-MD5 header along with the data and Couch validates the MD5 on
>> arrival."
>>
>> Now my question is: what might cause such an in-flight data
>> corruption? And what could I do to prevent it? Or if I cannot prevent
>> it, can I at least make CouchDB detect it during creation?
>>
>> Thanks in advance
>> Tillmann
>>

-----------------------------------
Tillmann Seidel
Innoopract Informationssysteme GmbH
Email: tseidel@innoopract.com
Tel: +49-721-66-47-33-0
Fax: +49-721-66-47-33-29
http://www.innoopract.com

Innoopract Informationssysteme GmbH
Lammstr. 21, 76133 Karlsruhe, Germany
General Manager: Jochen Krause
Registered Office: Karlsruhe, Commercial Register Mannheim HRB 107883
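
For anyone following the Content-MD5 suggestion in the quoted reply, here is a minimal client-side sketch in Python. It is not the client used in this thread (none is named), and the function names, host, and document id are made up for illustration. It shows how the header value is formed (base64 of the raw MD5 digest of the body), builds a PUT request carrying it without sending anything, and compares a downloaded body against the "length" field of an attachment stub like the ones in the document above.

```python
import base64
import hashlib
import urllib.request


def content_md5(data: bytes) -> str:
    """Return a Content-MD5 header value: base64 of the raw 16-byte MD5 digest."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def build_attachment_put(url: str, data: bytes, content_type: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that carries a Content-MD5 header."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "Content-Type": content_type,
            "Content-MD5": content_md5(data),
        },
    )


def length_matches_stub(stub: dict, body: bytes) -> bool:
    """Check a downloaded body against the "length" field of an attachment stub."""
    return stub.get("length") == len(body)


if __name__ == "__main__":
    # Hypothetical URL: host, database, and document id are placeholders.
    payload = b"example attachment bytes"
    req = build_attachment_put(
        "http://localhost:5984/updateserver/some-doc-id/ea_seite03.png",
        payload,
        "image/png",
    )
    print(req.get_method(), req.full_url)
    print(req.header_items())

    # An empty body, as described in the report, fails the length check.
    stub = {"content_type": "image/png", "revpos": 1, "length": 84057, "stub": True}
    print(length_matches_stub(stub, b""))
```

The length check only catches truncated or empty downloads; it says nothing about bytes that changed without changing the size, which is what the MD5 comparison on upload is meant to cover.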