From: James Marca
To: user@couchdb.apache.org
Date: Sun, 28 Apr 2013 23:27:47 -0700
Subject: An old database causes couchdb 1.3.0 to crash getting single documents

Hello list,

I have an old database that I've carried along for a year or so now through various upgrades. It has given me problems in the past, but I never really dealt with them, and I am trying to do so now. I wrote the docs into the db and used them with couchdb 0.9 (more or less). I finished the analysis a while ago, and now I am trying to clean up dbs and close out this project, but I kept crashing couch when accessing docs and trying to rebuild views (apparently 1.3.0 requires a view rebuild).

So I am trying to fix this db once and for all by fetching each document and writing it into a new couchdb. I have two problems.

First, for reasons I do not understand at all, many of the docs in this db are corrupt, or just plain too big. This db bore the brunt of my experiments with big docs in couchdb, and I might have written some really big ones into it. Whatever the cause, a GET for one of these docs makes CouchDB fill up RAM and die.

Second, I would like to know whether there is a way to short-circuit that slow death, because it takes forever and I have over 10 million documents to process.

What I am doing now: I *can* get all of the doc ids via the all_docs interface, as long as I do not ask for the document contents.
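For concreteness, the loop is roughly the following (an untested, heavily simplified sketch, not my actual script; the port number, helper names, and skip-based paging are illustration only):

var http = require('http');

var DB = '/my%2fbroken%2fdb';      // made-up db path, just for the sketch
var badDocs = [];

// GET a path and parse the JSON; any connection error (e.g. couch dying
// mid-request) or non-200 status comes back as err
function getJSON(path, callback) {
  http.get({host: '127.0.0.1', port: 5984, path: path}, function (res) {
    var body = '';
    res.setEncoding('utf8');
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
      if (res.statusCode !== 200) return callback(new Error('HTTP ' + res.statusCode));
      var parsed;
      try { parsed = JSON.parse(body); } catch (e) { return callback(e); }
      callback(null, parsed);
    });
  }).on('error', callback);
}

// fetch docs one at a time; on failure, record the id, wait 10 seconds
// for couch to restart, and move on to the next one
function fetchDocs(ids, done) {
  if (ids.length === 0) return done();
  var id = ids.shift();
  getJSON(DB + '/' + id, function (err, doc) {
    if (err) {
      badDocs.push(id);
      return setTimeout(function () { fetchDocs(ids, done); }, 10000);
    }
    // ...write doc into the new, clean couchdb here...
    fetchDocs(ids, done);
  });
}

// page through _all_docs 1000 ids at a time, never asking for contents
function nextBatch(lastId) {
  var path = DB + '/_all_docs?limit=1000' + (lastId ?
      '&startkey=' + encodeURIComponent(JSON.stringify(lastId)) + '&skip=1' : '');
  getJSON(path, function (err, result) {
    if (err || result.rows.length === 0) {
      return console.log('bad docs: ' + JSON.stringify(badDocs));
    }
    var ids = result.rows.map(function (row) { return row.id; });
    var last = ids[ids.length - 1];
    fetchDocs(ids, function () { nextBatch(last); });
  });
}

nextBatch(null);

The point is that the _all_docs requests never ask for document contents; only the individual GETs do, and those are the ones that can kill couch.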
So what I am doing is getting a batch of 1000 doc ids, fetching each one, and when I hit a bad one that causes couchdb to die, I wait 10 seconds for couchdb to restart, note the bad doc, and move on to the next one.

I looked through the configuration settings, and I don't see an obvious way to tell CouchDB to abort if RAM exceeds some pre-set limit, nor do I see a way to tell couch to abort a request if it is taking too long.

My requests are plain GETs, as in

curl http://127.0.0.1/my%2fbroken%2fdb/00016ed321e51ef4b89db5f690c92c4367728b18f1298f179a3241f1de075bde

(I'm actually using node.js to get the docs, but the crash will also happen in curl or via a browser.)

Sometimes I will see a **really** long dump in the error logs, and then

 [{file,"couch_compress.erl"},{line,67}]},
 {couch_file,pread_term,2,[{file,"couch_file.erl"},{line,135}]},
 {couch_db,make_doc,5,[{file,"couch_db.erl"},{line,1264}]},
 {couch_db,open_doc_int,3,[{file,"couch_db.erl"},{line,1203}]},
 {couch_db,open_doc,3,[{file,"couch_db.erl"},{line,141}]},
 {couch_httpd_db,couch_doc_open,4,[{file,"couch_httpd_db.erl"},{line,802}]},
 {couch_httpd_db,db_doc_req,3,[{file,"couch_httpd_db.erl"},{line,498}]},
 {couch_httpd_db,do_db_req,2,[{file,"couch_httpd_db.erl"},{line,234}]}]

[Mon, 29 Apr 2013 05:53:51 GMT] [info] [<0.262.0>] 127.0.0.1 - - GET /my%2fbroken%2fdb/00017ea256817bb6c43474a8d8eb12d141425c3e7b51fe6ef05729b08c41b91a 500
[Mon, 29 Apr 2013 05:53:52 GMT] [error] [<0.262.0>] httpd 500 error response:
{"error":"unknown_error","reason":"function_clause"}

and couchdb does NOT crash. Most of the time, though, the logs just say

[Mon, 29 Apr 2013 05:54:43 GMT] [error] [<0.141.0>] function_clause error in HTTP request

and then it crashes, and the next entries in the log file are restarts of all the replications, etc., as per the usual start-up procedure.

When the docs work, I can process 10 to 20 every second. If I could temporarily tell couchdb to abort requests that take more than a second, that would do the trick, but I can't see how to do that. If I could ask couch how big a document is prior to processing it, I could skip the really big ones, but I can't see how to do that either.

I've thought about compacting the db under 1.3.0. This *might* fix the problem, but in the worst case I will wait around a day or so for the compaction to finish and then find that I still have a problem when couchdb tries to send a document out.

Any advice on hacking a solution or tweaking a parameter setting would be very much appreciated, as I have 12,970,464 documents to get and save (102.4 GB), and waiting a minute or so every time I hit a bad doc is just taking too long.

Finally, I *was* able to use my views to access data under 1.2.x. My main data view basically broke up each document into a ton of emit()s, and I only grabbed the view output, never the original docs. But when I upgraded to 1.3.0, all the views needed a rebuild. Other, similar DBs worked fine, but this one crashed CouchDB every time I tried.

Regards,
James Marca
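P.S. In case the view shape matters: the map function in that main data view is, in outline, something like this (purely illustrative -- the real field names and key structure are different):

function (doc) {
  // one emit per observation in the doc, so a big doc turns into
  // thousands of small view rows
  if (doc.data && doc.data.length) {
    doc.data.forEach(function (row) {
      emit([doc.site, row.ts], row);
    });
  }
}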