From: Jagadeesh
Reply-To: hadoop-user@lucene.apache.org
Subject: Urgent: Production Issues
Date: Mon, 18 Dec 2006 11:30:11 +0530
In-Reply-To: <4582F704.2080007@yahoo-inc.com>
Message-ID: <45862e17.2890e523.03cb.42e2@mx.google.com>

Hi All,

I am running Hadoop 0.7.2 in a production environment, and it currently stores ~170GB of data. The deployment architecture is described below.

I have 5 servers in total. Four of them run Hadoop nodes with 1.3TB of storage each; the fifth is the master node, which is not used for storage.

This setup worked without issues for the last 20-25 days. As mentioned above, total storage has now grown to ~170GB.

A couple of days ago, I noticed that Hadoop was no longer accepting new files: uploads always failed, although downloads still worked fine. The exception reported that writing the .crc file failed. When I tried restarting the service, I got the messages "jobtracker not available" and "tasktracker not available". I then had to kill all the processes on the master node as well as on the client nodes before I could restart the service.

After that, everything worked fine for one more day. Now I keep getting this message:

failure closing block of file /user/root/.LICENSE.txt2233331.crc to node node1:50010

Even if I restart the service, the message comes back after 10 minutes.

I read on the mailing list that this issue is resolved in 0.9.0, but I am a bit skeptical about moving to 0.9.0 because I don't know whether I will end up losing the files that are already stored.
Kindly confirm this, and I will move to 0.9.0. Please also tell me the steps or precautions I should take before moving to 0.9.0.

Thanks and Regards,
Jugs