Message-ID: <4F159185.80805@zfabrik.de>
Date: Tue, 17 Jan 2012 16:19:33 +0100
From: Henning Blohm
To: user@hbase.apache.org
Subject: ROOT region not assigned

Hi,

After an upgrade of hadoop and hbase (to 0.90.4-cdh3u2) from 0.90 hbase and 0.20-append hadoop on a single-node test installation, everything worked fine initially.

Then there were some DNS and host name changes, which resulted in a lot of "hostname cannot be resolved" problems in the logs, and the master web interface would only show a stack trace from a bad lookup ("hostname can't be null"). So I changed the host name back to its old name.

All configuration in hbase/hadoop points to localhost (i.e. in the *-site.xml, slaves, masters, regionservers). We are running distributed mode (but on one machine).

Now the HMaster process does come up again somewhat and I get the web interface, but it stays in

  Currently running tasks:
  Master Startup, Assigning ROOT region, 10050s

i.e. it never continues.

The hadoop data directory was not touched other than during the upgrade, which completed successfully, and data was available after the upgrade. Now, of course, no data is available anymore.

This is really scary, as we plan to do similar upgrades in production environments, and I would like to understand what could possibly screw things up so badly.

The HMaster log shows things like:

2012-01-17 12:48:39,712 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: -ROOT-,,0.70236052 state=OPEN, ts=1326707373329, server=application1,60020,1326707331441
2012-01-17 12:48:39,713 ERROR org.apache.hadoop.hbase.master.AssignmentManager: Region has been OPEN for too long, we don't know where region was opened so can't do anything

The region server log has a lot of these:

2012-01-17 13:03:36,774 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: NotServingRegionException; Region is not online: -ROOT-,,0

Help would be great!!

Thanks,
Henning