From: Ted Yu
To: hbase-dev@hadoop.apache.org
Date: Mon, 29 Mar 2010 14:45:49 -0700
Subject: Re: connection.isMasterRunning()
Message-ID: <17e273101003291445q42352f9fk3f32a4631f080cc9@mail.gmail.com>
In-Reply-To: <17e273101003291435w6b160921xeda7d457f3f7226@mail.gmail.com>

More background on why I brought up the issue.
Since we cannot cache HBaseAdmin, we call the following code in a dedicated thread:

    try {
      HBaseConfiguration _conf = new HBaseConfiguration();
      HBaseAdmin _admin = new HBaseAdmin(_conf);
      isConnected = true;
      return _admin;
    } catch (ConcurrentModificationException _e) {
      _ex = _e;
    }

After some time, I saw:

Exception in thread "HBase-Connector" java.lang.OutOfMemoryError: Java heap space
        at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.createChunk(DeferredDocumentImpl.java:1921)
        at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.ensureCapacity(DeferredDocumentImpl.java:1826)
        at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.createNode(DeferredDocumentImpl.java:1840)
        at com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl.createDeferredTextNode(DeferredDocumentImpl.java:534)
        at com.sun.org.apache.xerces.internal.parsers.AbstractDOMParser.characters(AbstractDOMParser.java:1189)
        at com.sun.org.apache.xerces.internal.xinclude.XIncludeHandler.characters(XIncludeHandler.java:1082)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:463)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:807)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:107)
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:225)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1078)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1029)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:979)
        at org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1015)
        at org.apache.hadoop.hbase.HBaseConfiguration.hashCode(HBaseConfiguration.java:63)
        at java.util.WeakHashMap.put(WeakHashMap.java:401)
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:213)
        at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:197)
        at org.apache.hadoop.hbase.HBaseConfiguration.<init>(HBaseConfiguration.java:33)
        at net.ks.datastore.HBaseDataStore.getHBaseAdmin(HBaseDataStore.java:159)
        at net.ks.datastore.HBaseDataStore.isConnected(HBaseDataStore.java:476)
        at net.ks.datastore.HBaseDataStore.access$100(HBaseDataStore.java:39)
        at net.ks.datastore.HBaseDataStore$1.run(HBaseDataStore.java:71)

On Mon, Mar 29, 2010 at 2:35 PM, Ted Yu wrote:

> https://issues.apache.org/jira/browse/HBASE-2391 has been filed.
>
> We call HBaseAdmin.isMasterRunning() to see if client has connection with
> HBase. If there is a working alternative, we'd love to use it.
>
> Thanks J-D.
>
>
> On Mon, Mar 29, 2010 at 2:25 PM, Jean-Daniel Cryans wrote:
>
>> I think this method wasn't updated when we moved to Zookeeper (since
>> in pre-0.20, dead master = dead cluster), also looking at when this is
>> called, I only see it from HMerge and HBaseAdmin.isMasterRunning()...
>> which in turn isn't called anywhere in the java code (I think we call
>> it in the shell tho).
>>
>> So we should first consider if this method is useful at all, and if so
>> then what would be the best fix. For example, if you run a HMerge
>> while the master is down but the region servers are up, you're going
>> to end up with something wrong since it expects hbase to be completely
>> down.
>>
>> Can you open a jira to continue discussions there?
>>
>> Thx!
>>
>> J-D
>>
>> On Mon, Mar 29, 2010 at 2:01 PM, Ted Yu wrote:
>> > Hi,
>> > While going over TableServers.isMasterRunning() in HConnectionManager, I
>> > see this:
>> >
>> >     public boolean isMasterRunning() {
>> >       if (this.master == null) {
>> >         try {
>> >           getMaster();
>> >
>> >         } catch (MasterNotRunningException e) {
>> >           return false;
>> >         }
>> >       }
>> >       return true;
>> >     }
>> >
>> > When isMasterRunning() is called the first time, if master is obtained
>> > successfully, master field would contain reference to HMasterInterface.
>> > Subsequent calls to isMasterRunning() wouldn't throw
>> > MasterNotRunningException even if master server stops responding to
>> > clients.
>> >
>> > I think master.isMasterRunning() should be called if master isn't null.
>> >
>> > Regards
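
To make the suggestion at the bottom of the thread concrete (call master.isMasterRunning() when the cached master isn't null), here is a minimal sketch. It is not the HBase source and not necessarily the fix adopted under HBASE-2391. MasterStub, MasterChecker, locator and MasterCheckDemo are hypothetical stand-ins for HMasterInterface, the connection class and getMaster(), so that the example runs without HBase on the classpath.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for HMasterInterface: only the one call we need. */
interface MasterStub {
    boolean isMasterRunning() throws Exception;
}

/** Hypothetical stand-in for the connection class quoted in the thread. */
class MasterChecker {
    private final Supplier<MasterStub> locator; // plays the role of getMaster()
    private MasterStub master;                  // cached reference, as in the original

    MasterChecker(Supplier<MasterStub> locator) {
        this.locator = locator;
    }

    synchronized boolean isMasterRunning() {
        try {
            if (master == null) {
                // First call, or the previous reference was dropped: look it up.
                master = locator.get();
            }
            // The proposed change: ask the master itself instead of
            // trusting that a non-null cached reference means "alive".
            if (master != null && master.isMasterRunning()) {
                return true;
            }
        } catch (Exception e) {
            // Lookup or remote call failed: treat as "not running".
        }
        master = null; // drop the stale reference so the next call looks it up again
        return false;
    }
}

public class MasterCheckDemo {
    public static void main(String[] args) {
        boolean[] up = {true};                  // simulated master state
        MasterStub stub = () -> up[0];
        MasterChecker checker = new MasterChecker(() -> stub);

        System.out.println(checker.isMasterRunning()); // prints true
        up[0] = false;                                 // master stops responding
        System.out.println(checker.isMasterRunning()); // prints false
    }
}
```

With the version quoted above, the second call would still return true because the master field is already set. Clearing the cached reference on failure is an extra assumption of this sketch, so that a restarted master can be picked up on a later call.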