Date: Fri, 13 Jul 2012 00:40:35 +0000 (UTC)
From: "Lars Hofhansl (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1247040617.45357.1342140035052.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <292939911.38893.1342048654671.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Commented] (HBASE-6375) Master may be using a stale list of region servers for creating assignment plan during startup

[ https://issues.apache.org/jira/browse/HBASE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413375#comment-13413375 ]

Lars Hofhansl commented on HBASE-6375:
--------------------------------------

Committed to 0.94 as well.

> Master may be using a stale list of region servers for creating assignment plan during startup
> ----------------------------------------------------------------------------------------------
>
> Key: HBASE-6375
> URL: https://issues.apache.org/jira/browse/HBASE-6375
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
> Environment: All
> Reporter: Aditya Kishore
> Assignee: Aditya Kishore
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6375_94.patch, HBASE-6375_trunk.patch
>
>
> While investigating an Out of Memory issue, I observed the master trying to assign all regions to a single region server even though 7 others had already registered with it.
> As the cluster had MSLAB enabled, this resulted in OOM on the RS when it tried to open all of them.
> *From master's log (edited for brevity):*
> {quote}
> 55,468 Waiting on regionserver(s) to checkin
> 56,968 Waiting on regionserver(s) to checkin
> 58,468 Waiting on regionserver(s) to checkin
> 59,968 Waiting on regionserver(s) to checkin
> 01,242 Registering server=srv109.datacenter,60020,1338673920529,regionCount=0,userLoad=false
> 01,469 Waiting on regionserver(s) count to settle; currently=1
> 02,969 Finished waiting for regionserver count to settle; count=1,sleptFor=46500
> 02,969 Exiting wait on regionserver(s) to checkin; count=1, stopped=false,count of regions out on cluster=0
> 03,010 Processing region \-ROOT\-,,0.70236052 in state M_ZK_REGION_OFFLINE
> 03,220 \-ROOT\- assigned=0, rit=true, location=srv109.datacenter:60020
> 03,221 Processing region .META.,,1.1028785192 in state M_ZK_REGION_OFFLINE
> 03,336 Detected completed assignment of META, notifying catalog tracker
> 03,350 .META. assigned=0, rit=true, location=srv109.datacenter:60020
> 03,350 Master startup proceeding: cluster startup
> 04,006 Registering server=srv111.datacenter,60020,1338673923399,regionCount=0,userLoad=false
> 04,012 Registering server=srv113.datacenter,60020,1338673923532,regionCount=0,userLoad=false
> 04,269 Registering server=srv115.datacenter,60020,1338673923471,regionCount=0,userLoad=false
> 04,363 Registering server=srv117.datacenter,60020,1338673923928,regionCount=0,userLoad=false
> 04,599 Registering server=srv127.datacenter,60020,1338673924067,regionCount=0,userLoad=false
> 04,606 Registering server=srv119.datacenter,60020,1338673923953,regionCount=0,userLoad=false
> 04,804 Registering server=srv129.datacenter,60020,1338673924339,regionCount=0,userLoad=false
> 05,126 Bulk assigning 1252 region(s) across 1 server(s), retainAssignment=true
> 05,546 hd109.datacenter,60020,1338673920529 unassigned znodes=207 of
> {quote}
> *A peek at the AssignmentManager code offers some explanation:*
> {code}
> public void assignAllUserRegions() throws IOException, InterruptedException {
>     // Get all available servers
>     List servers = serverManager.getOnlineServersList();
>     // Scan META for all user regions, skipping any disabled tables
>     Map allRegions =
>       MetaReader.fullScan(catalogTracker, this.zkTable.getDisabledTables(), true);
>     if (allRegions == null || allRegions.isEmpty()) return;
>     // Determine what type of assignment to do on startup
>     boolean retainAssignment =
>       master.getConfiguration().
>       getBoolean("hbase.master.startup.retainassign", true);
>     Map> bulkPlan = null;
>     if (retainAssignment) {
>       // Reuse existing assignment info
>       bulkPlan = LoadBalancer.retainAssignment(allRegions, servers);
>     } else {
>       // assign regions in round-robin fashion
>       bulkPlan = LoadBalancer.roundRobinAssignment(new ArrayList(allRegions.keySet()), servers);
>     }
>     LOG.info("Bulk assigning " + allRegions.size() + " region(s) across " +
>         servers.size() + " server(s), retainAssignment=" + retainAssignment);
> ...
> {code}
> In the function assignAllUserRegions(), listed above, AM fetches the server list from ServerManager long before it actually uses it to create the assignment plan.
> In between, it performs a full scan of META to build an assignment map of regions. So even if additional RSes have registered in the meantime (as happened in this case), AM still holds the old list of just one server.
> This code snippet is from 0.90.6, but the same issue exists in 0.92, 0.94 and trunk. Since MSLAB is enabled by default from 0.92 onwards, any large cluster can hit this issue at cluster start-up when the following sequence holds true:
> # The Master starts long before the RSes (by default, "long" here is about 4.5 seconds).
> # All the RSes start together, but one wins the race to register with the Master by a few seconds.
> I am attaching a patch for trunk which moves the code that fetches the RS list from the beginning of the function to where it is first used.
> Apart from this change, one other HBase setting that now becomes important is "hbase.master.wait.on.regionservers.mintostart", because MSLAB is enabled by default.
> Large clusters that keep MSLAB enabled must now raise "hbase.master.wait.on.regionservers.mintostart" from its default of 1 to a more suitable number, so that the master waits for a quorum of RSes sufficient to open all the regions among themselves.
> I'll create a separate JIRA for the documentation change.
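
The ordering problem described in the issue, and the effect of fetching the server list only at its first use, can be illustrated with a small self-contained sketch. This is not HBase code and not the attached patch: the Registry class, scanRegions() and both plan methods are hypothetical stand-ins, and the "scan" simply registers the late servers itself to imitate region servers checking in while META is being read. Server names and the region count are taken from the log above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of the startup assignment path; not HBase code.
class StaleServerListSketch {

    /** Toy stand-in for the master's view of registered region servers. */
    static class Registry {
        private final List<String> online = new ArrayList<>();

        void register(String server) {
            online.add(server);
        }

        // Returns a snapshot copy: later registrations are not visible through it.
        List<String> getOnlineServersList() {
            return new ArrayList<>(online);
        }
    }

    /** Imitates the slow META scan; the late servers check in while it runs. */
    static List<String> scanRegions(Registry registry, int regionCount, List<String> lateServers) {
        for (String server : lateServers) {
            registry.register(server);
        }
        List<String> regions = new ArrayList<>();
        for (int i = 0; i < regionCount; i++) {
            regions.add("region-" + i);
        }
        return regions;
    }

    /** Round-robin plan mapping each server to the regions it should open. */
    static Map<String, List<String>> roundRobin(List<String> regions, List<String> servers) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        if (servers.isEmpty()) {
            return plan; // nothing to assign to
        }
        for (String server : servers) {
            plan.put(server, new ArrayList<>());
        }
        for (int i = 0; i < regions.size(); i++) {
            plan.get(servers.get(i % servers.size())).add(regions.get(i));
        }
        return plan;
    }

    /** Problematic ordering: snapshot the servers, then scan, then plan. */
    static Map<String, List<String>> planWithEarlyFetch(Registry registry, int regionCount,
                                                        List<String> lateServers) {
        List<String> servers = registry.getOnlineServersList(); // taken before the scan
        List<String> regions = scanRegions(registry, regionCount, lateServers);
        return roundRobin(regions, servers);
    }

    /** Reordered: scan first, fetch the server list where it is first used. */
    static Map<String, List<String>> planWithLateFetch(Registry registry, int regionCount,
                                                       List<String> lateServers) {
        List<String> regions = scanRegions(registry, regionCount, lateServers);
        List<String> servers = registry.getOnlineServersList(); // taken after the scan
        return roundRobin(regions, servers);
    }

    public static void main(String[] args) {
        List<String> late = Arrays.asList(
            "srv111", "srv113", "srv115", "srv117", "srv127", "srv119", "srv129");

        Registry first = new Registry();
        first.register("srv109");
        Registry second = new Registry();
        second.register("srv109");

        System.out.println("early fetch: "
            + planWithEarlyFetch(first, 1252, late).size() + " server(s)");
        System.out.println("late fetch: "
            + planWithLateFetch(second, 1252, late).size() + " server(s)");
    }
}
```

In this toy model the early-fetch plan places all 1252 regions on srv109, matching the "across 1 server(s)" log line, while the late-fetch plan spreads them over all eight servers. Reordering only narrows the window; it does not help when the other servers register after the list has been read, which is why the issue also points at "hbase.master.wait.on.regionservers.mintostart".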