Subject: svn commit: r1241681 - in /hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase: client/HBaseFsck.java client/HBaseLocalityCheck.java util/FSUtils.java util/RegionPlacement.java
Date: Tue, 07 Feb 2012 22:51:40 -0000
To: commits@hbase.apache.org
From: mbautin@apache.org

Author: mbautin
Date: Tue Feb 7 22:51:40 2012
New Revision: 1241681

URL: http://svn.apache.org/viewvc?rev=1241681&view=rev
Log:
[master] LocalityCK II -- the fine grained locality information

Summary: In the
last version of localityck, the result was very coarse-grained: each region is
either on the local region server or not, so the region locality is either 1
or 0, and the table locality = (sum of each region locality) / total regions.

In the new generation, it returns a fine-grained locality result. The region
locality is defined as the percentage of the region's blocks that are local to
the region server currently hosting it. Assuming each region has roughly the
same number of blocks, the average table locality = (sum of each region
locality) / total regions. The table locality then reflects the data locality
in the cluster much more accurately.

Test Plan: Tested this tool on the titanmigrate002 cluster. The following
compares the output of the previous localityck with that of the new generation.

Localityck I
12/01/26 22:33:58 INFO client.HBaseLocalityCheck: ======== Locality Summary ===============
12/01/26 22:33:58 INFO client.HBaseLocalityCheck: For Table: test1 ; #Total Regions: 470 ; # Local Regions 458 rate = 97.44681 %
12/01/26 22:33:58 INFO client.HBaseLocalityCheck: For Table: MigrationStatus ; #Total Regions: 1024 ; # Local Regions 1002 rate = 97.85156 %
12/01/26 22:33:58 INFO client.HBaseLocalityCheck: For Table: .META. ; #Total Regions: 1 ; # Local Regions 0 rate = 0.0 %
12/01/26 22:33:58 INFO client.HBaseLocalityCheck: For Table: -ROOT- ; #Total Regions: 1 ; # Local Regions 0 rate = 0.0 %
12/01/26 22:33:58 INFO client.HBaseLocalityCheck: For Table: test2 ; #Total Regions: 465 ; # Local Regions 460 rate = 98.92473 %

Localityck II
12/01/26 22:32:57 INFO client.HBaseLocalityCheck: ======== Locality Summary ===============
12/01/26 22:32:57 INFO client.HBaseLocalityCheck: For Table: test1 ; #Total Regions: 470 ; The average locality is 73.989365 %
12/01/26 22:32:57 INFO client.HBaseLocalityCheck: For Table: MigrationStatus ; #Total Regions: 1024 ; The average locality is 80.80051 %
12/01/26 22:32:57 INFO client.HBaseLocalityCheck: For Table: .META. ; #Total Regions: 1 ; The average locality is 40.0 %
12/01/26 22:32:57 INFO client.HBaseLocalityCheck: For Table: -ROOT- ; #Total Regions: 1 ; The average locality is 0.0 %
12/01/26 22:32:57 INFO client.HBaseLocalityCheck: For Table: test2 ; #Total Regions: 465 ; The average locality is 96.61275 %

Reviewers: kannan, kranganathan
Reviewed By: kannan
CC: hbase-eng@lists
Differential Revision: https://phabricator.fb.com/D397713
Task ID: 900734

Modified:
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseFsck.java
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseLocalityCheck.java
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/RegionPlacement.java

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseFsck.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseFsck.java?rev=1241681&r1=1241680&r2=1241681&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseFsck.java (original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseFsck.java Tue Feb 7 22:51:40 2012
@@ -821,7 +821,7 @@ public class HBaseFsck {
   /**
    * Stores the entries scanned from META
    */
-  private static class MetaEntry extends HRegionInfo {
+  public static class MetaEntry extends HRegionInfo {
     HServerAddress regionServer; // server hosting this region
     long modTime; // timestamp of most recent modification metadata
 

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseLocalityCheck.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseLocalityCheck.java?rev=1241681&r1=1241680&r2=1241681&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseLocalityCheck.java (original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/client/HBaseLocalityCheck.java Tue Feb 7 22:51:40 2012
@@ -1,13 +1,9 @@
 package org.apache.hadoop.hbase.client;
 
-import java.io.DataOutputStream;
-import java.io.FileOutputStream;
 import java.io.IOException;
 import java.util.HashMap;
-import java.util.List;
 import java.util.Map;
 import java.util.TreeMap;
-import java.util.concurrent.atomic.AtomicInteger;
 
 import org.apache.commons.cli.CommandLine;
 import org.apache.commons.cli.GnuParser;
@@ -17,19 +13,15 @@ import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hbase.HBaseConfiguration;
-import org.apache.hadoop.hbase.HServerAddress;
 import org.apache.hadoop.hbase.MasterNotRunningException;
 import org.apache.hadoop.hbase.client.HBaseFsck.HbckInfo;
-import org.apache.hadoop.hbase.master.HMaster;
-import org.apache.hadoop.io.MapWritable;
-import org.apache.hadoop.io.Text;
-import org.apache.hadoop.io.Writable;
+import org.apache.hadoop.hbase.util.FSUtils;
 
 public class HBaseLocalityCheck {
   private static final Log LOG = LogFactory.getLog(HBaseLocalityCheck.class
       .getName());
 
-  private Map preferredRegionToRegionServerMapping = null;
+  private Map<String, Map<String, Float>> localityMap = null;
   private Configuration conf;
   /**
    * The table we want to get locality for, or null in case we wanted a check
@@ -61,82 +53,69 @@ public class HBaseLocalityCheck {
    * @throws IOException
    * @throws InterruptedException
    */
-  public void showTableLocality()
-  throws MasterNotRunningException, IOException, InterruptedException {
+  public void showTableLocality() throws MasterNotRunningException,
+      IOException, InterruptedException {
     // create a fsck object
     HBaseFsck fsck = new HBaseFsck(conf);
     fsck.initAndScanRootMeta();
     fsck.scanRegionServers();
     TreeMap regionInfo = fsck.getRegionInfo();
-    boolean localityMatch = false;
-    LOG.info("Locality information by region");
-    // Get the locality info for each region by scanning the file system
-    preferredRegionToRegionServerMapping = HMaster.reevaluateRegionLocality(conf,
-        tableName,
-        conf.getInt("hbase.client.localityCheck.threadPoolSize", 2));
-
-    Map tableToRegionCountMap =
-      new HashMap();
-    Map tableToRegionsWithLocalityMap =
-      new HashMap();
-
-    for (Map.Entry entry :
-      preferredRegionToRegionServerMapping.entrySet()) {
-      // get region name and table
-      String name = ((Text)entry.getKey()).toString();
-      int spliterIndex =name.lastIndexOf(":");
-      String regionName = name.substring(spliterIndex+1);
-      String tableName = name.substring(0, spliterIndex);
-
-      //get region server hostname
-      String bestHostName = ((Text)entry.getValue()).toString();
-      localityMatch = false;
-      HbckInfo region = regionInfo.get(regionName);
-      if (region != null && region.deployedOn != null &&
-          region.deployedOn.size() != 0) {
-        String realHostName = null;
-        List serverList = region.deployedOn;
-        if (!tableToRegionCountMap.containsKey(tableName)){
-          tableToRegionCountMap.put(tableName, new AtomicInteger(1));
-          tableToRegionsWithLocalityMap.put(tableName, new AtomicInteger(0));
-        } else {
-          tableToRegionCountMap.get(tableName).incrementAndGet();
-        }
+    localityMap = FSUtils.getRegionDegreeLocalityMappingFromFS(conf);
-        realHostName = serverList.get(0).getHostname();
-        if (realHostName.equalsIgnoreCase(bestHostName)) {
-          localityMatch = true;
-          tableToRegionsWithLocalityMap.get(tableName).incrementAndGet();
-        }
-
-        LOG.info(" : <" + name + "> is running on host: "
-            + realHostName + " \n and the prefered host is " + bestHostName +
-            " [" + (localityMatch ? "Matched]" : "NOT matched]"));
+    Map tableToRegionCntMap = new HashMap();
+    Map tableToLocalityMap = new HashMap();
+    int numUnknownRegion = 0;
+
+    for (Map.Entry entry : regionInfo.entrySet()) {
+      String regionEncodedName = entry.getKey();
+      Map localityInfo = localityMap.get(regionEncodedName);
+      HbckInfo hbckInfo = entry.getValue();
+      if (hbckInfo == null || hbckInfo.metaEntry == null
+          || localityInfo == null || hbckInfo.deployedOn == null
+          || hbckInfo.deployedOn.size() == 0) {
+        LOG.warn("<" + regionEncodedName + "> no info"
+            + " obtained for this region from any of the region servers.");
+        numUnknownRegion++;
+        continue;
+      }
+      String tableName = hbckInfo.metaEntry.getTableDesc().getNameAsString();
+      String realHostName = hbckInfo.deployedOn.get(0).getHostname();
+      Float localityPercentage = localityInfo.get(realHostName);
+      if (localityPercentage == null)
+        localityPercentage = new Float(0);
+
+      if (!tableToRegionCntMap.containsKey(tableName)) {
+        tableToRegionCntMap.put(tableName, new Integer(1));
+        tableToLocalityMap.put(tableName, localityPercentage);
       } else {
-        LOG.info(" : <" + name + "> no info obtained for this" +
-            " region from any of the region servers.");
-        continue;
+        tableToRegionCntMap.put(tableName,
+            (tableToRegionCntMap.get(tableName) + 1));
+        tableToLocalityMap.put(tableName,
+            (tableToLocalityMap.get(tableName) + localityPercentage));
       }
+      LOG.info("<" + tableName + " : " + regionEncodedName
+          + "> is running on host: " + realHostName + " \n "
+          + "and the locality is " + localityPercentage);
     }
     LOG.info("======== Locality Summary ===============");
-    for(String tableName : tableToRegionCountMap.keySet()) {
-      int totalRegions = tableToRegionCountMap.get(tableName).get();
-      int totalRegionsWithLocality =
-        tableToRegionsWithLocalityMap.get(tableName).get();
-
-      float rate = (totalRegionsWithLocality / (float) totalRegions) * 100;
-      LOG.info("For Table: "+tableName+" ; #Total Regions: " + totalRegions +
-          " ;" + " # Local Regions " + totalRegionsWithLocality + " rate = "
-          + rate + " %");
+    for (String tableName : tableToRegionCntMap.keySet()) {
+      int totalRegions = tableToRegionCntMap.get(tableName).intValue();
+      float totalRegionsLocality = tableToLocalityMap.get(tableName)
+          .floatValue();
+
+      float averageLocality = (totalRegionsLocality / (float) totalRegions);
+      LOG.info("For Table: " + tableName + " ; #Total Regions: " + totalRegions
+          + " ; The average locality is " + averageLocality * 100 + " %");
+    }
+    if (numUnknownRegion != 0) {
+      LOG.info("The number of unknow regions is " + numUnknownRegion);
     }
-  }
-
   public static void main(String[] args)
   throws IOException, InterruptedException {
     long startTime = System.currentTimeMillis();

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java?rev=1241681&r1=1241680&r2=1241681&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java (original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java Tue Feb 7 22:51:40 2012
@@ -796,14 +796,13 @@ public class FSUtils {
    * @throws IOException
    *           in case of file system errors or interrupts
    */
-  public static MapWritable getRegionLocalityMappingFromFS(
-      final FileSystem fs, final Path rootPath, int threadPoolSize,
-      final Configuration conf, final String desiredTable)
-  throws IOException {
+  public static MapWritable getRegionLocalityMappingFromFS(final FileSystem fs,
+      final Path rootPath, int threadPoolSize, final Configuration conf,
+      final String desiredTable) throws IOException {
     // region name to its best locality region server mapping
     MapWritable regionToBestLocalityRSMapping = new MapWritable();
-    getRegionLocalityMappingFromFS(fs, rootPath, threadPoolSize, conf,
-        desiredTable, regionToBestLocalityRSMapping, null);
+    getRegionLocalityMappingFromFS(conf, desiredTable, threadPoolSize,
+        regionToBestLocalityRSMapping, null);
     return regionToBestLocalityRSMapping;
   }
 
@@ -812,12 +811,6 @@ public class FSUtils {
    * degree of locality for each region on each of the servers having at least
    * one block of that region.
    *
-   * @param fs
-   *          the file system to use
-   * @param rootPath
-   *          the root path to start from
-   * @param threadPoolSize
-   *          the thread pool size to use
    * @param conf
    *          the configuration to use
    * @return the mapping from region encoded name to a map of server names to
@@ -826,10 +819,11 @@ public class FSUtils {
    *           in case of file system errors or interrupts
    */
   public static Map<String, Map<String, Float>> getRegionDegreeLocalityMappingFromFS(
-      final FileSystem fs, final Path rootPath, int threadPoolSize,
       final Configuration conf) throws IOException {
-    return getRegionDegreeLocalityMappingFromFS(fs, rootPath, threadPoolSize,
-        conf, null);
+    return getRegionDegreeLocalityMappingFromFS(
+        conf, null,
+        conf.getInt("hbase.client.localityCheck.threadPoolSize", 2));
+
   }
 
   /**
@@ -837,28 +831,24 @@ public class FSUtils {
    * degree of locality for each region on each of the servers having at least
    * one block of that region.
    *
-   * @param fs
-   *          the file system to use
-   * @param rootPath
-   *          the root path to start from
-   * @param threadPoolSize
-   *          the thread pool size to use
    * @param conf
    *          the configuration to use
    * @param desiredTable
    *          the table you wish to scan locality for
+   * @param threadPoolSize
+   *          the thread pool size to use
    * @return the mapping from region encoded name to a map of server names to
    *           locality fraction
    * @throws IOException
    *           in case of file system errors or interrupts
    */
   public static Map<String, Map<String, Float>> getRegionDegreeLocalityMappingFromFS(
-      final FileSystem fs, final Path rootPath, int threadPoolSize,
-      final Configuration conf, final String desiredTable) throws IOException {
+      final Configuration conf, final String desiredTable, int threadPoolSize)
+      throws IOException {
     Map<String, Map<String, Float>> regionDegreeLocalityMapping =
         new ConcurrentHashMap<String, Map<String, Float>>();
-    getRegionLocalityMappingFromFS(fs, rootPath, threadPoolSize, conf,
-        desiredTable, null, regionDegreeLocalityMapping);
+    getRegionLocalityMappingFromFS(conf, desiredTable, threadPoolSize, null,
+        regionDegreeLocalityMapping);
     return regionDegreeLocalityMapping;
   }
 
@@ -868,10 +858,6 @@ public class FSUtils {
    * degree of locality of each region on each of the servers having at least
    * one block of that region. The output map parameters are both optional.
    *
-   * @param fs
-   *          the file system to use
-   * @param rootPath
-   *          the root path to start from
    * @param threadPoolSize
    *          the thread pool size to use
    * @param conf
@@ -887,11 +873,13 @@ public class FSUtils {
    *           in case of file system errors or interrupts
    */
   private static void getRegionLocalityMappingFromFS(
-      final FileSystem fs, final Path rootPath, int threadPoolSize,
       final Configuration conf, final String desiredTable,
+      int threadPoolSize,
       MapWritable regionToBestLocalityRSMapping,
       Map<String, Map<String, Float>> regionDegreeLocalityMapping)
       throws IOException {
+    FileSystem fs = FileSystem.get(conf);
+    Path rootPath = FSUtils.getRootDir(conf);
     long startTime = System.currentTimeMillis();
     Path queryPath;
     if (null == desiredTable) {

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/RegionPlacement.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/RegionPlacement.java?rev=1241681&r1=1241680&r2=1241681&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/RegionPlacement.java (original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/util/RegionPlacement.java Tue Feb 7 22:51:40 2012
@@ -17,7 +17,6 @@ import org.apache.commons.cli.ParseExcep
 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;
 import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.HConstants;
 import org.apache.hadoop.hbase.HRegionInfo;
@@ -30,9 +29,6 @@ import org.apache.hadoop.hbase.client.Me
 import org.apache.hadoop.hbase.client.MetaScanner.MetaScannerVisitor;
 import org.apache.hadoop.hbase.client.Put;
 import org.apache.hadoop.hbase.client.Result;
-import org.apache.hadoop.hbase.util.FSUtils;
-import org.apache.hadoop.hbase.util.MunkresAssignment;
-import org.apache.hadoop.hbase.util.Writables;
 import org.apache.hadoop.net.DNSToSwitchMapping;
 import org.apache.hadoop.net.IPv4AddressTruncationMapping;
 
@@ -60,8 +56,7 @@ public class RegionPlacement {
   private Map rackCache;
   private final boolean enforceRackPolicy;
 
-  public RegionPlacement(Configuration conf, boolean enforceRackPolicy)
-  throws IOException {
+  public RegionPlacement(Configuration conf, boolean enforceRackPolicy) {
     this.conf = conf;
     this.switchMapping = new IPv4AddressTruncationMapping();
     this.rackCache = new HashMap();
@@ -108,8 +103,7 @@ public class RegionPlacement {
 
     // Get the locality for each region to each server.
     Map<String, Map<String, Float>> localityMap =
-      FSUtils.getRegionDegreeLocalityMappingFromFS(FileSystem.get(conf),
-          FSUtils.getRootDir(conf), 2, conf);
+      FSUtils.getRegionDegreeLocalityMappingFromFS(conf);
 
     // Transform the locality mapping into a 2D array, assuming that any
     // unspecified locality value is 0.
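
As a standalone illustration of the averaging described in the Summary (table locality = sum of each region locality / total regions), here is a minimal sketch. It is not code from this commit: the class name `TableLocalitySketch`, the method `averageLocality`, and the sample regions, hosts, and fractions are all hypothetical, and it uses only `java.util`. It follows the same rules as the new loop in `showTableLocality()`: a region with no locality info is skipped as unknown, and a hosting server with no entry for the region counts as 0.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the HBase code base.
public class TableLocalitySketch {

  /**
   * Averages per-region locality fractions (0.0 to 1.0) for one table.
   *
   * @param regionToHost region name -> server currently hosting the region
   * @param localityMap  region name -> (server name -> fraction of local blocks)
   * @return the average fraction over regions that have locality info,
   *         or 0 if no region has any
   */
  public static float averageLocality(Map<String, String> regionToHost,
      Map<String, Map<String, Float>> localityMap) {
    float sum = 0f;
    int counted = 0;
    for (Map.Entry<String, String> e : regionToHost.entrySet()) {
      Map<String, Float> perServer = localityMap.get(e.getKey());
      if (perServer == null) {
        continue; // unknown region: skipped, not counted
      }
      Float fraction = perServer.get(e.getValue());
      sum += (fraction == null) ? 0f : fraction; // host holds no blocks: 0
      counted++;
    }
    return counted == 0 ? 0f : sum / counted;
  }

  public static void main(String[] args) {
    // Made-up sample data: three regions of one table.
    Map<String, String> regionToHost = new HashMap<>();
    regionToHost.put("r1", "hostA");
    regionToHost.put("r2", "hostB");
    regionToHost.put("r3", "hostA");

    Map<String, Map<String, Float>> localityMap = new HashMap<>();
    Map<String, Float> r1 = new HashMap<>();
    r1.put("hostA", 1.0f);
    r1.put("hostB", 0.5f);
    localityMap.put("r1", r1);
    Map<String, Float> r2 = new HashMap<>();
    r2.put("hostA", 0.75f); // r2 runs on hostB, which holds none of its blocks
    localityMap.put("r2", r2);
    Map<String, Float> r3 = new HashMap<>();
    r3.put("hostA", 0.5f);
    localityMap.put("r3", r3);

    float average = averageLocality(regionToHost, localityMap);
    System.out.println("The average locality is " + average * 100 + " %");
  }
}
```

Under the coarse-grained scheme each of these regions would contribute either 0 or 1; here each contributes the fraction of its blocks on its hosting server, which is why the Localityck II numbers in the Test Plan are lower than the Localityck I numbers for the same tables.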