hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8778) Region assigments scan table directory making them slow for huge tables
Date Tue, 30 Jul 2013 22:35:49 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724528#comment-13724528 ]

Ted Yu commented on HBASE-8778:
-------------------------------

Patch v3 had only one hunk that needs to be resolved.
I noticed the following in the test output of TestMetaMigrationConvertingToPB, which was responsible
for the IllegalArgumentException mentioned above:
{code}
2013-07-30 15:27:59,630 DEBUG [RpcServer.handler=0,port=51198] util.FSTableDescriptors(189):
Exception during readTableDecriptor. Current table name = TestTable
org.apache.hadoop.hbase.TableInfoMissingException: No table descriptor file under hdfs://localhost:51139/user/tyu/hbase/TestTable
	at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorAndModtime(FSTableDescriptors.java:506)
	at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorAndModtime(FSTableDescriptors.java:499)
	at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:184)
	at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:144)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:3450)
	at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:14390)
{code}
This is due to FSTableDescriptors#getTableInfoPath() checking only the new tableinfo dir,
which didn't exist in the tarball used by the test:
{code}
  private static FileStatus getTableInfoPath(FileSystem fs, Path tableDir, boolean removeOldFiles)
  throws IOException {
    Path tableInfoDir = new Path(tableDir, TABLEINFO_DIR);
    return getCurrentTableInfoStatus(fs, tableInfoDir, removeOldFiles);
  }
{code}
                
> Region assigments scan table directory making them slow for huge tables
> -----------------------------------------------------------------------
>
>                 Key: HBASE-8778
>                 URL: https://issues.apache.org/jira/browse/HBASE-8778
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Dave Latham
>            Assignee: Dave Latham
>            Priority: Critical
>             Fix For: 0.98.0, 0.95.2
>
>         Attachments: 8778-dirmodtime.txt, HBASE-8778-0.94.5.patch, HBASE-8778-0.94.5-v2.patch,
HBASE-8778.patch, HBASE-8778-v2.patch, HBASE-8778-v3.patch, HBASE-8778-v4.patch
>
>
> On a table with 130k regions it takes about 3 seconds for a region server to open a region
once it has been assigned.
> Watching the threads for a region server running 0.94.5 that is opening many such regions
shows the thread opening the region in code like this:
> {noformat}
> "PRI IPC Server handler 4 on 60020" daemon prio=10 tid=0x00002aaac07e9000 nid=0x6566
runnable [0x000000004c46d000]
>    java.lang.Thread.State: RUNNABLE
>         at java.lang.String.indexOf(String.java:1521)
>         at java.net.URI$Parser.scan(URI.java:2912)
>         at java.net.URI$Parser.parse(URI.java:3004)
>         at java.net.URI.<init>(URI.java:736)
>         at org.apache.hadoop.fs.Path.initialize(Path.java:145)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:126)
>         at org.apache.hadoop.fs.Path.<init>(Path.java:50)
>         at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:215)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:252)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:311)
>         at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:159)
>         at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:842)
>         at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:867)
>         at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1168)
>         at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:269)
>         at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:255)
>         at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoModtime(FSTableDescriptors.java:368)
>         at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:155)
>         at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2834)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2807)
>         at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> {noformat}
> To open the region, the region server first loads the latest HTableDescriptor.  Since
HBASE-4553, HTableDescriptors are stored in the file system at "/hbase/<tableDir>/.tableinfo.<sequenceNum>".
 The file with the largest sequenceNum is the current descriptor.  This is done so that the
current descriptor is updated atomically.  However, since the filename is not known in advance,
FSTableDescriptors has to do a FileSystem.listStatus operation, which has to list all files
in the directory to find it.  The directory also contains all the region directories, so in
our case it has to load 130k FileStatus objects.  Even using a globStatus matching function
still transfers all the objects to the client before performing the pattern matching.  Furthermore,
HDFS uses a default of transferring 1000 directory entries in each RPC call, so it requires
130 roundtrips to the namenode to fetch all the directory entries.
> Consequently, reassigning all the regions of a table (or a constant fraction thereof)
requires time proportional to the square of the number of regions.
> In our case, if a region server fails with 200 such regions, it takes 10+ minutes for
them all to be reassigned, after the zk expiration and log splitting.
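
The lookup cost described above can be illustrated with a small standalone sketch. It is not the HBase implementation: the class TableInfoSketch and its methods are hypothetical names, and it works on a plain list of file names rather than on Hadoop FileStatus objects. It only shows why finding ".tableinfo.<sequenceNum>" with the largest sequenceNum means inspecting every entry of the table directory, and how the number of listing round trips follows from the entry count.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; not part of HBase or Hadoop.
final class TableInfoSketch {
    // File name prefix taken from the issue description: ".tableinfo.<sequenceNum>".
    static final String PREFIX = ".tableinfo.";

    // Returns the descriptor file name with the largest numeric suffix, if any.
    // Every name in the listing is inspected, so the cost grows with the
    // number of region directories that share the table directory.
    static Optional<String> currentTableInfo(List<String> names) {
        String best = null;
        long bestSeq = -1;
        for (String name : names) {
            if (!name.startsWith(PREFIX)) {
                continue; // region directories and other entries are skipped
            }
            String suffix = name.substring(PREFIX.length());
            if (suffix.isEmpty() || !suffix.chars().allMatch(Character::isDigit)) {
                continue; // not a sequence number
            }
            long seq = Long.parseLong(suffix);
            if (seq > bestSeq) {
                bestSeq = seq;
                best = name;
            }
        }
        return Optional.ofNullable(best);
    }

    // Number of listing RPCs needed when each call returns at most
    // entriesPerRpc directory entries (ceiling division).
    static long roundTrips(long entries, long entriesPerRpc) {
        return (entries + entriesPerRpc - 1) / entriesPerRpc;
    }
}
```

With the figures from the description, roundTrips(130000, 1000) gives 130 round trips per region open, so opening 200 such regions implies about 200 * 130 = 26,000 listing round trips and 26,000,000 directory entries transferred.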

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
