impala-reviews mailing list archives

From "Mostafa Mokhtar (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
Date Sun, 27 Nov 2016 05:32:44 GMT
Mostafa Mokhtar has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
......................................................................


Patch Set 4:

I just tried out the latest patch, and metadata loading is 5.4x faster.

With the patch, metadata loading for 80 partitions with 250K files finished in 27 seconds, compared
to 146 seconds without it.

Most of the CPU time is spent in the RemoteIterator (hasNext() accounts for about 53% of the
samples). To further speed up metadata loading, I recommend using a thread pool.
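A minimal sketch of the thread pool idea, assuming each partition directory can be listed independently of the others. The class ParallelPartitionLoader, the method loadAll and the lister function are hypothetical names used only for illustration; the lister is a stand-in for the per-directory HDFS listing and is not the code in this patch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper: lists partition directories concurrently on a fixed-size pool.
final class ParallelPartitionLoader {

  // 'lister' is an assumed stand-in for the per-directory listing call
  // (in the real catalog this would walk a RemoteIterator over HDFS).
  static Map<String, List<String>> loadAll(
      List<String> partitionDirs, Function<String, List<String>> lister, int numThreads) {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      // Submit one listing task per partition directory, keeping input order.
      Map<String, Future<List<String>>> pending = new LinkedHashMap<>();
      for (String dir : partitionDirs) {
        Callable<List<String>> task = () -> lister.apply(dir);
        pending.put(dir, pool.submit(task));
      }
      // Collect results; get() blocks until the corresponding task has finished.
      Map<String, List<String>> result = new LinkedHashMap<>();
      for (Map.Entry<String, Future<List<String>>> e : pending.entrySet()) {
        result.put(e.getKey(), e.getValue().get());
      }
      return result;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while loading partitions", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("partition listing failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> dirs = Arrays.asList("/tbl/p=1", "/tbl/p=2", "/tbl/p=3");
    // Fake lister that returns two file names per directory.
    Map<String, List<String>> files =
        loadAll(dirs, dir -> Arrays.asList(dir + "/f0", dir + "/f1"), 2);
    files.forEach((dir, names) -> System.out.println(dir + " -> " + names));
  }
}
```

Whether this helps in practice depends on how many concurrent listing calls the NameNode can serve, so the pool size would need to be tuned and measured rather than assumed.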

Stack Trace	Sample Count	Percentage(%)
org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table)	509	74.307
   org.apache.impala.catalog.HdfsTable.load(boolean, IMetaStoreClient, Table, boolean, boolean, Set)	509	74.307
      org.apache.impala.catalog.HdfsTable.loadAllPartitions(List, Table)	507	74.015
         org.apache.impala.catalog.HdfsTable.loadMetadataAndDiskIds(FileSystem, List, HashMap)	497	72.555
            org.apache.impala.catalog.HdfsTable.loadBlockMetadata(FileSystem, Path, HashMap, Map)	472	68.905
               org.apache.hadoop.fs.FileSystem$5.hasNext()	365	53.285
                  org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNext()	339	49.489
                     org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.hasNextNoFilter()	258	37.664
                        org.apache.hadoop.hdfs.DFSClient.listPaths(String, byte[], boolean)	258	37.664
                           com.sun.proxy.$Proxy21.getListing(String, byte[], boolean)	258	37.664
                              org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Object, Method, Object[])	258	37.664
                                 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Method, Object[])	258	37.664
                                    java.lang.reflect.Method.invoke(Object, Object[])	258	37.664
                     org.apache.hadoop.hdfs.protocol.HdfsLocatedFileStatus.makeQualifiedLocated(URI, Path)	81	11.825

-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-HasComments: No
