impala-reviews mailing list archives

From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4789: Fix slow metadata loading due to inconsistent paths.
Date Thu, 19 Jan 2017 16:03:29 GMT
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5743

Change subject: IMPALA-4789: Fix slow metadata loading due to inconsistent paths.
......................................................................

IMPALA-4789: Fix slow metadata loading due to inconsistent paths.

The fix for IMPALA-4172/IMPALA-3653 introduced a performance
regression for loading tables that have many partitions with:
1. inconsistent HDFS path qualification or
2. a custom location (not under the table root dir)

For the first issue, consider a table whose root path is at
'hdfs://localhost:8020/warehouse/tbl/'.
A partition with an unqualified location '/warehouse/tbl/p=1'
will not be recognized as being a descendant of the table root
dir by FileSystemUtil.isDescendentPath() because of how
Path.equals() behaves, even if 'hdfs://localhost:8020' is the
default filesystem.
Such partitions are incorrectly classified as having a custom
location and are therefore treated specially, and that special
treatment is very inefficient.
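
The qualification mismatch described above can be illustrated with a
minimal sketch. It uses java.net.URI from the JDK as a stand-in for
Hadoop's Path so that it runs without Hadoop on the classpath. The class
name and both helpers (qualify, isDescendant) are hypothetical and are
not the code in this change. The sketch only assumes that a descendant
check comparing scheme and authority rejects an unqualified location
until it is qualified against the default filesystem.

```java
import java.net.URI;
import java.util.Objects;

// Illustration only: java.net.URI stands in for org.apache.hadoop.fs.Path.
final class PathQualificationSketch {

    // Hypothetical helper: fill in a missing scheme and authority from the
    // default filesystem URI, leaving already-qualified locations unchanged.
    static URI qualify(URI defaultFs, String location) {
        URI loc = URI.create(location);
        if (loc.getScheme() != null) {
            return loc.normalize();
        }
        return defaultFs.resolve(loc).normalize();
    }

    // Hypothetical descendant check: scheme and authority must match, and the
    // candidate path must lie strictly below the root path.
    static boolean isDescendant(URI root, URI candidate) {
        if (!Objects.equals(root.getScheme(), candidate.getScheme())) {
            return false;
        }
        if (!Objects.equals(root.getAuthority(), candidate.getAuthority())) {
            return false;
        }
        String rootPath = root.getPath();
        if (!rootPath.endsWith("/")) {
            rootPath = rootPath + "/";
        }
        String candidatePath = candidate.getPath();
        return candidatePath.startsWith(rootPath)
            && candidatePath.length() > rootPath.length();
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://localhost:8020");
        URI tableRoot = URI.create("hdfs://localhost:8020/warehouse/tbl/");
        String unqualified = "/warehouse/tbl/p=1";

        // Compared as-is, the unqualified location has no scheme or authority,
        // so it is not recognized as being under the table root.
        System.out.println(isDescendant(tableRoot, URI.create(unqualified)));

        // Qualified against the default filesystem first, it is recognized.
        System.out.println(isDescendant(tableRoot, qualify(defaultFs, unqualified)));
    }
}
```

In this sketch, a partition stored somewhere else entirely (for example
a made-up location such as '/other/p=2') is still reported as not being
a descendant after qualification, which is the case that genuinely
counts as a custom location.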

This patch fixes the detection of partitions with custom
locations, and improves the speed of loading partitions
with custom locations.

Change-Id: I8c881b7cb155032b82fba0e29350ca31de388d55
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
2 files changed, 35 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/5743/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5743
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8c881b7cb155032b82fba0e29350ca31de388d55
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.behm@cloudera.com>
