impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bikramjeet Vig (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-1683: Allow REFRESH on a single partition
Date Thu, 28 Jul 2016 01:45:19 GMT
Hello David Knupp, Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3385

to look at the new patch set (#10).

Change subject: IMPALA-1683: Allow REFRESH on a single partition
......................................................................

IMPALA-1683: Allow REFRESH on a single partition

Currently the only way to refresh metadata for a partition was to refresh
the whole table. This is a relatively time consuming process especially if
there are many partitions and only one is to be refreshed.
This patch allows the client to REFRESH on a single partition by using the
following syntax:
REFRESH [database_name.]table_name PARTITION (partition_spec)

Testing:
Added parsing and authorization tests in ParserTest.java and
AuthorizationTest.java respectively. A new test file
"test_refresh_partition.py" was added for testing functionality.

Performance:
For a table with 10000 partitions and 1 file per partition

                     execResetMetadata()       Total Execution Time

Refresh Table              3795 ms                   4630 ms

Refersh Partition            42 ms                    680 ms

We see that the time to refresh improves by a factor of 90x but due to
significant overhead of about 640ms in this case the effective improvement
is about 7x. As the size of the table and number of partitions increase.
This improvement would be more significant.

Change-Id: I4bf5dd630f2e175be5800187bbf9337c91f82bd8
---
M common/thrift/CatalogService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/com/cloudera/impala/analysis/AnalysisContext.java
M fe/src/main/java/com/cloudera/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
M fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
M fe/src/test/java/com/cloudera/impala/analysis/AnalyzerTest.java
M fe/src/test/java/com/cloudera/impala/analysis/AuthorizationTest.java
M fe/src/test/java/com/cloudera/impala/analysis/ParserTest.java
A tests/metadata/test_refresh_partition.py
11 files changed, 405 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/85/3385/10
-- 
To view, visit http://gerrit.cloudera.org:8080/3385
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4bf5dd630f2e175be5800187bbf9337c91f82bd8
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet.vig@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Michael Brown <mikeb@cloudera.com>

Mime
View raw message