From: "ASF GitHub Bot (Jira)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Date: Thu, 10 Oct 2019 22:33:00 +0000 (UTC)
Subject: [jira] [Work logged] (HIVE-21344) CBO: Reduce compilation time in presence of materialized views

     [ https://issues.apache.org/jira/browse/HIVE-21344?focusedWorklogId=326610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326610 ]

ASF GitHub Bot logged work on HIVE-21344:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 10/Oct/19 22:32
            Start Date: 10/Oct/19 22:32
    Worklog Time Spent: 10m
      Work Description: jcamachor commented on pull request #749: HIVE-21344
URL: https://github.com/apache/hive/pull/749#discussion_r333763093

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##########
@@ -1612,6 +1612,150 @@ public Table apply(org.apache.hadoop.hive.metastore.api.Table table) {
     }
   }
 
+  /**
+   * Get the materialized views that have been enabled for rewriting from the
+   * cache (registry). It will preprocess them to discard those that are
+   * outdated and augment those that need to be augmented, e.g., if incremental
+   * rewriting is enabled.
+   *
+   * @return the list of materialized views available for rewriting from the registry
+   * @throws HiveException
+   */
+  public List<RelOptMaterialization> getPreprocessedMaterializedViewsFromRegistry(
+      List<String> tablesUsed, HiveTxnManager txnMgr) throws HiveException {
+    // From cache
+    List<RelOptMaterialization> materializedViews =
+        HiveMaterializedViewsRegistry.get().getRewritingMaterializedViews();
+    if (materializedViews.isEmpty()) {
+      // Bail out: empty list
+      return new ArrayList<>();
+    }
+    // Add to final result
+    return filterAugmentMaterializedViews(materializedViews, tablesUsed, txnMgr);
+  }
+
+  private List<RelOptMaterialization> filterAugmentMaterializedViews(List<RelOptMaterialization> materializedViews,
+        List<String> tablesUsed, HiveTxnManager txnMgr) throws HiveException {
+    final String validTxnsList = conf.get(ValidTxnList.VALID_TXNS_KEY);
+    final ValidTxnWriteIdList currentTxnWriteIds = txnMgr.getValidWriteIds(tablesUsed, validTxnsList);
+    final boolean tryIncrementalRewriting =
+        HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_MATERIALIZED_VIEW_REWRITING_INCREMENTAL);
+    final long defaultTimeWindow =
+        HiveConf.getTimeVar(conf, HiveConf.ConfVars.HIVE_MATERIALIZED_VIEW_REWRITING_TIME_WINDOW,
+            TimeUnit.MILLISECONDS);
+    try {
+      // Final result
+      List<RelOptMaterialization> result = new ArrayList<>();
+      for (RelOptMaterialization materialization : materializedViews) {
+        final RelNode viewScan = materialization.tableRel;
+        final Table materializedViewTable;
+        if (viewScan instanceof Project) {
+          // There is a Project on top (due to nullability)
+          materializedViewTable = ((RelOptHiveTable) viewScan.getInput(0).getTable()).getHiveTableMD();
+        } else {
+          materializedViewTable = ((RelOptHiveTable) viewScan.getTable()).getHiveTableMD();
+        }
+        final Boolean outdated = isOutdatedMaterializedView(materializedViewTable, currentTxnWriteIds,
+            defaultTimeWindow, tablesUsed, false);
+        if (outdated == null) {
+          continue;
+        }
+
+        final CreationMetadata creationMetadata = materializedViewTable.getCreationMetadata();
+        if (outdated) {
+          // The MV is outdated, see whether we should consider it for rewriting or not
+          if (!tryIncrementalRewriting) {
+            LOG.debug("Materialized view " + materializedViewTable.getFullyQualifiedName() +
+                " ignored for rewriting as its contents are outdated");
+            continue;
+          }
+          // We will rewrite it to include the filters on transaction list
+          // so we can produce partial rewritings.
+          // This would be costly since we are doing it for every materialized view
+          // that is outdated, but it only happens for more than one materialized view
+          // if rewriting with outdated materialized views is enabled (currently
+          // disabled by default).
+          materialization = augmentMaterializationWithTimeInformation(
+              materialization, validTxnsList, new ValidTxnWriteIdList(
+                  creationMetadata.getValidTxnList()));
+        }
+        result.add(materialization);
+      }
+      return result;
+    } catch (Exception e) {
+      throw new HiveException(e);
+    }
+  }
+
+  /**
+   * Validate that the materialized views retrieved from registry are still up-to-date.
+   * For those that are not, the method loads them from the metastore into the registry.
+   *
+   * @return true if they are up-to-date, otherwise false
+   * @throws HiveException
+   */
+  public boolean validateMaterializedViewsFromRegistry(List<Table> cachedMaterializedViewTables,
+      List<String> tablesUsed, HiveTxnManager txnMgr) throws HiveException {
+    final long defaultTimeWindow =
+        HiveConf.getTimeVar(conf, HiveConf.ConfVars.HIVE_MATERIALIZED_VIEW_REWRITING_TIME_WINDOW,
+            TimeUnit.MILLISECONDS);
+    final String validTxnsList = conf.get(ValidTxnList.VALID_TXNS_KEY);
+    final ValidTxnWriteIdList currentTxnWriteIds = txnMgr.getValidWriteIds(tablesUsed, validTxnsList);
+    try {
+      // Final result
+      boolean result = true;
+      for (Table cachedMaterializedViewTable : cachedMaterializedViewTables) {
+        // Retrieve the materialized view table from the metastore
+        final Table materializedViewTable = getTable(
+            cachedMaterializedViewTable.getDbName(), cachedMaterializedViewTable.getTableName());
+        if (materializedViewTable == null || !materializedViewTable.isRewriteEnabled()) {
+          // This could happen if materialized view has been deleted or rewriting has been disabled.
+          // We remove it from the registry and set result to false.
+          HiveMaterializedViewsRegistry.get().dropMaterializedView(cachedMaterializedViewTable);
+          result = false;

Review comment:
   We are not done. We do try to validate all materialized views that we were going to use and we reload from metastore those that are not up-to-date, hence we need to iterate through all that were introduced by the rewriting. However, if any of them is not up-to-date, we return `false` to cancel the rewriting.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 326610)
    Time Spent: 0.5h  (was: 20m)

> CBO: Reduce compilation time in presence of materialized views
> --------------------------------------------------------------
>
>                 Key: HIVE-21344
>                 URL: https://issues.apache.org/jira/browse/HIVE-21344
>             Project: Hive
>          Issue Type: Bug
>          Components: Materialized views
>    Affects Versions: 4.0.0
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21344.01.patch, HIVE-21344.02.patch, HIVE-21344.03.patch, HIVE-21344.04.patch, HIVE-21344.patch, calcite-planner-after-fix.svg.zip, mv-get-from-remote.png
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> For every query, {{getAllValidMaterializedViews}} still requires a call to metastore to verify that the materializations exist, whether they are outdated or not, etc. Since this is only useful for active-active HS2 deployments, we could take a less aggressive approach and check this information only after rewriting has been triggered. In addition, we could refresh the information in the HS2 registry periodically in a background thread.
> {code}
>         // This is not a rebuild, we retrieve all the materializations. In turn, we do not need
>         // to force the materialization contents to be up-to-date, as this is not a rebuild, and
>         // we apply the user parameters (HIVE_MATERIALIZED_VIEW_REWRITING_TIME_WINDOW) instead.
>         materializations = db.getAllValidMaterializedViews(getTablesUsed(basePlan), false, getTxnMgr());
> {code}
> !mv-get-from-remote.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
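
The review comment on `validateMaterializedViewsFromRegistry` describes a loop that must not stop at the first invalid entry: every materialized view used by the rewriting is checked, stale ones are reloaded, and the method returns `false` if any of them needed fixing. A minimal standalone sketch of that pattern follows. It does not use Hive classes; `ViewState`, its `version` field, the two maps, and `validateAll` are hypothetical stand-ins for the registry, the metastore, and the staleness check, none of which are shown in the quoted hunk.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RegistryValidationSketch {

  /** Hypothetical stand-in for what the metastore knows about a materialized view. */
  static final class ViewState {
    final boolean rewriteEnabled;
    final long version; // assumed staleness marker, not a Hive concept

    ViewState(boolean rewriteEnabled, long version) {
      this.rewriteEnabled = rewriteEnabled;
      this.version = version;
    }
  }

  /**
   * Checks every view in usedViews against the metastore map.
   * Dropped or disabled views are removed from the registry, stale ones are
   * reloaded. Returns true only if no entry had to be dropped or reloaded.
   */
  static boolean validateAll(List<String> usedViews,
      Map<String, ViewState> registry, Map<String, ViewState> metastore) {
    boolean result = true;
    for (String name : usedViews) {
      ViewState current = metastore.get(name);
      if (current == null || !current.rewriteEnabled) {
        // Deleted or rewriting disabled: drop it, but keep iterating so the
        // remaining entries still get refreshed.
        registry.remove(name);
        result = false;
        continue;
      }
      ViewState cached = registry.get(name);
      if (cached == null || cached.version != current.version) {
        // Stale cache entry: reload from the metastore copy.
        registry.put(name, current);
        result = false;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, ViewState> metastore = new HashMap<>();
    metastore.put("mv_a", new ViewState(true, 2L));

    Map<String, ViewState> registry = new HashMap<>();
    registry.put("mv_a", new ViewState(true, 1L)); // stale in the cache
    registry.put("mv_b", new ViewState(true, 1L)); // no longer in the metastore

    boolean ok = validateAll(Arrays.asList("mv_b", "mv_a"), registry, metastore);
    System.out.println("rewriting still valid: " + ok);
    System.out.println("registry now holds: " + registry.keySet());
  }
}
```

Returning early on the first invalid view would leave later stale entries in the cache, so the next compilation could hit the same cancellation again. Finishing the loop repairs the whole set in one pass.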