Date: Tue, 25 Jul 2017 00:11:00 +0000 (UTC)
From: "Henry Robinson (JIRA)"
To: issues@impala.incubator.apache.org
Reply-To: dev@impala.incubator.apache.org
Subject: [jira] [Resolved] (IMPALA-4905) Fragments always report insert status, even if not insert query

     [ https://issues.apache.org/jira/browse/IMPALA-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henry Robinson resolved IMPALA-4905.
------------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.10.0

https://github.com/apache/incubator-impala/commit/d25db64f0e17092af9ef60eb37ec9214900c2d1c

> Fragments always report insert status, even if not insert query
> ---------------------------------------------------------------
>
>                 Key: IMPALA-4905
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4905
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>    Affects Versions: Impala 2.7.0
>            Reporter: Henry Robinson
>            Assignee: Henry Robinson
>             Fix For: Impala 2.10.0
>
>
> {code}
> if (done) {
>   TInsertExecStatus insert_status;
>   if (runtime_state->hdfs_files_to_move()->size() > 0) {
>     insert_status.__set_files_to_move(*runtime_state->hdfs_files_to_move());
>   }
>   if (runtime_state->per_partition_status()->size() > 0) {
>     insert_status.__set_per_partition_status(*runtime_state->per_partition_status());
>   }
>   params.__set_insert_exec_status(insert_status);
> }
> {code}
> This means that any fragment will always set {{insert_exec_status}} in its response, even if it's not an INSERT query.
> However, in the RPC handler, {{Coordinator::UpdateFragmentExecStatus()}}, we have:
> {code}
> if (params.done && params.__isset.insert_exec_status) {
>   lock_guard l(lock_);
>   // Merge in table update data (partitions written to, files to be moved as part of
>   // finalization)
>   for (const PartitionStatusMap::value_type& partition:
>       params.insert_exec_status.per_partition_status) {
>     // etc
> {code}
> which means that the RPC will always try to take the query exec state lock, for every 'done' report. With lots of fragment instances, this can severely serialise the reports when the query finishes.
> The simplest workaround is not to set {{insert_exec_status}} for {{SELECT}} queries. A better solution, which will help INSERTs as well, is not to do the merge here at all, but to do it in {{Coordinator::FinalizeSuccessfulInsert()}} instead, saving the {{TInsertExecStatus}} in the fragment instance state until that point.
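
The two remedies described in the issue can be sketched together in a few lines of standalone C++. This is only an illustration under stated assumptions, not the code from the linked commit: `InsertStatus`, `Report`, `BuildReport` and this toy `Coordinator` are hypothetical stand-ins for the Thrift-generated `TInsertExecStatus`, the report parameters, and Impala's coordinator. It also assumes each fragment instance sends exactly one final report and that finalization runs only after all reports have arrived.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for TInsertExecStatus.
struct InsertStatus {
  std::map<std::string, int64_t> rows_per_partition;
  std::map<std::string, std::string> files_to_move;
  bool empty() const {
    return rows_per_partition.empty() && files_to_move.empty();
  }
};

// Hypothetical stand-in for the status report sent by a fragment instance.
// has_insert_status plays the role of __isset.insert_exec_status.
struct Report {
  bool done = false;
  bool has_insert_status = false;
  InsertStatus insert_status;
};

// Simple workaround: attach insert status only when there is something to
// report, so SELECT fragments never mark the field as set.
Report BuildReport(bool done, const InsertStatus& status) {
  Report report;
  report.done = done;
  if (done && !status.empty()) {
    report.insert_status = status;
    report.has_insert_status = true;
  }
  return report;
}

// Toy coordinator illustrating the deferred merge.
class Coordinator {
 public:
  explicit Coordinator(std::size_t num_instances)
      : per_instance_(num_instances) {}

  // Report handler: stash the status in the slot owned by this instance.
  // The query-wide lock is not taken here (assumes one final report per
  // instance, so no two callers write the same slot).
  void UpdateStatus(std::size_t instance, Report report) {
    if (report.done && report.has_insert_status) {
      per_instance_[instance] = std::move(report.insert_status);
    }
  }

  // Finalization: merge every saved status once, under a single lock
  // acquisition, after all instances have reported.
  InsertStatus Finalize() {
    std::lock_guard<std::mutex> l(lock_);
    ++lock_acquisitions_;
    InsertStatus merged;
    for (const InsertStatus& s : per_instance_) {
      for (const auto& kv : s.rows_per_partition) {
        merged.rows_per_partition[kv.first] += kv.second;
      }
      merged.files_to_move.insert(s.files_to_move.begin(),
                                  s.files_to_move.end());
    }
    return merged;
  }

  // Exposed only so the sketch can show how often the lock was taken.
  int lock_acquisitions() const { return lock_acquisitions_; }

 private:
  std::mutex lock_;
  int lock_acquisitions_ = 0;
  std::vector<InsertStatus> per_instance_;
};
```

In this sketch a SELECT-style report (empty status) never sets the flag, and the lock is taken once per query at finalization rather than once per 'done' report. Whether per-instance slots are the right home for the saved status in the real coordinator is a design question this example does not settle.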