Date: Wed, 29 Jun 2016 22:48:48 +0000 (UTC)
From: "ASF subversion and git services (JIRA)"
To: commits@airflow.incubator.apache.org
Reply-To: dev@airflow.incubator.apache.org
Subject: [jira] [Commented] (AIRFLOW-243) Use a more efficient Thrift call for HivePartitionSensor

    [ https://issues.apache.org/jira/browse/AIRFLOW-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15355928#comment-15355928 ]

ASF subversion and git services commented on AIRFLOW-243:
---------------------------------------------------------

Commit bf28de4e601c165020669fd593964187b6246131 in incubator-airflow's branch refs/heads/master from [~xuanji]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=bf28de4 ]

[AIRFLOW-243] Create NamedHivePartitionSensor

Closes #1593 from zodiac/create-NamedHivePartitionSensor


> Use a more efficient Thrift call for HivePartitionSensor
> --------------------------------------------------------
>
>                 Key: AIRFLOW-243
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-243
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: operators
>    Affects Versions: Airflow 2.0
>            Reporter: Paul Yang
>            Assignee: Li Xuanji
>            Priority: Minor
>             Fix For: Airflow 2.0
>
>
> The {{HivePartitionSensor}} uses the `get_partitions_by_filter` Thrift call, which can result in expensive SQL queries for tables that have many partitions and are partitioned by multiple keys. We've seen our metastore DB get hammered by these sensors, resulting in service degradation for other metastore users.
> The {{MetastorePartitionSensor}} is efficient, but it can result in too many connections to the metastore DB.
> An alternative is to use the `get_partition_by_name` Thrift call, which translates into more efficient SQL queries. Because connections will be pooled on the Thrift server, the DB won't get overloaded as with the {{MetastorePartitionSensor}}. The semantics of the arguments will change, so either a new argument needs to be introduced, or a new operator needs to be created.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
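To illustrate why the issue says the argument semantics change, here is a minimal sketch of a check built on an exact-name lookup instead of a filter expression. It is not the code from the commit. `FakeMetastoreClient`, `NoSuchPartition`, `parse_partition_name` and `partition_exists` are hypothetical names, and the `db.table/key=value/...` input format is an assumption made for this example. Only the call name `get_partition_by_name` and its three arguments (database, table, partition name) are meant to mirror the Thrift call named in the issue.

```python
class NoSuchPartition(Exception):
    """Hypothetical stand-in for the metastore's 'no such object' error."""


class FakeMetastoreClient:
    """Hypothetical in-memory stand-in for a Hive metastore Thrift client."""

    def __init__(self, partitions):
        # Set of (db, table, partition_name) tuples that "exist".
        self._partitions = set(partitions)

    def get_partition_by_name(self, db_name, tbl_name, part_name):
        # Exact-name lookup: there is no filter expression to evaluate.
        key = (db_name, tbl_name, part_name)
        if key not in self._partitions:
            raise NoSuchPartition(part_name)
        return {"db": db_name, "table": tbl_name, "name": part_name}


def parse_partition_name(qualified_name):
    """Split 'db.table/k1=v1/k2=v2' into (db, table, 'k1=v1/k2=v2').

    The input format is an assumption made for this sketch.
    """
    table_part, slash, part_name = qualified_name.partition("/")
    db_name, dot, tbl_name = table_part.partition(".")
    if not (slash and part_name and dot and db_name and tbl_name):
        raise ValueError(
            "expected 'db.table/key=value', got %r" % (qualified_name,)
        )
    return db_name, tbl_name, part_name


def partition_exists(client, qualified_name):
    """Return True if the exactly named partition exists, else False."""
    db_name, tbl_name, part_name = parse_partition_name(qualified_name)
    try:
        client.get_partition_by_name(db_name, tbl_name, part_name)
    except NoSuchPartition:
        return False
    return True
```

A sensor built this way would be handed one fully specified partition name per check, which is the change in semantics the description refers to: a filter such as `ds='2016-06-29'` can match many partitions, while a name identifies exactly one.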