From: "jifei_yang (JIRA)"
To: dev@phoenix.apache.org
Date: Fri, 2 Feb 2018 03:53:00 +0000 (UTC)
Subject: [jira] [Commented] (PHOENIX-4490) Phoenix Spark Module doesn't pass in user properties to create connection

    [ https://issues.apache.org/jira/browse/PHOENIX-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349747#comment-16349747 ]

jifei_yang commented on PHOENIX-4490:
-------------------------------------

So, it will work fine.

> Phoenix Spark Module doesn't pass in user properties to create connection
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-4490
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4490
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Karan Mehta
>            Priority: Major
>
> Phoenix Spark module doesn't work perfectly in a Kerberos environment. This is because whenever a new {{PhoenixRDD}} is built, it is always built with new, default properties. The following piece of code in {{PhoenixRelation}} is an example. This is the class used by Spark to create a {{BaseRelation}} before executing a scan.
> {code}
>     new PhoenixRDD(
>       sqlContext.sparkContext,
>       tableName,
>       requiredColumns,
>       Some(buildFilter(filters)),
>       Some(zkUrl),
>       new Configuration(), // always a fresh default Configuration; user properties are not passed in
>       dateAsTimestamp
>     ).toDataFrame(sqlContext).rdd
> {code}
> This works in most cases when the Spark code runs on the same cluster as HBase, because the config object picks up properties from the XML files on the classpath. However, in an external environment we should use the user-provided properties and merge them before creating any {{PhoenixRelation}} or {{PhoenixRDD}}. As per my understanding, we should ideally provide the properties in the {{DefaultSource#createRelation()}} method.
> An example of when this fails: Spark tries to get the splits, to optimize the MR performance for loading data in the table, in the {{PhoenixInputFormat#generateSplits()}} method. Ideally, it should get all the config parameters from the {{JobContext}} being passed, but it defaults to {{new Configuration()}}, irrespective of what the user passes in. Thus it fails to create a connection.
> [~jmahonin] [~maghamravikiran@gmail.com]
> Any ideas or advice? Let me know if I am missing anything obvious here.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
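
The merge step the issue description asks for (user-provided properties layered over the defaults before any {{PhoenixRelation}} or {{PhoenixRDD}} is created) can be sketched roughly as below. This is only an illustration under stated assumptions, not the Phoenix implementation or its eventual fix: plain Scala maps stand in for Hadoop's Configuration so the snippet runs without Hadoop or Phoenix on the classpath, and the names `ConfigMergeSketch`, `defaults`, and `mergeUserProperties`, along with the example values, are hypothetical.

```scala
// Hypothetical sketch: a Map stands in for Hadoop's Configuration so the
// example is self-contained. None of these names come from Phoenix.
object ConfigMergeSketch {
  type Conf = Map[String, String]

  // Stand-in for what a fresh default configuration might contain when no
  // cluster-specific XML files are on the classpath (values are assumptions).
  val defaults: Conf = Map(
    "hbase.zookeeper.quorum" -> "localhost",
    "hbase.security.authentication" -> "simple",
    "zookeeper.znode.parent" -> "/hbase"
  )

  // User-supplied entries override defaults with the same key;
  // defaults the user did not mention are kept unchanged.
  def mergeUserProperties(base: Conf, user: Conf): Conf = base ++ user

  def main(args: Array[String]): Unit = {
    // Example user properties for an external, Kerberos-secured cluster.
    val user: Conf = Map(
      "hbase.zookeeper.quorum" -> "zk1.example.com",
      "hbase.security.authentication" -> "kerberos"
    )
    val merged = mergeUserProperties(defaults, user)
    // Print the effective settings in key order.
    merged.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"$k=$v") }
  }
}
```

In the real module the merged result would presumably be handed to {{PhoenixRDD}} in place of {{new Configuration()}}; where exactly that hand-off should happen is the open question raised in the issue.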