Date: Thu, 6 Nov 2014 04:33:33 +0000 (UTC)
From: "Xuefu Zhang (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Updated] (HIVE-8509) UT: fix list_bucket_dml_2 test

    [ https://issues.apache.org/jira/browse/HIVE-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-8509:
------------------------------
    Attachment: HIVE-8509-spark.patch

Reattaching the same patch to rerun the tests, as many of the test failures seemed unrelated.

> UT: fix list_bucket_dml_2 test
> ------------------------------
>
>                 Key: HIVE-8509
>                 URL: https://issues.apache.org/jira/browse/HIVE-8509
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Thomas Friedrich
>            Assignee: Chinna Rao Lalam
>            Priority: Minor
>         Attachments: HIVE-8509-spark.patch, HIVE-8509-spark.patch
>
>
> The test list_bucket_dml_2 fails in FileSinkOperator.publishStats:
> org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30002]: StatsPublisher cannot be connected to. There was a error while connecting to the StatsPublisher, and retrying might help.
> If you dont want the query to fail because accurate statistics could not be collected, set hive.stats.reliable=false
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.publishStats(FileSinkOperator.java:1079)
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:971)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
> at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:175)
> at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
> at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:121)
>
> I debugged and found that FileSinkOperator.publishStats throws the exception when calling statsPublisher.connect here:
>
> if (!statsPublisher.connect(hconf)) {
>   // just return, stats gathering should not block the main query
>   LOG.error("StatsPublishing error: cannot connect to database");
>   if (isStatsReliable) {
>     throw new HiveException(ErrorMsg.STATSPUBLISHER_CONNECTION_ERROR.getErrorCodedMsg());
>   }
>   return;
> }
>
> With hive.stats.dbclass set to counter in data/conf/spark/hive-site.xml, the statsPublisher is of type CounterStatsPublisher.
> In CounterStatsPublisher, connect returns false (and the exception above is then thrown) because getReporter() returns null for the MapredContext:
>
> MapredContext context = MapredContext.get();
> if (context == null || context.getReporter() == null) {
>   return false;
> }
>
> When changing hive.stats.dbclass to jdbc:derby in data/conf/spark/hive-site.xml, similar to TestCliDriver, the test works:
>
> <property>
>   <name>hive.stats.dbclass</name>
>   <value>jdbc:derby</value>
>   <description>The default storage that stores temporary hive statistics. Currently, jdbc, hbase and counter type is supported</description>
> </property>
>
> In addition, I had to generate the out file for the test case for Spark.
> When running this test with TestCliDriver and hive.stats.dbclass set to counter, the test case still works; there the reporter is set to org.apache.hadoop.mapred.Task$TaskReporter.
> It might need some additional investigation why CounterStatsPublisher has no reporter in the case of Spark.
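
For illustration only, here is a rough, self-contained sketch of the pattern involved, using hypothetical class names (ReporterContextDemo, TaskContext, and the trimmed-down Reporter below are stand-ins, not Hive classes): a per-thread task context whose reporter has to be attached by the task runner before a CounterStatsPublisher-style connect() check can pass. The main() method walks through the two situations described above: the Spark path, where nothing has attached a reporter, and the MR path, where the task runner hands its reporter to the context.

import java.util.concurrent.atomic.AtomicReference;

public class ReporterContextDemo {

    // Hypothetical, trimmed-down stand-in for org.apache.hadoop.mapred.Reporter.
    interface Reporter {
        void incrCounter(String group, String counter, long amount);
    }

    // Stand-in for MapredContext: a thread-local holder that the task runner
    // is expected to initialize and populate with a reporter.
    static final class TaskContext {
        private static final ThreadLocal<TaskContext> CURRENT = new ThreadLocal<>();
        private final AtomicReference<Reporter> reporter = new AtomicReference<>();

        static TaskContext init() {
            TaskContext ctx = new TaskContext();
            CURRENT.set(ctx);
            return ctx;
        }

        static TaskContext get() {
            return CURRENT.get();
        }

        void setReporter(Reporter r) {
            reporter.set(r);
        }

        Reporter getReporter() {
            return reporter.get();
        }
    }

    // Mirrors the connect() check quoted above: no context or no reporter
    // means "cannot connect", which FileSinkOperator escalates to a
    // HiveException when hive.stats.reliable=true.
    static boolean connect() {
        TaskContext ctx = TaskContext.get();
        return ctx != null && ctx.getReporter() != null;
    }

    public static void main(String[] args) {
        // Spark-like situation from the report: a context exists,
        // but nothing ever attached a reporter to it.
        TaskContext.init();
        System.out.println("connect() without reporter: " + connect()); // false

        // MR-like situation: the task runner hands its reporter to the
        // context (TestCliDriver sees Task$TaskReporter here), so a
        // counter-based stats publisher can connect.
        TaskContext.get().setReporter((group, counter, amount) -> { });
        System.out.println("connect() with reporter: " + connect()); // true
    }
}

Running the sketch prints false for the first check and true for the second, which mirrors the difference described above between the Spark run and TestCliDriver.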