hive-dev mailing list archives

From "Hive QA (JIRA)" <>
Subject [jira] [Commented] (HIVE-8509) UT: fix list_bucket_dml_2 test
Date Tue, 04 Nov 2014 19:25:34 GMT


Hive QA commented on HIVE-8509:

{color:red}Overall{color}: -1, at least one test failed

Here are the results of testing the latest attachment:

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 7099 tests executed
*Failed tests:*

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed

This message is automatically generated.

ATTACHMENT ID: 12679259 - PreCommit-HIVE-SPARK-Build

> UT: fix list_bucket_dml_2 test
> ------------------------------
>                 Key: HIVE-8509
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Thomas Friedrich
>            Assignee: Chinna Rao Lalam
>            Priority: Minor
>         Attachments: HIVE-8509-spark.patch
> The test list_bucket_dml_2 fails in FileSinkOperator.publishStats:
> org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30002]: StatsPublisher cannot be connected to.There was a error while connecting to the StatsPublisher, and retrying might help. If you dont want the query to fail because accurate statistics could not be collected, set hive.stats.reliable=false
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.publishStats(
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(
> at org.apache.hadoop.hive.ql.exec.Operator.close(
> at org.apache.hadoop.hive.ql.exec.Operator.close(
> at org.apache.hadoop.hive.ql.exec.Operator.close(
> at org.apache.hadoop.hive.ql.exec.Operator.close(
> at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(
> at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(
> at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(
> I debugged and found that FileSinkOperator.publishStats throws the exception when calling statsPublisher.connect here:
> if (!statsPublisher.connect(hconf)) {
>   // just return, stats gathering should not block the main query
>   LOG.error("StatsPublishing error: cannot connect to database");
>   if (isStatsReliable) {
>     throw new HiveException(ErrorMsg.STATSPUBLISHER_CONNECTION_ERROR.getErrorCodedMsg());
>   }
>   return;
> }
> With hive.stats.dbclass set to counter in data/conf/spark/hive-site.xml, the statsPublisher is of type CounterStatsPublisher.
> In CounterStatsPublisher, the connection attempt fails (which leads to the exception above) because getReporter() returns null for the MapredContext:
> MapredContext context = MapredContext.get();
> if (context == null || context.getReporter() == null) {
>   return false;
> }
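
To make the interaction between the two quoted snippets easier to follow, here is a minimal, self-contained sketch. It is not Hive code: StatsConnectSketch, Context, connect and publishStats are hypothetical stand-ins that only mirror the control flow quoted above, and IllegalStateException stands in for HiveException. It shows that a missing reporter makes the connect step return false, and that this becomes an exception only when statistics are required to be reliable.

```java
// Hypothetical stand-ins for illustration only; these are not Hive classes.
class StatsConnectSketch {

    /** Stand-in for a task context that may or may not carry a reporter. */
    static final class Context {
        final Object reporter; // null models the Spark case described in the issue

        Context(Object reporter) {
            this.reporter = reporter;
        }
    }

    /** Mirrors the quoted check: no context or no reporter means "cannot connect". */
    static boolean connect(Context context) {
        return context != null && context.reporter != null;
    }

    /**
     * Mirrors the quoted publishStats flow: a failed connect is fatal only
     * when stats must be reliable. Returns true if stats would be published.
     */
    static boolean publishStats(Context context, boolean isStatsReliable) {
        if (!connect(context)) {
            if (isStatsReliable) {
                // Stand-in for the HiveException thrown in the real code path.
                throw new IllegalStateException("StatsPublisher cannot be connected to");
            }
            return false; // stats gathering should not block the main query
        }
        return true;
    }

    public static void main(String[] args) {
        Context withReporter = new Context(new Object());
        Context withoutReporter = new Context(null);

        // Reporter present: publishing proceeds.
        System.out.println(publishStats(withReporter, true));
        // Reporter missing, reliability not required: failure is tolerated.
        System.out.println(publishStats(withoutReporter, false));
        // Reporter missing, reliability required: the call throws.
        try {
            publishStats(withoutReporter, true);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model, the third call corresponds to the failure reported for list_bucket_dml_2, and the second call corresponds to what the error message suggests with hive.stats.reliable=false.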
> After changing hive.stats.dbclass to jdbc:derby in data/conf/spark/hive-site.xml, similar to the TestCliDriver setup, the test passes:
> <property>
>   <name>hive.stats.dbclass</name>
>   <!-- <value>counter</value> -->
>   <value>jdbc:derby</value>
>   <description>The default storage that stores temporary hive statistics. Currently, jdbc, hbase and counter type is supported</description>
> </property>
> In addition, I had to generate the out file for the test case for spark.
> When running this test with TestCliDriver and hive.stats.dbclass set to counter, the test case still works. The reporter is set to org.apache.hadoop.mapred.Task$TaskReporter.

> Further investigation may be needed into why the CounterStatsPublisher has no reporter when running on Spark.

