Date: Tue, 17 Feb 2015 01:28:12 +0000 (UTC)
From: "Jimmy Xiang (JIRA)"
To: hive-dev@hadoop.apache.org
Subject: [jira] [Commented] (HIVE-9659) 'Error while trying to create table container' occurs during hive query case execution when hive.optimize.skewjoin set to 'true' [Spark Branch]

    [ https://issues.apache.org/jira/browse/HIVE-9659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323537#comment-14323537 ]

Jimmy Xiang commented on HIVE-9659:
-----------------------------------

I just ran the same query with a small data set and skew join enabled.

> 'Error while trying to create table container' occurs during hive query case execution when hive.optimize.skewjoin set to 'true' [Spark Branch]
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-9659
>                 URL: https://issues.apache.org/jira/browse/HIVE-9659
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xin Hao
>
> We found that 'Error while trying to create table container' occurs during Big-Bench Q12 case execution when hive.optimize.skewjoin is set to 'true'.
> If hive.optimize.skewjoin is set to 'false', the case passes.
>
> How to reproduce:
> 1. set hive.optimize.skewjoin=true;
> 2. Run BigBench case Q12 and it will fail.
> Check the executor log (e.g. /usr/lib/spark/work/app-XXXX/2/stderr) and you will find the error 'Error while trying to create table container' in the log, and also a NullPointerException near the end of the log.
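For reference, here is a minimal sketch of the configuration that should steer a query onto the runtime skew-join code path. The Big-Bench Q12 query itself is not reproduced here; the table and column names below are illustrative placeholders (chosen to match the row shape visible in the logs), and lowering hive.skewjoin.key is an assumption made so that even a small data set trips the skew handling:

{noformat}
-- Run on the Spark engine with runtime skew-join handling on.
set hive.execution.engine=spark;
set hive.optimize.skewjoin=true;
-- Rows sharing a join key beyond this count are treated as skewed.
-- Default is 100000; lowering it (assumption, for a small-data repro)
-- lets the skew path trigger without Big-Bench-scale data.
set hive.skewjoin.key=2;

-- Placeholder join in the spirit of Q12: wcs_item_sk in web_clickstreams
-- is typically heavily skewed toward popular items.
SELECT c.wcs_item_sk, COUNT(*) AS clicks
FROM web_clickstreams c
JOIN item i ON (c.wcs_item_sk = i.i_item_sk)
GROUP BY c.wcs_item_sk;
{noformat}

With hive.optimize.skewjoin=false the same query should complete, which matches the reporter's observation.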
> (a) Detailed error message for 'Error while trying to create table container':
> {noformat}
> 15/02/12 01:29:49 ERROR SparkMapRecordHandler: Error processing row: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to create table container
> org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to create table container
>         at org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:118)
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:193)
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:219)
>         at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1051)
>         at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
>         at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
>         at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1055)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:486)
>         at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
>         at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>         at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:217)
>         at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error while trying to create table container
>         at org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:158)
>         at org.apache.hadoop.hive.ql.exec.spark.HashTableLoader.load(HashTableLoader.java:115)
>         ... 21 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error, not a directory: hdfs://bhx1:8020/tmp/hive/root/d22ef465-bff5-4edb-a822-0a9f1c25b66c/hive_2015-02-12_01-28-10_008_6897031694580088767-1/-mr-10009/HashTable-Stage-6/MapJoin-mapfile01--.hashtable
>         at org.apache.hadoop.hive.ql.exec.persistence.MapJoinTableContainerSerDe.load(MapJoinTableContainerSerDe.java:106)
>         ... 22 more
> 15/02/12 01:29:49 INFO SparkRecordHandler: maximum memory = 40939028480
> 15/02/12 01:29:49 INFO PerfLogger:
> {noformat}
>
> (b) Detailed error message for the NullPointerException:
> {noformat}
> 15/02/12 01:29:50 ERROR MapJoinOperator: Unexpected exception: null
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.setMapJoinKey(MapJoinOperator.java:227)
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:271)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>         at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:120)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>         at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:493)
>         at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
>         at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>         at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:217)
>         at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> 15/02/12 01:29:50 INFO Executor: Executor is trying to kill task 144.2 in stage 3.0 (TID 1500)
> 15/02/12 01:29:50 INFO MapOperator: Initializing Self MAP[1800]
> 15/02/12 01:29:50 ERROR SparkMapRecordHandler: Error processing row: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"wcs_click_date_sk":37793,"wcs_click_time_sk":null,"wcs_sales_sk":null,"wcs_item_sk":51402,"wcs_web_page_sk":null,"wcs_user_sk":2541920}
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"wcs_click_date_sk":37793,"wcs_click_time_sk":null,"wcs_sales_sk":null,"wcs_item_sk":51402,"wcs_web_page_sk":null,"wcs_user_sk":2541920}
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:503)
>         at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
>         at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>         at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:217)
>         at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: null
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:314)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>         at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:120)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>         at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:493)
>         ... 14 more
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.setMapJoinKey(MapJoinOperator.java:227)
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:271)
>         ... 20 more
> 15/02/12 01:29:50 INFO MapOperator: MAP[1797]: records read - 1
> 15/02/12 01:29:50 INFO Executor: Executor is trying to kill task 96.3 in stage 3.0 (TID 1515)
> 15/02/12 01:29:50 INFO PerfLogger:
> 15/02/12 01:29:50 INFO MapOperator: Initialization Done 1800 MAP
> 15/02/12 01:29:50 INFO SparkRecordHandler: processing 1 rows: used memory = 12023782616
> 15/02/12 01:29:50 ERROR Executor: Exception in task 16.2 in stage 3.0 (TID 1488)
> java.lang.RuntimeException: Error processing row: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"wcs_click_date_sk":37793,"wcs_click_time_sk":null,"wcs_sales_sk":null,"wcs_item_sk":51402,"wcs_web_page_sk":null,"wcs_user_sk":2541920}
>         at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:153)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:47)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.processNextRecord(HiveMapFunctionResultList.java:27)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:98)
>         at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
>         at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:217)
>         at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"wcs_click_date_sk":37793,"wcs_click_time_sk":null,"wcs_sales_sk":null,"wcs_item_sk":51402,"wcs_web_page_sk":null,"wcs_user_sk":2541920}
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:503)
>         at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
>         ... 13 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected exception: null
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:314)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>         at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:120)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>         at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:493)
>         ... 14 more
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.setMapJoinKey(MapJoinOperator.java:227)
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:271)
>         ... 20 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)