Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Reply-To: dev@hive.apache.org
Date: Tue, 1 Nov 2016 17:20:58 +0000 (UTC)
From: "Premal Shah (JIRA)"
To: dev@hive.apache.org
Subject: [jira] [Created] (HIVE-15105) Hive shell runs out of memory on Tez
archived-at: Tue, 01 Nov 2016 17:21:00 -0000

Premal Shah created HIVE-15105:
----------------------------------

             Summary: Hive shell runs out of memory on Tez
                 Key: HIVE-15105
                 URL: https://issues.apache.org/jira/browse/HIVE-15105
             Project: Hive
          Issue Type: Bug
          Components: Tez
    Affects Versions: 2.0.1
            Reporter: Premal Shah

Hive 2.0.1
Hadoop 2.7.2
Tez 0.8.4

We have a UDF in Hive that takes in some values and outputs a score. When we run a query that calls the score function on every row of a table, it looks like Tez is not running the query on YARN but is trying to run it in local mode. It then runs out of memory trying to insert that data into a table.

Here's the query:

    ADD JAR score.jar;
    CREATE TEMPORARY FUNCTION score AS 'hive.udf.ScoreUDF';

    CREATE TABLE abc AS
    SELECT
        id,
        score(col1, col2) AS score,
        '2016-10-11' AS dt
    FROM input_table;

Here's the output of the shell:

    Query ID = hadoop_20161028232841_5a06db96-ffaa-4e75-a657-c7cb46ccb3f5
    Total jobs = 1
    Launching Job 1 out of 1
    java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3332)
        at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:622)
        at java.lang.StringBuilder.append(StringBuilder.java:202)
        at com.google.protobuf.TextFormat.escapeBytes(TextFormat.java:1283)
        at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:394)
        at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
        at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286)
        at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
        at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
        at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
        at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286)
        at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
        at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
        at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
        at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286)
        at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
        at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
        at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
        at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:283)
        at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
        at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
        at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
        at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:283)
        at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
        at com.google.protobuf.TextFormat$Printer.printFieldValue(TextFormat.java:404)
        at com.google.protobuf.TextFormat$Printer.printSingleField(TextFormat.java:327)
        at com.google.protobuf.TextFormat$Printer.printField(TextFormat.java:286)
        at com.google.protobuf.TextFormat$Printer.print(TextFormat.java:273)
        at com.google.protobuf.TextFormat$Printer.access$400(TextFormat.java:248)
        at com.google.protobuf.TextFormat.shortDebugString(TextFormat.java:88)
    FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Java heap space

It looks like the job is not being submitted to the cluster but is running locally. We can't get Tez to run the query on the cluster. The Hive shell starts with an Xmx of 4G.

If I set hive.execution.engine = mr, then the query works, because it runs on the Hadoop cluster.
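
The workaround mentioned in the report can be sketched as the session below. This is only an illustration assembled from the statements quoted in the report: `SET hive.execution.engine` is a standard Hive session setting, and switching back to `tez` at the end is an assumption about how one might restore the original engine, not something the reporter describes.

```sql
-- Workaround sketch: fall back to MapReduce for this one statement.
SET hive.execution.engine=mr;

ADD JAR score.jar;
CREATE TEMPORARY FUNCTION score AS 'hive.udf.ScoreUDF';

CREATE TABLE abc AS
SELECT id, score(col1, col2) AS score, '2016-10-11' AS dt
FROM input_table;

-- Assumption: restore Tez for later statements in the same session.
SET hive.execution.engine=tez;
```

This only sidesteps the failure; it does not explain why the Tez path exhausts the client heap.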