Date: Tue, 10 Nov 2015 13:58:10 +0000 (UTC)
From: "Xuefu Zhang (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-12370) Hive Query got failure with larger scale data set with enablng sampling order optimization

    [ https://issues.apache.org/jira/browse/HIVE-12370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14998612#comment-14998612 ]

Xuefu Zhang commented on HIVE-12370:
------------------------------------

Could you try your case with other data formats, such as text, sequence file, or parquet?

> Hive Query got failure with larger scale data set with enablng sampling order optimization
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12370
>                 URL: https://issues.apache.org/jira/browse/HIVE-12370
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.1.0
>            Reporter: Yi Zhou
>
> Found that Hive queries fail on Hive on MR with larger scale data (e.g., 3TB/10TB) when the sampling order-by optimization is enabled (they pass with a 1GB data set).
> hive.optimize.sampling.orderby=true
> hive.optimize.sampling.orderby.number=20000
> hive.optimize.sampling.orderby.percent=0.1
> Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>     at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
>     ... 8 more
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch
>     at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:121)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>     at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>     at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
>     ... 9 more
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)