Date: Thu, 4 Jun 2015 07:06:38 +0000 (UTC)
From: "Lefty Leverenz (JIRA)"
To: issues@hive.apache.org
Subject: [jira] [Commented] (HIVE-10885) with vectorization enabled join operation involving interval_day_time fails

[ https://issues.apache.org/jira/browse/HIVE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572292#comment-14572292 ]

Lefty Leverenz commented on HIVE-10885:
---------------------------------------

Note: The commits have the wrong Jira number -- they say HIVE-10855 instead of HIVE-10885.
* Commit to master: 09100831adff7589ee48e735a4beac6ebb25cb3e
* Commit to branch-1.2: f3ab5fda6af57afff31c29ad048d906fd095d5fb

> with vectorization enabled join operation involving interval_day_time fails
> ---------------------------------------------------------------------------
>
> Key: HIVE-10885
> URL: https://issues.apache.org/jira/browse/HIVE-10885
> Project: Hive
> Issue Type: Bug
> Affects Versions: 1.2.0
> Reporter: Jagruti Varia
> Assignee: Matt McCline
> Fix For: 1.2.1
>
> Attachments: HIVE-10885.01.patch, HIVE-10885.02.patch, HIVE-10885.03.patch
>
>
> When vectorization is on, join operation involving interval_day_time type throws following error:
> {noformat}
> Status: Failed
> Vertex failed, vertexName=Map 2, vertexId=vertex_1432858236614_0247_1_01, diagnostics=[Task failed, taskId=task_1432858236614_0247_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>     ... 14 more
> Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
>     at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
>     ... 15 more
> ], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>     ... 14 more
> Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
>     at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
>     ... 15 more
> ], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>     ... 14 more
> Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
>     at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
>     ... 15 more
> ], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>     at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>     at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Map operator initialization failed
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
>     at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
>     ... 14 more
> Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
>     at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
>     at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
>     at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
>     ... 15 more
> {noformat}
> query ran:
> {noformat}
> select
> v1.s,
> v2.s,
> v1.intrvl1
> from
> ( select
> s,
> (cast(dt as date) - cast(ts as date)) as intrvl1
> from
> vectortab10korc ) v1
> join
> (
> select
> s ,
> (cast(dt as date) - cast(ts as date)) as intrvl2
> from
> vectorparttab10korc
> ) v2
> on v1.intrvl1 = v2.intrvl2
> and v1.s = v2.s;
> {noformat}
> explain plan:
> {noformat}
> OK
> STAGE DEPENDENCIES:
> Stage-1 is a root stage
> Stage-0 depends on stages: Stage-1
> STAGE PLANS:
> Stage: Stage-1
> Tez
> Edges:
> Map 2 <- Map 1 (BROADCAST_EDGE)
> DagName: hrt_qa_20150601024305_7745bc8f-169f-45c6-8856-7391eef0d819:3
> Vertices:
> Map 1
> Map Operator Tree:
> TableScan
> alias: vectortab10korc
> filterExpr: s is not null (type: boolean)
> Statistics: Num rows: 10000 Data size: 4597592 Basic stats: COMPLETE Column stats: PARTIAL
> Filter Operator
> predicate: s is not null (type: boolean)
> Statistics: Num rows: 10000 Data size: 1340000 Basic stats: COMPLETE Column stats: PARTIAL
> Select Operator
> expressions: s (type: string), (dt - CAST( ts AS DATE)) (type: interval_day_time)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
> Filter Operator
> predicate: _col1 is not null (type: boolean)
> Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
> Reduce Output Operator
> key expressions: _col1 (type: interval_day_time), _col0 (type: string)
> sort order: ++
> Map-reduce partition columns: _col1 (type: interval_day_time), _col0 (type: string)
> Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
> Select Operator
> expressions: _col0 (type: string)
> outputColumnNames: _col0
> Statistics: Num rows: 10000 Data size: 940000 Basic stats: COMPLETE Column stats: PARTIAL
> Group By Operator
> keys: _col0 (type: string)
> mode: hash
> outputColumnNames: _col0
> Statistics: Num rows: 5000 Data size: 470000 Basic stats: COMPLETE Column stats: PARTIAL
> Dynamic Partitioning Event Operator
> Target Input: vectorparttab10korc
> Partition key expr: s
> Statistics: Num rows: 5000 Data size: 470000 Basic stats: COMPLETE Column stats: PARTIAL
> Target column: s
> Target Vertex: Map 2
> Execution mode: vectorized
> Map 2
> Map Operator Tree:
> TableScan
> alias: vectorparttab10korc
> filterExpr: s is not null (type: boolean)
> Statistics: Num rows: 10000 Data size: 3656191 Basic stats: COMPLETE Column stats: PARTIAL
> Select Operator
> expressions: s (type: string), (dt - CAST( ts AS DATE)) (type: interval_day_time)
> outputColumnNames: _col0, _col1
> Statistics: Num rows: 10000 Data size: 1840000 Basic stats: COMPLETE Column stats: PARTIAL
> Filter Operator
> predicate: _col1 is not null (type: boolean)
> Statistics: Num rows: 10000 Data size: 1840000 Basic stats: COMPLETE Column stats: PARTIAL
> Map Join Operator
> condition map:
> Inner Join 0 to 1
> keys:
> 0 _col1 (type: interval_day_time), _col0 (type: string)
> 1 _col1 (type: interval_day_time), _col0 (type: string)
> outputColumnNames: _col0, _col1, _col2
> input vertices:
> 0 Map 1
> Statistics: Num rows: 344 Data size: 95632 Basic stats: COMPLETE Column stats: PARTIAL
> HybridGraceHashJoin: true
> Select Operator
> expressions: _col0 (type: string), _col2 (type: string), _col1 (type: interval_day_time)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 344 Data size: 95632 Basic stats: COMPLETE Column stats: PARTIAL
> File Output Operator
> compressed: false
> Statistics: Num rows: 344 Data size: 95632 Basic stats: COMPLETE Column stats: PARTIAL
> table:
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> Execution mode: vectorized
> Stage: Stage-0
> Fetch Operator
> limit: -1
> Processor Tree:
> ListSink
> Time taken: 0.402 seconds, Fetched: 91 row(s)
> {noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)