hive-user mailing list archives

From Vaibhav Gumashta <vgumas...@hortonworks.com>
Subject Re: Tez query failed with OutOfMemoryError: Java heap space
Date Wed, 12 Jul 2017 00:09:22 GMT
Hi Xin,

Can you provide these:

  1.  Output of explain plan
  2.  Output of set -v (this lists all the configs, so you might want to anonymize them)

In addition, it looks like vertex vertex_1495595408051_21107_2_03 (Map 3) failed with OOM.
The Tez counters for that vertex show how much data it received as input, which should
help you narrow down the root cause.

Hope this helps,
—Vaibhav

From: "Yang, Xin" <xiyang@visa.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Thursday, July 6, 2017 at 10:37 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Tez query failed with OutOfMemoryError: Java heap space

Here is the version information:

Hive: 1.2.1
Tez: 0.8.5
Hadoop: 2.6.0-cdh5.8.3

Please let me know if you need more information.

Regards,
Xin

From: "Yang, Xin" <xiyang@visa.com>
Date: Thursday, June 29, 2017 at 11:48 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Tez query failed with OutOfMemoryError: Java heap space

Hi,

We ran a Tez query and it failed with an OOM error. We then computed stats and ran it again,
but it still failed with the same OOM.

Settings:

set hive.tez.container.size=4096;
set tez.am.resource.memory.mb=1024;
set hive.tez.java.opts=-Xmx3276m;

set hive.tez.dynamic.partition.pruning=false;
set hive.tez.dynamic.partition.pruning.max.event.size=1048576;
set hive.tez.dynamic.partition.pruning.max.data.size=104857600;

set hive.prewarm.enabled=true;
set hive.prewarm.numcontainers=10;

set tez.am.container.reuse.enabled=true;

set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;

set hive.auto.convert.join=true;
set hive.auto.convert.join.noconditionaltask=true;
set hive.auto.convert.join.noconditionaltask.size=20971520;
set hive.mapjoin.hybridgrace.hashtable=false;
set hive.optimize.bucketmapjoin.sortedmerge=false;
set hive.map.aggr.hash.percentmemory=0.5;
set hive.map.aggr=true;

set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;
set hive.vectorized.execution.reduce.groupby.enabled=false;

set hive.exec.parallel=true;
set hive.exec.parallel.thread.number=16;

set hive.exec.reducers.max=800;
set hive.optimize.reducededuplication=true;
set hive.optimize.reducededuplication.min.reducer=4;

set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=false;
set hive.merge.smallfiles.avgsize=16000000;
set hive.merge.size.per.task=256000000;
set hive.smbjoin.cache.rows=10000;
set hive.fetch.task.conversion=more;
set hive.optimize.sort.dynamic.partition=true;

set hive.tez.auto.reducer.parallelism=true;
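
For reference, the memory-related settings above can be sanity-checked with a little arithmetic. The following is a minimal Python sketch, not part of the original report: the helper name parse_settings is hypothetical, and it assumes hive.tez.container.size is given in MB and hive.auto.convert.join.noconditionaltask.size in bytes.

```python
import re

# The three memory-related settings copied from the list above.
SETTINGS = """
set hive.tez.container.size=4096;
set hive.tez.java.opts=-Xmx3276m;
set hive.auto.convert.join.noconditionaltask.size=20971520;
"""

def parse_settings(text):
    """Parse 'set key=value;' lines into a dict (hypothetical helper)."""
    return dict(re.findall(r"set\s+([\w.]+)=([^;]+);", text))

conf = parse_settings(SETTINGS)

# Assumption: container size is expressed in MB.
container_mb = int(conf["hive.tez.container.size"])
# Extract the -Xmx value (in MB) from the JVM options string.
heap_mb = int(re.search(r"-Xmx(\d+)m", conf["hive.tez.java.opts"]).group(1))
# Assumption: the map-join conversion threshold is expressed in bytes.
mapjoin_mb = int(conf["hive.auto.convert.join.noconditionaltask.size"]) / (1024 * 1024)

heap_fraction = heap_mb / container_mb
print(f"heap = {heap_mb} MB of a {container_mb} MB container ({heap_fraction:.0%})")
print(f"map-join conversion threshold = {mapjoin_mb:.0f} MB")
```

So the heap is about 80% of the container, and the map-join threshold is 20 MB. That threshold is compared against the optimizer's size estimate, so the hash table actually built in memory may be larger than the setting suggests.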

Stacktrace:

Status: Failed
Vertex failed, vertexName=Map 3, vertexId=vertex_1495595408051_21107_2_03, diagnostics=[Task
failed, taskId=task_1495595408051_21107_2_03_000000, diagnostics=[TaskAttempt 0 failed, info=[Error:
exceptionThrown=java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
        at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
        at org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput.<init>(MemoryFetchedInput.java:38)
        at org.apache.tez.runtime.library.common.shuffle.impl.SimpleFetchedInputAllocator.allocate(SimpleFetchedInputAllocator.java:141)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:717)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:489)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:398)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:195)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:70)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
, errorMessage=Fetch failed:java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:56)
        at org.apache.hadoop.io.BoundedByteArrayOutputStream.<init>(BoundedByteArrayOutputStream.java:46)
        at org.apache.tez.runtime.library.common.shuffle.MemoryFetchedInput.<init>(MemoryFetchedInput.java:38)
        at org.apache.tez.runtime.library.common.shuffle.impl.SimpleFetchedInputAllocator.allocate(SimpleFetchedInputAllocator.java:141)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.fetchInputs(Fetcher.java:717)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:489)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.doHttpFetch(Fetcher.java:398)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:195)
        at org.apache.tez.runtime.library.common.shuffle.Fetcher.callInternal(Fetcher.java:70)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException:
java.lang.RuntimeException: Map operator initialization failed
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
        ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException:
java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:388)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:378)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
        at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
        at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
        ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap
space
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:386)
        ... 20 more
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hive.serde2.WriteBuffers.nextBufferToWrite(WriteBuffers.java:241)
        at org.apache.hadoop.hive.serde2.WriteBuffers.write(WriteBuffers.java:217)
        at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer$LazyBinaryKvWriter.writeKey(MapJoinBytesTableContainer.java:235)
        at org.apache.hadoop.hive.ql.exec.persistence.BytesBytesMultiHashMap.put(BytesBytesMultiHashMap.java:445)
        at org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer.putRow(MapJoinBytesTableContainer.java:365)
        at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:191)
        at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:288)
        at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:173)
        at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:169)
        at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
        at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
        ... 4 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException:
java.lang.RuntimeException: Map operator initialization failed
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
        ... 14 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1495595408051_21107_2_03 [Map 3] killed/failed due to:null]
Vertex killed, vertexName=Reducer 7, vertexId=vertex_1495595408051_21107_2_06, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex vertex_1495595408051_21107_2_06 [Reducer 7] killed/failed due to:null]
Vertex killed, vertexName=Map 6, vertexId=vertex_1495595408051_21107_2_05, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1495595408051_21107_2_05 [Map 6] killed/failed due to:null]
Vertex killed, vertexName=Map 5, vertexId=vertex_1495595408051_21107_2_04, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1495595408051_21107_2_04 [Map 5] killed/failed due to:null]
Vertex killed, vertexName=Map 1, vertexId=vertex_1495595408051_21107_2_02, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:41, Vertex vertex_1495595408051_21107_2_02 [Map 1] killed/failed due to:null]
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:4
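
The vertex summary at the end of the trace separates the vertex that actually failed from the ones that were only killed as a consequence. A minimal Python sketch of pulling that out of the diagnostics text follows. The regular expression and the Vertex tuple are illustrative assumptions, not a Tez API, and the Map 3 task stack traces are replaced by a placeholder in the sample string.

```python
import re
from collections import namedtuple

Vertex = namedtuple("Vertex", "status name vertex_id cause failed killed")

# Abbreviated copy of the vertex summary above; the per-task stack traces
# of Map 3 are replaced by a placeholder.
DIAGNOSTICS = (
    "Vertex failed, vertexName=Map 3, vertexId=vertex_1495595408051_21107_2_03, "
    "diagnostics=[<task stack traces omitted>], Vertex did not succeed due to "
    "OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex "
    "vertex_1495595408051_21107_2_03 [Map 3] killed/failed due to:null]"
    "Vertex killed, vertexName=Reducer 7, vertexId=vertex_1495595408051_21107_2_06, "
    "diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not "
    "succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:2, Vertex "
    "vertex_1495595408051_21107_2_06 [Reducer 7] killed/failed due to:null]"
    "Vertex killed, vertexName=Map 1, vertexId=vertex_1495595408051_21107_2_02, "
    "diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not "
    "succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:41, Vertex "
    "vertex_1495595408051_21107_2_02 [Map 1] killed/failed due to:null]"
)

# One match per "Vertex failed/killed, ..." entry, up to its task counts.
VERTEX_RE = re.compile(
    r"Vertex (failed|killed), vertexName=([^,]+), vertexId=(vertex_[0-9_]+)"
    r".*?due to (\w+), failedTasks:(\d+) killedTasks:(\d+)",
    re.S,
)

vertices = [
    Vertex(status, name, vid, cause, int(failed), int(killed))
    for status, name, vid, cause, failed, killed in VERTEX_RE.findall(DIAGNOSTICS)
]

# Vertices that failed on their own tasks, as opposed to being killed
# because another vertex failed.
root_causes = [v for v in vertices if v.cause == "OWN_TASK_FAILURE"]

for v in vertices:
    print(f"{v.name}: {v.status} ({v.cause}), failed={v.failed}, killed={v.killed}")
```

Here only Map 3 is reported with OWN_TASK_FAILURE; the other vertices were killed because of it, so Map 3 is the one to investigate.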

Please take a look. Thanks.

Regards,
Xin

