hive-user mailing list archives

From Hemanth Meka <hemanth.m...@datametica.com>
Subject Orc memory issue
Date Fri, 18 Dec 2015 09:48:55 GMT
Hi All,

We have a source table and a target table. The source table holds plain text data and has no partitions;
the target table is an ORC table with 5 partition columns, 6000 records, and 1400 partitions.

We are trying to INSERT OVERWRITE the target table with the data from the source table. This
INSERT OVERWRITE command causes a memory exception, which is pasted below.
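For context, the statement has roughly the following shape (table and column names here are
illustrative, not the real ones; the real tables have more columns):

    set hive.exec.dynamic.partition=true;
    set hive.exec.dynamic.partition.mode=nonstrict;

    -- target_orc has 5 partition columns (p1..p5); source_text is the plain text source table
    INSERT OVERWRITE TABLE target_orc PARTITION (p1, p2, p3, p4, p5)
    SELECT c1, c2, c3, p1, p2, p3, p4, p5
    FROM source_text;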

It works if we change the target table from ORC to TEXT, or if we keep it as an ORC table and remove
the partitions.
It also runs fine for 1000 records.

Has anyone faced such an issue with ORC? Any workarounds are much appreciated.

We tried options like the following (applied roughly as in the sketch after this list):

    ("orc.compress"="SNAPPY");
    ("orc.stripe.size"=67108864);
    ("orc.compress.size"=8192);
    set mapreduce.map.memory.mb=4096;
    set mapreduce.reduce.memory.mb=5120;
    using distribute by and sort by
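
Roughly, we applied those options like this (names are again illustrative; we set the ORC
properties as table properties on the target):

    -- ORC writer tuning on the target table
    ALTER TABLE target_orc SET TBLPROPERTIES (
        "orc.compress"="SNAPPY",
        "orc.stripe.size"="67108864",
        "orc.compress.size"="8192");

    -- container memory for the job
    set mapreduce.map.memory.mb=4096;
    set mapreduce.reduce.memory.mb=5120;

    -- distribute/sort by the partition columns so all rows of a partition go to
    -- the same reducer and each reducer keeps fewer ORC writers open at once
    INSERT OVERWRITE TABLE target_orc PARTITION (p1, p2, p3, p4, p5)
    SELECT c1, c2, c3, p1, p2, p3, p4, p5
    FROM source_text
    DISTRIBUTE BY p1, p2, p3, p4, p5
    SORT BY p1, p2, p3, p4, p5;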

PS:

Decreasing the number of partition columns to 3 (with 65 partition values) works fine.

Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.6.0-2800/hadoop/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.2.6.0-2800/hive/lib/hive-jdbc-0.14.0.2.2.6.0-2800-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Query ID = sindhurak_20151218140101_3616d146-a1e6-4676-8865-de15d8b5b7eb
Total jobs = 1
Launching Job 1 out of 1


Status: Running (Executing on YARN cluster with App id application_1449146890024_15817)

--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2             FAILED      1          0        0        1       4       0
--------------------------------------------------------------------------------
VERTICES: 01/02  [=============>>-------------] 50%   ELAPSED TIME: 86.62 s
--------------------------------------------------------------------------------
Status: Failed
Vertex failed, vertexName=Reducer 2, vertexId=vertex_1449146890024_15817_1_01, diagnostics=[Task
failed, taskId=task_1449146890024_15817_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Container
container_e24_1449146890024_15817_01_000002 finished with diagnostics set to [Container failed.
Exception from container-launch.
Container id: container_e24_1449146890024_15817_01_000002
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
]], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException:
java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:172)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:138)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerWriterV2.<init>(RunLengthIntegerWriterV2.java:145)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.createIntegerWriter(WriterImpl.java:642)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.<init>(WriterImpl.java:1058)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.createTreeWriter(WriterImpl.java:1816)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.access$1500(WriterImpl.java:98)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.<init>(WriterImpl.java:1587)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.createTreeWriter(WriterImpl.java:1841)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:195)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:435)
        at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:84)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:689)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:328)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:218)
        at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:168)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:163)
        ... 13 more
], TaskAttempt 2 failed, info=[Container container_e24_1449146890024_15817_01_000004 finished
with diagnostics set to [Container failed. Exception from container-launch.
Container id: container_e24_1449146890024_15817_01_000004
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
]], TaskAttempt 3 failed, info=[Container container_e24_1449146890024_15817_01_000005 finished
with diagnostics set to [Container failed. ]]], Vertex failed as one or more tasks failed.
failedTasks:1, Vertex vertex_1449146890024_15817_1_01 [Reducer 2] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

