Subject: Re: ORC file in Hive 0.13 throws Java heap space error
From: Premal Shah
To: user@hive.apache.org
Date: Fri, 16 May 2014 10:37:52 -0700

Sorry for the double post. The first message did not show up for a while and then I could not get to the archives page, so I thought I needed to resend.

On Fri, May 16, 2014 at 12:54 AM, Premal Shah wrote:

> I have a table in Hive stored as a text file with 3283 columns. All
> columns are of string data type.
>
> I'm trying to convert that table into an ORC table using this command:
>
>     create table orc_table stored as orc as select * from text_table;
>
> This is the setting in mapred-site.xml:
>
>     <property>
>       <name>mapred.child.java.opts</name>
>       <value>-Xmx4G -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -verbose:gc -Xloggc:/mnt/hadoop/@taskid@.gc</value>
>       <final>true</final>
>     </property>
>
> The tasks die with this error:
>
>     2014-05-16 00:53:42,424 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
>         at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>         at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>         at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewOutputBuffer(OutStream.java:117)
>         at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:168)
>         at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
>         at org.apache.hadoop.hive.ql.io.orc.RunLengthByteWriter.flush(RunLengthByteWriter.java:58)
>         at org.apache.hadoop.hive.ql.io.orc.BitFieldWriter.flush(BitFieldWriter.java:44)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:553)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1012)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$ListTreeWriter.writeStripe(WriterImpl.java:1455)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1400)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1780)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.checkMemory(WriterImpl.java:221)
>         at org.apache.hadoop.hive.ql.io.orc.MemoryManager.notifyWriters(MemoryManager.java:168)
>         at org.apache.hadoop.hive.ql.io.orc.MemoryManager.addedRow(MemoryManager.java:157)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:2028)
>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:86)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:622)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>         at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:87)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>         at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:92)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:793)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:540)
>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:177)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>
> This is the GC output for a task that ran out of memory:
>
>     0.690: [GC 17024K->768K(83008K), 0.0019170 secs]
>     0.842: [GC 8488K(83008K), 0.0066800 secs]
>     1.031: [GC 17792K->1481K(83008K), 0.0015400 secs]
>     1.352: [GC 17142K(83008K), 0.0041840 secs]
>     1.371: [GC 18505K->2249K(83008K), 0.0097240 secs]
>     34.779: [GC 28384K(4177280K), 0.0014050 secs]
>
> Anything I can tweak to make it work?
>
> --
> Regards,
> Premal Shah.

--
Regards,
Premal Shah.
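[Archive editor's note, not part of the original thread.] The stack trace points at the ORC writer's stream buffers: the writer keeps in-memory compression buffers per output stream, and a 3283-column table creates thousands of streams, so the Hive 0.13 defaults (256 KB per-stream buffers, 256 MB stripes) can exhaust even a 4 GB task heap before the first stripe flushes. A hedged sketch of the session settings Hive 0.13 exposed for shrinking the writer's footprint; the values below are illustrative, not tuned for this cluster, and should be checked against your build:

```sql
-- Assumed mitigation sketch, not taken from this thread.
SET hive.exec.orc.memory.pool=0.5;               -- fraction of task heap ORC writers may use
SET hive.exec.orc.default.buffer.size=32768;     -- per-stream buffer; default 262144 (256 KB)
SET hive.exec.orc.default.stripe.size=67108864;  -- 64 MB stripes; default 268435456 (256 MB)

CREATE TABLE orc_table
STORED AS ORC
TBLPROPERTIES ("orc.compress" = "ZLIB", "orc.compress.size" = "32768")
AS SELECT * FROM text_table;
```

Even at 32 KB, several streams per string column across 3283 columns still amounts to hundreds of megabytes per writer, so a smaller buffer may need to be combined with a larger -Xmx in mapred.child.java.opts.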