drill-issues mailing list archives

From "Jason Altekruse (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-2954) OOM: CTAS from JSON to Parquet on a single wide row JSON file
Date Mon, 04 May 2015 22:44:07 GMT

     [ https://issues.apache.org/jira/browse/DRILL-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Altekruse updated DRILL-2954:
-----------------------------------
    Component/s:     (was: Execution - Data Types)
                 Storage - Writer
                 Storage - Parquet

> OOM: CTAS from JSON to Parquet on a single wide row JSON file
> -------------------------------------------------------------
>
>                 Key: DRILL-2954
>                 URL: https://issues.apache.org/jira/browse/DRILL-2954
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet, Storage - Writer
>    Affects Versions: 0.9.0
>            Reporter: Chun Chang
>            Assignee: Daniel Barclay (Drill)
>            Priority: Blocker
>         Attachments: singlewide.json
>
>
> #Generated by Git-Commit-Id-Plugin
> #Sun May 03 18:33:43 EDT 2015
> git.commit.id.abbrev=10833d2
> The input is a JSON file containing a single row with a nested structure about 5 levels deep.
> The file is about 3.8M, so that single row is about 3.8M.
> Converting this file to Parquet with CTAS causes the drillbit to run out of memory quickly.
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexP> create table `singlewide.json` as select * from dfs.`/drill/testdata/complex/json/singlewide.json`;
> Query failed: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> Fragment 0:0
> [c6ec52c8-8307-4313-97c8-b9da9e3125e5 on qa-node119.qa.lab:31010]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> {code}
> drillbit log:
> {code}
> 2015-05-04 14:20:13,071 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:foreman] INFO  o.a.drill.exec.work.foreman.Foreman
- State change requested.  PENDING --> RUNNING
> 2015-05-04 14:20:13,254 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor
- 2ab81d73-3343-1627-1f34-bb4e88bb4c0c:0:0: State change requested from AWAITING_ALLOCATION
--> RUNNING for
> 2015-05-04 14:20:13,255 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.d.e.w.f.AbstractStatusReporter
- State changed for 2ab81d73-3343-1627-1f34-bb4e88bb4c0c:0:0. New state: RUNNING
> 2015-05-04 14:20:45,486 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.d.c.e.DrillRuntimeException
- User Error Occurred
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran
out of memory while executing the query.
> [c6ec52c8-8307-4313-97c8-b9da9e3125e5 ]
> 	at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:465)
~[drill-common-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:210)
[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-0.9.0-rebuffed.jar:0.9.0]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
> 	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
> 	at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_45]
> 	at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.7.0_45]
> 	at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) ~[na:1.7.0_45]
> 	at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
> 	at io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer(PooledByteBufAllocatorL.java:46)
~[drill-java-exec-0.9.0-rebuffed.jar:4.0.24.Final]
> 	at io.netty.buffer.PooledByteBufAllocatorL.directBuffer(PooledByteBufAllocatorL.java:66)
~[drill-java-exec-0.9.0-rebuffed.jar:4.0.24.Final]
> 	at org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:227)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:234)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.store.parquet.ParquetDirectByteBufferAllocator.allocate(ParquetDirectByteBufferAllocator.java:45)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at parquet.bytes.CapacityByteArrayOutputStream.allocateSlab(CapacityByteArrayOutputStream.java:70)
~[parquet-encoding-1.6.0rc3-drill-r0.1.jar:1.6.0rc3-drill-r0.1]
> 	at parquet.bytes.CapacityByteArrayOutputStream.initSlabs(CapacityByteArrayOutputStream.java:84)
~[parquet-encoding-1.6.0rc3-drill-r0.1.jar:1.6.0rc3-drill-r0.1]
> 	at parquet.bytes.CapacityByteArrayOutputStream.<init>(CapacityByteArrayOutputStream.java:65)
~[parquet-encoding-1.6.0rc3-drill-r0.1.jar:1.6.0rc3-drill-r0.1]
> 	at parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter.<init>(ColumnChunkPageWriteStore.java:76)
~[parquet-hadoop-1.6.0rc3-drill-r0.1.jar:1.6.0rc3-drill-r0.1]
> 	at parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter.<init>(ColumnChunkPageWriteStore.java:51)
~[parquet-hadoop-1.6.0rc3-drill-r0.1.jar:1.6.0rc3-drill-r0.1]
> 	at parquet.hadoop.ColumnChunkPageWriteStore.<init>(ColumnChunkPageWriteStore.java:235)
~[parquet-hadoop-1.6.0rc3-drill-r0.1.jar:1.6.0rc3-drill-r0.1]
> 	at parquet.hadoop.ColumnChunkPageWriteStoreExposer.newColumnChunkPageWriteStore(ColumnChunkPageWriteStoreExposer.java:39)
~[drill-java-exec-0.9.0-rebuffed.jar:1.6.0rc3-drill-r0.1]
> 	at org.apache.drill.exec.store.parquet.ParquetRecordWriter.newSchema(ParquetRecordWriter.java:158)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.store.parquet.ParquetRecordWriter.updateSchema(ParquetRecordWriter.java:142)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.physical.impl.WriterRecordBatch.setupNewSchema(WriterRecordBatch.java:162)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:113)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:144)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:101)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:91)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:130)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:144)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:74) ~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:80)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:64) ~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:199)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:193)
~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at java.security.AccessController.doPrivileged(Native Method) ~[na:1.7.0_45]
> 	at javax.security.auth.Subject.doAs(Subject.java:415) ~[na:1.7.0_45]
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1469)
~[hadoop-common-2.4.1-mapr-1408.jar:na]
> 	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:193)
[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	... 4 common frames omitted
> 2015-05-04 14:20:45,487 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor
- 2ab81d73-3343-1627-1f34-bb4e88bb4c0c:0:0: State change requested from RUNNING --> FAILED
for
> 2015-05-04 14:20:45,691 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor
- 2ab81d73-3343-1627-1f34-bb4e88bb4c0c:0:0: State change requested from FAILED --> FAILED
for
> 2015-05-04 14:20:45,692 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor
- 2ab81d73-3343-1627-1f34-bb4e88bb4c0c:0:0: State change requested from FAILED --> FAILED
for
> 2015-05-04 14:20:45,701 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.drill.exec.work.foreman.Foreman
- State change requested.  RUNNING --> FAILED
> org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One or more nodes
ran out of memory while executing the query.
> Fragment 0:0
> [c6ec52c8-8307-4313-97c8-b9da9e3125e5 on qa-node119.qa.lab:31010]
> 	at org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:409)
[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:389)
[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:90)
[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:86)
[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:266)
[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:232)
[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
> 	at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-0.9.0-rebuffed.jar:0.9.0]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
> 	at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> 2015-05-04 14:20:45,721 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.d.e.w.fragment.FragmentExecutor
- 2ab81d73-3343-1627-1f34-bb4e88bb4c0c:0:0: State change requested from FAILED --> CANCELLATION_REQUESTED
for
> 2015-05-04 14:20:45,722 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] WARN  o.a.d.e.w.fragment.FragmentExecutor
- Ignoring unexpected state transition FAILED => CANCELLATION_REQUESTED.
> 2015-05-04 14:20:45,722 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.drill.exec.work.foreman.Foreman
- foreman cleaning up.
> 2015-05-04 14:20:45,722 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  o.a.drill.exec.work.foreman.Foreman
- State change requested.  FAILED --> COMPLETED
> 2015-05-04 14:20:45,722 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] WARN  o.a.drill.exec.work.foreman.Foreman
- Dropping request to move to COMPLETED state as query is already at FAILED state (which is
terminal).
> {code}
> Drillbit resident memory (RES in the top output) went up to 9.3G:
> {code}
> Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.3%us,  0.0%sy,  0.0%ni, 99.5%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:  49416400k total, 33292808k used, 16123592k free,   198656k buffers
> Swap: 52428796k total,        0k used, 52428796k free,  1615500k cached
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  5546 mapr      20   0 14.2g 9.3g  38m S  0.3 19.8   1:48.15 java
> {code}
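
A note on reading the trace above: the OutOfMemoryError is raised from ColumnChunkPageWriter.<init>, reached through ParquetRecordWriter.newSchema, so the failing allocation happens while the writer is being set up for the schema. Parquet stores each leaf field as its own column, so one rough way to gauge how wide the row is would be to count its distinct leaf paths. The Python sketch below does that. It is an illustration only, not Drill's code: the function names and the bytes_per_column parameter are hypothetical, and the report does not state the initial buffer size per column in this build.

```python
import json


def leaf_paths(value, prefix=""):
    """Return the set of distinct dotted paths to scalar leaves in parsed JSON."""
    if isinstance(value, dict):
        paths = set()
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else key
            paths |= leaf_paths(child, child_prefix)
        return paths
    if isinstance(value, list):
        # Array elements share the path of the array itself.
        paths = set()
        for item in value:
            paths |= leaf_paths(item, prefix)
        return paths
    # Scalars (str, number, bool, null) are leaves.
    return {prefix}


def estimate_initial_bytes(row, bytes_per_column):
    """Rough estimate: one buffer of bytes_per_column (an assumed value) per leaf."""
    return len(leaf_paths(row)) * bytes_per_column


if __name__ == "__main__":
    sample = json.loads('{"a": 1, "b": {"c": [1, 2], "d": {"e": "x"}}}')
    print(sorted(leaf_paths(sample)))
```

Running leaf_paths on the parsed contents of the attached singlewide.json would give a leaf count for that file; the count is not reported in this issue.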



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
