drill-issues mailing list archives

From "Jason Altekruse (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-2031) IndexOutOfBoundException when reading a wide parquet table with boolean columns
Date Fri, 23 Jan 2015 19:25:36 GMT

    [ https://issues.apache.org/jira/browse/DRILL-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289785#comment-14289785 ]

Jason Altekruse commented on DRILL-2031:
----------------------------------------

It is definitely possible, to an extent. The comment in the patch describes the old strategy:
do bulk copies until we hit a condition where that was a problem, then fall back to copying
one byte at a time with a shift. In this case the reader just isn't a large enough performance
bottleneck to justify debugging the complex code that was there. A significant amount of time
has already been invested in the parquet reader, and for now we need to prioritize accuracy.
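
For readers unfamiliar with the technique, here is a minimal sketch of "bulk copy when aligned, otherwise a byte at a time with a shift" for bit-packed boolean data. It is not the code from the patch or from Drill's BitReader. The class name `BitCopySketch`, the method `copyBits`, and the use of plain byte arrays instead of Netty buffers are all assumptions made for illustration. It also assumes bits are packed least-significant-bit first and that the output always starts at bit 0 of `dst`.

```java
final class BitCopySketch {

    /**
     * Copies bitCount bits from src, starting at bit srcBitOffset, into dst
     * starting at bit 0. Hypothetical helper, not part of Drill.
     */
    static void copyBits(byte[] src, int srcBitOffset, byte[] dst, int bitCount) {
        if (srcBitOffset < 0 || bitCount < 0) {
            throw new IllegalArgumentException("negative offset or count");
        }
        int outBytes = (bitCount + 7) >>> 3;          // bytes needed for the output
        long endBit = (long) srcBitOffset + bitCount; // exclusive end in the source

        // Validate up front so we never read or write past either array.
        if (endBit > (long) src.length * 8 || outBytes > dst.length) {
            throw new IndexOutOfBoundsException(
                "srcBitOffset=" + srcBitOffset + ", bitCount=" + bitCount);
        }
        if (bitCount == 0) {
            return;
        }

        int srcByte = srcBitOffset >>> 3; // first source byte touched
        int shift = srcBitOffset & 7;     // bit position inside that byte

        if (shift == 0) {
            // Byte-aligned: a single bulk copy is enough.
            System.arraycopy(src, srcByte, dst, 0, outBytes);
        } else {
            // Unaligned: build each output byte from two neighbouring source bytes.
            for (int i = 0; i < outBytes; i++) {
                int lo = (src[srcByte + i] & 0xFF) >>> shift;
                int next = srcByte + i + 1;
                int hi = next < src.length ? (src[next] & 0xFF) << (8 - shift) : 0;
                dst[i] = (byte) (lo | hi);
            }
        }

        // Clear any bits beyond bitCount in the last output byte.
        int tail = bitCount & 7;
        if (tail != 0) {
            dst[outBytes - 1] &= (byte) ((1 << tail) - 1);
        }
    }

    public static void main(String[] args) {
        byte[] src = {(byte) 0xB4, (byte) 0x03};
        byte[] dst = new byte[2];
        copyBits(src, 2, dst, 10); // start two bits into the first byte
        System.out.printf("%02x %02x%n", dst[0] & 0xFF, dst[1] & 0xFF);
    }
}
```

In this sketch the unaligned path handles every offset, so the aligned bulk copy is purely an optimization. Dropping it trades some speed for a single code path that is easier to verify, which is the trade-off described above.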

> IndexOutOfBoundException when reading a wide parquet table with boolean columns
> -------------------------------------------------------------------------------
>
>                 Key: DRILL-2031
>                 URL: https://issues.apache.org/jira/browse/DRILL-2031
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 0.7.0
>            Reporter: Aman Sinha
>            Assignee: Parth Chandra
>            Priority: Critical
>         Attachments: DRILL-2031-Parquet-bit-reader-fix.patch, wide1.sql
>
>
> I created a wide table with 128 Lineitem columns plus 6 additional boolean columns, for
> a total of 134 columns, via a CTAS script (see attached SQL).  The source data is from TPCH
> scale factor 1 (a smaller scale factor may not reproduce the problem). The table was created
> without error, but reading from it gives an IOBE.  See the stack below.  It seems to occur
> for the boolean columns.
> {code}
> 0: jdbc:drill:zk=local> select * from wide1 where 1=0;
> java.lang.IndexOutOfBoundsException: srcIndex: 97792
> 	io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:255) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
> 	io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:378) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
> 	io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:25) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> 	io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:645) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> 	io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:850) ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
> 	org.apache.drill.exec.store.parquet.columnreaders.BitReader.readField(BitReader.java:54) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> 	org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.readValues(ColumnReader.java:120) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> 	org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.processPageData(ColumnReader.java:169) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> 	org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.determineSize(ColumnReader.java:146) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> 	org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.processPages(ColumnReader.java:107) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> 	org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.readAllFixedFields(ParquetRecordReader.java:367) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> 	org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:413) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> 	org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:158) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
