drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4764) Parquet file with INT_16, etc. logical types not supported by simple SELECT
Date Fri, 02 Dec 2016 18:34:58 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15715899#comment-15715899 ]

ASF GitHub Bot commented on DRILL-4764:
---------------------------------------

Github user parthchandra commented on a diff in the pull request:

    https://github.com/apache/drill/pull/673#discussion_r90694144
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ParquetFixedWidthDictionaryReaders.java ---
    @@ -56,6 +58,31 @@ protected void readField(long recordsToReadInThisPass) {
         }
       }
     
    +  /**
    +   * This class is used for reading unsigned integer fields.
    +   */
    +  static class DictionaryUInt4Reader extends FixedByteAlignedReader<UInt4Vector> {
    +    DictionaryUInt4Reader(ParquetRecordReader parentReader, int allocateSize, ColumnDescriptor descriptor,
    +                        ColumnChunkMetaData columnChunkMetaData, boolean fixedLength, UInt4Vector v,
    +                        SchemaElement schemaElement) throws ExecutionSetupException {
    +      super(parentReader, allocateSize, descriptor, columnChunkMetaData, fixedLength, v, schemaElement);
    +    }
    +
    +    // this method is called by its superclass during a read loop
    +    @Override
    +    protected void readField(long recordsToReadInThisPass) {
    +
    +      recordsReadInThisIteration = Math.min(pageReader.currentPageCount
    +        - pageReader.valuesRead, recordsToReadInThisPass - valuesReadInCurrentPass);
    +
    +      if (usingDictionary) {
    +        for (int i = 0; i < recordsReadInThisIteration; i++){
    +          valueVec.getMutator().setSafe(valuesReadInCurrentPass + i, pageReader.dictionaryValueReader.readInteger());
    --- End diff --
    
    You probably need to set the writer index here, just as you do in the uint8 case.
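
    To illustrate the concern in the comment above, here is a minimal, self-contained sketch. It does not reproduce Drill's code. `ToyBuffer` and `readTwoPages` are hypothetical names, and the sketch assumes the vector's backing buffer behaves like a Netty-style buffer, where absolute set calls leave the writer index untouched and relative write calls append at it. Under that assumption, a dictionary-encoded page written with absolute sets must advance the writer index itself, or a following plain-encoded page that appends at the writer index will overwrite the earlier values.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class WriterIndexSketch {
    /** Toy stand-in for a value vector's backing buffer (hypothetical, not a Drill class). */
    static final class ToyBuffer {
        private final ByteBuffer data;
        private int writerIndex = 0;

        ToyBuffer(int capacityBytes) { data = ByteBuffer.allocate(capacityBytes); }

        // Absolute write: stores the value at a slot but does NOT move writerIndex.
        void setInt(int valueIndex, int value) { data.putInt(valueIndex * 4, value); }

        // Relative write: stores the value at writerIndex, then advances it.
        void writeInt(int value) { data.putInt(writerIndex, value); writerIndex += 4; }

        int getInt(int valueIndex) { return data.getInt(valueIndex * 4); }
        int writerIndex() { return writerIndex; }
        void writerIndex(int newIndex) { writerIndex = newIndex; }
    }

    /** Reads a dictionary page followed by a plain page and returns the four stored values. */
    static int[] readTwoPages(boolean advanceWriterIndex) {
        ToyBuffer buf = new ToyBuffer(16);
        int[] dictionaryPage = {10, 20};
        int[] plainPage = {30, 40};

        // Page 1 (dictionary-encoded): values are placed with absolute sets.
        for (int i = 0; i < dictionaryPage.length; i++) {
            buf.setInt(i, dictionaryPage[i]);
        }
        if (advanceWriterIndex) {
            // The step the review comment asks for: account for the bytes just written.
            buf.writerIndex(buf.writerIndex() + dictionaryPage.length * 4);
        }

        // Page 2 (plain-encoded): values are appended at the writer index.
        for (int v : plainPage) {
            buf.writeInt(v);
        }
        return new int[] {buf.getInt(0), buf.getInt(1), buf.getInt(2), buf.getInt(3)};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(readTwoPages(false))); // [30, 40, 0, 0]
        System.out.println(Arrays.toString(readTwoPages(true)));  // [10, 20, 30, 40]
    }
}
```

    Without the index update, the second page lands on top of the first. With it, all four values survive in order.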


> Parquet file with INT_16, etc. logical types not supported by simple SELECT
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-4764
>                 URL: https://issues.apache.org/jira/browse/DRILL-4764
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 1.6.0
>            Reporter: Paul Rogers
>            Assignee: Parth Chandra
>         Attachments: int_16.parquet, int_8.parquet, uint_16.parquet, uint_32.parquet, uint_8.parquet
>
>
> Create a Parquet file with the following schema:
> message int16Data { required int32 index; required int32 value (INT_16); }
> Store it as int_16.parquet in the local file system. Query it with:
> SELECT * from `local`.`root`.`int_16.parquet`;
> The result, in the web UI, is this error:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: UnsupportedOperationException: unsupported type: INT32 INT_16 Fragment 0:0 [Error Id: c63f66b4-e5a9-4a35-9ceb-546b74645dd4 on 172.30.1.28:31010]
> The INT_16 logical (or "original") type simply tells consumers of the file that the data is actually a 16-bit signed int. Presumably, this should tell Drill to use the SmallIntVector (or NullableSmallIntVector) class for storage. Without support for this annotation, even 16-bit integers must be stored as 32 bits within Drill.
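
The storage argument in the description can be sketched as follows. This is an illustration only, not Drill's implementation: `LogicalTypeSketch`, `Annotation`, `bytesPerValue`, and `narrowToInt16` are hypothetical names, and the enum covers only a small subset of the annotations that can decorate an int32 column. It shows how a reader that honors INT_16 could narrow each physical 32-bit value to 16 bits, halving the per-value storage.

```java
public class LogicalTypeSketch {
    /** Illustrative subset of annotations on an int32 column (hypothetical enum). */
    enum Annotation { NONE, INT_8, INT_16 }

    /** Bytes per value a reader could use in memory once it honors the annotation. */
    static int bytesPerValue(Annotation annotation) {
        switch (annotation) {
            case INT_8:  return 1;
            case INT_16: return 2;
            default:     return 4; // no annotation: keep the physical int32 width
        }
    }

    /** Narrows a physical int32 value to 16 bits, rejecting values outside the INT_16 range. */
    static short narrowToInt16(int physicalValue) {
        if (physicalValue < Short.MIN_VALUE || physicalValue > Short.MAX_VALUE) {
            throw new IllegalArgumentException("value out of INT_16 range: " + physicalValue);
        }
        return (short) physicalValue;
    }

    public static void main(String[] args) {
        int rows = 1000;
        System.out.println(rows * bytesPerValue(Annotation.NONE));   // bytes without the annotation
        System.out.println(rows * bytesPerValue(Annotation.INT_16)); // bytes when INT_16 is honored
        System.out.println(narrowToInt16(-12345));
    }
}
```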



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
