drill-issues mailing list archives

From "Hari Sekhon (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3525) Drill proper DESCRIBE support for Parquet
Date Tue, 21 Jul 2015 11:42:05 GMT
Hari Sekhon created DRILL-3525:

             Summary: Drill proper DESCRIBE support for Parquet
                 Key: DRILL-3525
                 URL: https://issues.apache.org/jira/browse/DRILL-3525
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata, Storage - Parquet
    Affects Versions: 1.1.0
            Reporter: Hari Sekhon
            Assignee: Steven Phillips

Request to add full DESCRIBE support for Parquet.

Currently, the DESCRIBE command prints a blank table instead of the schema, which is
unhelpful, so as a workaround I run a select * limit 1 instead.

While describing a large amount of Parquet data could be inefficient, I propose the following:

By default, read the first Parquet file and assume that is the schema. Extend the DESCRIBE
command to accept a user-configurable number of Parquet files to read, so that it can present
a merged schema for the data source, and add an ALL keyword that scans all Parquet files to
create a true global schema.
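
To make the proposal concrete, here is a minimal sketch of the merging logic in Python. It is not Drill code and does not read Parquet files. It assumes each file's schema has already been extracted into a dict of column name to type name. The function name merged_schema, the max_files parameter, and the policy of reporting conflicting types as "ANY" are all hypothetical choices made for illustration.

```python
from typing import Dict, Iterable, Optional

# Hypothetical representation: column name -> type name for one Parquet file.
Schema = Dict[str, str]


def merged_schema(file_schemas: Iterable[Schema],
                  max_files: Optional[int] = 1) -> Schema:
    """Merge per-file schemas in file order.

    max_files=1 mimics "read the first file and assume that is the schema",
    max_files=N reads the first N files, and max_files=None stands in for
    the proposed ALL keyword.
    """
    merged: Schema = {}
    for index, schema in enumerate(file_schemas):
        if max_files is not None and index >= max_files:
            break
        for column, type_name in schema.items():
            existing = merged.get(column)
            if existing is None:
                # First time this column is seen: record its type.
                merged[column] = type_name
            elif existing != type_name:
                # Assumed conflict policy: flag the column instead of
                # picking one of the disagreeing types.
                merged[column] = "ANY"
    return merged


files = [
    {"id": "BIGINT", "name": "VARCHAR"},
    {"id": "BIGINT", "email": "VARCHAR"},
    {"id": "VARCHAR"},
]
first_only = merged_schema(files)          # default: first file only
first_two = merged_schema(files, 2)        # user-configurable file count
everything = merged_schema(files, None)    # stand-in for ALL
```

With the default, only the columns of the first file are reported. Reading two files adds the email column, and scanning everything also surfaces the type disagreement on id.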

This message was sent by Atlassian JIRA
