hadoop-hive-dev mailing list archives

From "David Phillips (JIRA)" <j...@apache.org>
Subject [jira] Created: (HIVE-207) Change SerDe API to allow skipping unused columns
Date Mon, 05 Jan 2009 17:21:44 GMT
Change SerDe API to allow skipping unused columns

                 Key: HIVE-207
                 URL: https://issues.apache.org/jira/browse/HIVE-207
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Query Processor, Serializers/Deserializers
            Reporter: David Phillips

A deserializer shouldn't have to deserialize columns that are never used by the query processor.
A serializer shouldn't have to examine unused columns that are known to always be null.

As an example, we store data as a Protocol Buffer structure with ~60 fields.  Running a
"select count(1)" currently requires deserializing all fields, which includes checking whether
each field exists and formatting the data appropriately.  This is expensive and unnecessary.
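
To make the cost concrete, here is a minimal sketch of the idea in Python. It is not Hive's SerDe API (which is Java); the class CountingDeserializer, the `needed` parameter, and the field names are all hypothetical, and the dictionary record only stands in for a Protocol Buffer structure. It shows a deserializer that is told which columns the query needs and leaves the rest as null without touching them.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical record: field name -> encoded value. This stands in for a
# ~60-field structure; it is not the Protocol Buffer wire format.
Record = Dict[str, str]


class CountingDeserializer:
    """Toy deserializer that counts how many fields it actually decodes."""

    def __init__(self, columns: Sequence[str]) -> None:
        self.columns = list(columns)
        self.fields_decoded = 0

    def _decode(self, raw: Record, column: str) -> Optional[str]:
        # Stand-in for the per-field work: existence check plus formatting.
        self.fields_decoded += 1
        value = raw.get(column)
        return None if value is None else value.strip()

    def deserialize(
        self, raw: Record, needed: Optional[Sequence[str]] = None
    ) -> List[Optional[str]]:
        """Return one row; columns outside `needed` stay None, undecoded.

        needed=None means "all columns", modelling the behaviour described
        in this issue. (Assumed interface, for illustration only.)
        """
        wanted = set(self.columns) if needed is None else set(needed)
        return [self._decode(raw, c) if c in wanted else None
                for c in self.columns]


columns = [f"field_{i}" for i in range(60)]
raw = {c: f" value_{i} " for i, c in enumerate(columns)}

full = CountingDeserializer(columns)
full.deserialize(raw)                     # every field is decoded

pruned = CountingDeserializer(columns)
row = pruned.deserialize(raw, needed=[])  # count(1) reads no column at all

print(full.fields_decoded, pruned.fields_decoded)
```

In this sketch the query processor would pass the list of referenced columns down to the deserializer; for "select count(1)" that list is empty, so the per-field checks and formatting are skipped entirely.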

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
