hive-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-9451) Add max size of column dictionaries to ORC metadata
Date Fri, 23 Jan 2015 17:14:34 GMT
Owen O'Malley created HIVE-9451:
-----------------------------------

             Summary: Add max size of column dictionaries to ORC metadata
                 Key: HIVE-9451
                 URL: https://issues.apache.org/jira/browse/HIVE-9451
             Project: Hive
          Issue Type: Improvement
            Reporter: Owen O'Malley


To predict the amount of memory required to read an ORC file, we need to know the size of the
dictionaries for the columns being read. I propose adding the size in bytes of each column's
dictionary to the stripe's column statistics. The file's column statistics would then hold the
maximum dictionary size for each column across all stripes.
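
As a rough illustration of the proposal, the sketch below rolls per-stripe dictionary sizes up
into per-file maxima and uses them to bound the dictionary memory for a set of columns. It is
not ORC's API or metadata layout: StripeColumnStats, file_max_dictionary_bytes and
estimate_dictionary_memory are hypothetical names, and it assumes a reader holds the
dictionaries of only one stripe in memory at a time.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class StripeColumnStats:
    """Hypothetical per-stripe statistics: column id -> dictionary size in bytes."""
    dictionary_bytes: Dict[int, int]


def file_max_dictionary_bytes(stripes: List[StripeColumnStats]) -> Dict[int, int]:
    """Roll stripe-level sizes up to the file level by taking each column's maximum."""
    file_max: Dict[int, int] = {}
    for stripe in stripes:
        for column, size in stripe.dictionary_bytes.items():
            file_max[column] = max(file_max.get(column, 0), size)
    return file_max


def estimate_dictionary_memory(file_max: Dict[int, int], columns: Iterable[int]) -> int:
    """Upper bound on dictionary memory for the columns being read.

    Assumes one stripe is resident at a time, so the bound is the sum of the
    per-column maxima. Columns without a dictionary contribute 0 bytes.
    """
    return sum(file_max.get(column, 0) for column in columns)
```

For example, with two stripes whose dictionaries for columns 1 and 2 are {1: 4096, 2: 100} and
{1: 1024, 2: 900} bytes, the file-level maxima are {1: 4096, 2: 900}, and reading both columns
needs at most 4996 bytes of dictionary memory under the assumption above.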



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
