orc-dev mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ORC-350) Optionally disable/specify indexes for columns
Date Fri, 20 Apr 2018 18:17:00 GMT
Prasanth Jayachandran created ORC-350:
-----------------------------------------

             Summary: Optionally disable/specify indexes for columns
                 Key: ORC-350
                 URL: https://issues.apache.org/jira/browse/ORC-350
             Project: ORC
          Issue Type: Sub-task
            Reporter: Prasanth Jayachandran


There are many cases where an entire XML document or a large JSON blob is stored in a string
column. If we autogenerate indexes on those columns, we often run into issues with protobuf
stream explosion. The only workaround for now is to change the column type from string to
binary. It would be good to have an option to disable indexes on specific columns.

Regardless, I think we should have maximum limits on string column statistics. If that limit
is exceeded, PPD (predicate pushdown) should handle it accordingly by returning YES_NO_NULL.
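
A minimal sketch of the proposed behaviour, in Python for brevity rather than ORC's Java. Everything here is an assumption made for illustration: the `MAX_STAT_LENGTH` value, the `StringStats` type, the helper names, and measuring length in characters rather than bytes. None of it is ORC's actual API. It only shows the idea that oversized min/max values are dropped at write time and that the reader then answers YES_NO_NULL, meaning the row group cannot be skipped.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional

# Hypothetical limit; the issue does not propose a specific value.
MAX_STAT_LENGTH = 1024


class TruthValue(Enum):
    # Reduced to the three outcomes this sketch needs.
    YES = "YES"
    NO = "NO"
    YES_NO_NULL = "YES_NO_NULL"


@dataclass
class StringStats:
    # None means "no usable statistic was recorded".
    minimum: Optional[str]
    maximum: Optional[str]


def collect_stats(values: Iterable[Optional[str]],
                  limit: int = MAX_STAT_LENGTH) -> StringStats:
    """Record min/max, but drop both if either exceeds the limit."""
    present = [v for v in values if v is not None]
    if not present:
        return StringStats(None, None)
    lo, hi = min(present), max(present)
    if len(lo) > limit or len(hi) > limit:
        # Oversized values would bloat the index, so store nothing.
        return StringStats(None, None)
    return StringStats(lo, hi)


def evaluate_equals(stats: StringStats, literal: str) -> TruthValue:
    """Evaluate `column = literal` against the statistics only."""
    if stats.minimum is None or stats.maximum is None:
        # Statistics were dropped: the reader cannot rule anything out.
        return TruthValue.YES_NO_NULL
    if literal < stats.minimum or literal > stats.maximum:
        return TruthValue.NO
    # In range: conservatively report "unknown" in this simplified sketch.
    return TruthValue.YES_NO_NULL


small = collect_stats(["apple", "pear"])
big = collect_stats(["<doc>" + "x" * 5000 + "</doc>", "a"])
print(evaluate_equals(small, "zebra"))
print(evaluate_equals(big, "zebra"))
```

With the small column the literal falls outside the recorded range, so the row group can be skipped. With the oversized column no statistic is kept, so the predicate falls back to the conservative answer and the data is read.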



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
