orc-dev mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ORC-435) Ability to read stripes that are greater than 2GB
Date Fri, 09 Nov 2018 21:36:00 GMT
Prasanth Jayachandran created ORC-435:

             Summary: Ability to read stripes that are greater than 2GB
                 Key: ORC-435
                 URL: https://issues.apache.org/jira/browse/ORC-435
             Project: ORC
          Issue Type: Bug
          Components: Reader
    Affects Versions: 1.5.3, 1.4.4, 1.3.4, 1.6.0
            Reporter: Prasanth Jayachandran
             Fix For: 1.5.4, 1.6.0

The ORC reader fails with a NegativeArraySizeException when a stripe is larger than 2GB.
Although the default stripe size is 64MB, a stripe can grow past 2GB before the memory
manager gets a chance to check memory usage. For example, when inserting 500KB strings
(mostly unique), the stripe is already over 2GB by the time we reach 5000 rows. For such
cases the reader has to read the disk range in chunks instead of reading the whole stripe
as a single blob.
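
A minimal sketch of the chunking idea, not the actual ORC patch. It assumes the failure comes from a stripe length that no longer fits in a Java int (arrays are int-indexed), and it splits one large byte range into pieces that each fit. The class name ChunkedRangeSketch, the method splitRange, and the 1GB chunk size are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedRangeSketch {

    /**
     * Splits the byte range [offset, offset + length) into consecutive
     * {offset, length} pairs, each at most maxChunk bytes long.
     * Hypothetical helper, not part of the ORC API.
     */
    static List<long[]> splitRange(long offset, long length, int maxChunk) {
        if (length < 0 || maxChunk <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and maxChunk > 0");
        }
        List<long[]> chunks = new ArrayList<>();
        long pos = offset;
        long remaining = length;
        while (remaining > 0) {
            // Each chunk size is bounded by maxChunk, so it always fits in an int.
            long size = Math.min(remaining, (long) maxChunk);
            chunks.add(new long[] {pos, size});
            pos += size;
            remaining -= size;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 5000 rows of 500KB strings, as in the report: 2,560,000,000 bytes.
        long stripeLength = 5000L * 500L * 1024L;

        // Narrowing to int wraps to a negative value, which is the kind of
        // size that makes "new byte[size]" throw NegativeArraySizeException.
        System.out.println("stripe length as int: " + (int) stripeLength);

        // Read plan with an assumed 1GB maximum chunk size.
        for (long[] chunk : splitRange(0L, stripeLength, 1 << 30)) {
            System.out.println("offset=" + chunk[0] + " length=" + chunk[1]);
        }
    }
}
```

Each resulting pair can be read into its own buffer, so no single allocation needs more than maxChunk bytes.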

This message was sent by Atlassian JIRA