hive-dev mailing list archives

From "Owen O'Malley (JIRA)"
Subject [jira] [Created] (HIVE-13232) Aggressively drop compression buffers in ORC OutStreams
Date Tue, 08 Mar 2016 19:07:41 GMT
Owen O'Malley created HIVE-13232:

             Summary: Aggressively drop compression buffers in ORC OutStreams
                 Key: HIVE-13232
             Project: Hive
          Issue Type: Bug
          Components: ORC
            Reporter: Owen O'Malley
            Assignee: Owen O'Malley

In Hive 0.11, when ORC's OutStreams were flushed, they dropped all of their buffers. In
the patch for HIVE-4342, we inadvertently changed that behavior so that one of the buffers
is held on to. For queries with a lot of writers, and thus under significant memory pressure,
this can noticeably increase memory usage.
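
To make the effect concrete, here is a minimal toy model in Java. It is not the real ORC OutStream code: the class names, the two-buffer layout, and the choice of which buffer survives a flush are all assumptions made for illustration. It only shows how one retained buffer per idle stream adds up across many writers.

```java
import java.nio.ByteBuffer;

// Hypothetical demo driver; compares retained memory under both flush policies.
class BufferRetentionDemo {
    // Total bytes still held by `writers` streams after each has flushed once.
    static long retainedAfterFlush(int writers, int bufferSize, boolean dropAll) {
        long total = 0;
        for (int i = 0; i < writers; i++) {
            SketchOutStream s = new SketchOutStream(bufferSize, dropAll);
            s.write((byte) 1);
            s.flush();
            total += s.retainedBytes();
        }
        return total;
    }

    public static void main(String[] args) {
        int writers = 100;
        int bufferSize = 64 * 1024;
        System.out.println("holding one buffer: "
                + retainedAfterFlush(writers, bufferSize, false) + " bytes");
        System.out.println("dropping all:       "
                + retainedAfterFlush(writers, bufferSize, true) + " bytes");
    }
}

// Toy stand-in for a buffered, compressing output stream (assumed design).
class SketchOutStream {
    private final int bufferSize;
    private final boolean dropAllOnFlush;
    private ByteBuffer current;    // uncompressed bytes being accumulated
    private ByteBuffer compressed; // scratch space for "compressed" output
    private long bytesEmitted;

    SketchOutStream(int bufferSize, boolean dropAllOnFlush) {
        this.bufferSize = bufferSize;
        this.dropAllOnFlush = dropAllOnFlush;
    }

    void write(byte b) {
        if (current == null) {
            current = ByteBuffer.allocate(bufferSize); // allocate lazily
        }
        if (!current.hasRemaining()) {
            spill();
        }
        current.put(b);
    }

    // Pretend to compress: copy pending bytes through the scratch buffer.
    private void spill() {
        if (current == null || current.position() == 0) {
            return;
        }
        if (compressed == null) {
            compressed = ByteBuffer.allocate(bufferSize);
        }
        current.flip();
        compressed.clear();
        compressed.put(current);
        bytesEmitted += compressed.position();
        current.clear();
    }

    void flush() {
        spill();
        compressed = null;      // the scratch buffer is released in both modes
        if (dropAllOnFlush) {
            current = null;     // aggressive policy: release everything
        }
        // Otherwise `current` stays allocated while the stream sits idle.
    }

    long retainedBytes() {
        long total = 0;
        if (current != null) total += current.capacity();
        if (compressed != null) total += compressed.capacity();
        return total;
    }

    long bytesEmitted() {
        return bytesEmitted;
    }
}
```

In this model the retained memory grows linearly with the number of open writers when a buffer is kept, and is zero when every buffer is dropped on flush.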

Note that "hive.optimize.sort.dynamic.partition" avoids this problem by sorting on the dynamic
partition key, so that only a single ORC writer is open at a time. Doing so uses memory more
effectively and avoids creating ORC files with very small stripes, which in turn gives better
downstream performance.
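
The effect of sorting on the number of simultaneously open writers can be sketched with another toy model in Java. This is an illustration under simplifying assumptions, not Hive's implementation: it assumes an unsorted input keeps one writer open per distinct partition key until the end of the input, and that a sorted input lets the previous writer be closed as soon as the key changes. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

class OpenWriterDemo {
    // Unsorted input (assumed model): any key may reappear later, so every
    // writer stays open until the input ends.
    static int peakWritersUnsorted(List<String> partitionKeys) {
        return new HashSet<>(partitionKeys).size();
    }

    // Sorted input (assumed model): once the key changes, the previous
    // partition is complete and its writer can be closed.
    static int peakWritersSorted(List<String> partitionKeys) {
        List<String> sorted = new ArrayList<>(partitionKeys);
        Collections.sort(sorted);
        int open = 0;
        int peak = 0;
        String currentKey = null;
        for (String key : sorted) {
            if (!key.equals(currentKey)) {
                if (currentKey != null) {
                    open--; // close the writer for the finished partition
                }
                open++;     // open a writer for the new partition
                currentKey = key;
            }
            peak = Math.max(peak, open);
        }
        return peak;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("2016-03-02", "2016-03-01", "2016-03-03",
                                    "2016-03-01", "2016-03-02");
        System.out.println("unsorted peak writers: " + peakWritersUnsorted(keys));
        System.out.println("sorted peak writers:   " + peakWritersSorted(keys));
    }
}
```

Under these assumptions the unsorted peak equals the number of distinct partitions, while the sorted peak never exceeds one.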
