hive-dev mailing list archives

From "Nemon Lou (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-12847) ORC file footer cache should be memory sensitive
Date Tue, 12 Jan 2016 12:43:39 GMT
Nemon Lou created HIVE-12847:
--------------------------------

             Summary: ORC file footer cache should be memory sensitive
                 Key: HIVE-12847
                 URL: https://issues.apache.org/jira/browse/HIVE-12847
             Project: Hive
          Issue Type: Improvement
          Components: File Formats, ORC
    Affects Versions: 1.2.1
            Reporter: Nemon Lou


The size-based footer cache cannot control memory usage properly.
I have seen HiveServer2 hang because the ORC file footer cache took up too much heap memory.
A simple query like "select * from orc_table limit 1" can make HiveServer2 hang.
The input table has about 1000 ORC files, and each ORC file contains about 2500 stripes.
{noformat}
 num     #instances         #bytes  class name
----------------------------------------------
   1:     214653601    25758432120  org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics
   3:     122233301     8800797672  org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics
   5:      89439001     6439608072  org.apache.hadoop.hive.ql.io.orc.OrcProto$IntegerStatistics
   7:       2981300      262354400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeInformation
   9:       2981300      143102400  org.apache.hadoop.hive.ql.io.orc.OrcProto$StripeStatistics
  12:       2983691       71608584  org.apache.hadoop.hive.ql.io.orc.ReaderImpl$StripeInformationImpl
  15:         80929        7121752  org.apache.hadoop.hive.ql.io.orc.OrcProto$Type
  17:        103282        5783792  org.apache.hadoop.mapreduce.lib.input.FileSplit
  20:         51641        3305024  org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit
  21:         51641        3305024  org.apache.hadoop.hive.ql.io.orc.OrcSplit
  31:             1         413152  [Lorg.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit;
 
 100:          1122          26928  org.apache.hadoop.hive.ql.io.orc.Metadata  
{noformat}
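
One possible direction, sketched here only as an illustration and not as Hive's actual footer cache: hold cached footers through soft references, so the JVM may reclaim them under heap pressure instead of keeping them pinned. The class name SoftFooterCache, its methods, and the loader callback are all hypothetical; only the java.lang.ref and java.util.concurrent classes are real JDK APIs.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical memory-sensitive cache; an assumption for illustration, not Hive code. */
class SoftFooterCache<K, V> {

    /** Soft reference that remembers its key so the stale map entry can be removed. */
    private static final class Entry<K, V> extends SoftReference<V> {
        final K key;

        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final ConcurrentHashMap<K, Entry<K, V>> map = new ConcurrentHashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    /** Returns the cached value, or loads and caches it if absent or already collected. */
    V get(K key, Function<K, V> loader) {
        purge();
        Entry<K, V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Cache miss, or the garbage collector cleared the soft reference.
            value = loader.apply(key);
            map.put(key, new Entry<>(key, value, queue));
        }
        return value;
    }

    /** Number of entries still held after dropping those whose values were collected. */
    int size() {
        purge();
        return map.size();
    }

    /** Removes map entries whose values the garbage collector has already cleared. */
    @SuppressWarnings("unchecked")
    private void purge() {
        Entry<K, V> stale;
        while ((stale = (Entry<K, V>) queue.poll()) != null) {
            // Two-argument remove: do not drop a fresher entry stored under the same key.
            map.remove(stale.key, stale);
        }
    }
}
```

With such a cache, a call like cache.get(path, p -> readFooter(p)), where readFooter is a hypothetical loader, re-reads a footer only when it was never cached or was reclaimed. The trade-off is that eviction timing is left to the garbage collector, so a hard size bound may still be wanted alongside it. Guava's CacheBuilder exposes the same idea through softValues().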



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
