hive-dev mailing list archives

From "Suddhasatwa Bhaumik (JIRA)"
Subject [jira] [Created] (HIVE-18810) Parquet Or ORC
Date Tue, 27 Feb 2018 01:53:00 GMT
Suddhasatwa Bhaumik created HIVE-18810:

             Summary: Parquet Or ORC
                 Key: HIVE-18810
             Project: Hive
          Issue Type: Test
          Components: Hive
    Affects Versions: 1.1.0
         Environment: Hadoop 1.2.1, Hive 1.1
            Reporter: Suddhasatwa Bhaumik

Hello Experts, 

I would like to know for which kinds of data (based on size and complexity) one should
use Parquet or ORC tables in Hive. For example, on Hadoop 0.20.0 with Hive 0.13, the
performance of ORC tables in Hive is very good, even when they are accessed by third-party
BI systems such as SAP BusinessObjects or Tableau. Running the same tests on Hadoop 1.2.1
with Hive 1.1 does not yield the same reliability in queries: ETL and insert/update
operations on the tables take a nominal amount of time, but read performance is not within
acceptable limits.

If you have any questions, please let me know.



This message was sent by Atlassian JIRA
