hive-dev mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-14627) Improvements to MiniMr tests
Date Thu, 25 Aug 2016 00:11:20 GMT
Prasanth Jayachandran created HIVE-14627:
--------------------------------------------

             Summary: Improvements to MiniMr tests
                 Key: HIVE-14627
                 URL: https://issues.apache.org/jira/browse/HIVE-14627
             Project: Hive
          Issue Type: Sub-task
    Affects Versions: 2.2.0
            Reporter: Prasanth Jayachandran
            Assignee: Prasanth Jayachandran


Currently MiniMr is extremely slow. I ran udf_using.q on MiniMr, and the following is the
execution time breakdown:

Total time - 13m59s
JUnit-reported time for the test case - 50s
Most of the time is spent in creating/loading/analyzing initial tables - ~12m
Cleanup - ~1m

There is a huge overhead for running MiniMr tests compared to the actual test runtime: the
test case itself accounts for only about 6% of the total (50s out of 13m59s).


I also noticed that some tests do not have to run on MiniMr. For example, udf_using.q does not
require MiniMr; it just reads from and writes to HDFS, which we can do in MiniTez/MiniLlap,
both of which are much faster. Most tests access only a few of the initial tables and read
only a few rows from them. We can fix those tests to load just the tables that the test
requires instead of all initial tables. We can also remove the q_init_script.sql
initialization for MiniMr after rewriting and moving over the tests that do not need MiniMr,
which should cut down the runtime a lot.
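
As a rough illustration of the "load only the required tables" idea, the sketch below scans
the text of a query file and reports which initial tables it mentions. It is not part of the
Hive test framework: the function name tables_needed and the table names in INIT_TABLES are
assumptions made up for this example, and the matching is a naive whole-word check that can
report false positives (for instance, a column or alias that shares a table's name).

```python
import re

# Hypothetical set of tables created by the shared init script.
# These names are assumptions for illustration only.
INIT_TABLES = {"src", "src1", "srcpart", "alltypesorc"}


def tables_needed(query_text, init_tables=INIT_TABLES):
    """Return the sorted init tables whose names appear as whole words in the query text."""
    # Extract identifier-like tokens, lowercased so the match is case-insensitive.
    words = set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", query_text.lower()))
    return sorted(words & set(init_tables))


if __name__ == "__main__":
    example = "SELECT key, value FROM src WHERE key < 10 LIMIT 5;"
    print(tables_needed(example))
```

A per-test setup step could use such a list to create and load only those tables, rather than
running the full initialization script before every MiniMr test.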




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)