hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <>
Subject [jira] [Commented] (HIVE-14627) Improvements to MiniMr tests
Date Mon, 29 Aug 2016 18:48:21 GMT


Prasanth Jayachandran commented on HIVE-14627:

The test run is clean; the failures are unrelated to this patch. [~sseth], can you please review this?

> Improvements to MiniMr tests
> ----------------------------
>                 Key: HIVE-14627
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 2.2.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>         Attachments: HIVE-14627.1.patch, HIVE-14627.2.patch, HIVE-14627.3.patch
> Currently MiniMr is extremely slow. I ran udf_using.q on MiniMr, and the execution time breakdown is as follows:
> Total time - 13m59s
> Junit reported time for testcase - 50s
> Most of the time is spent in creating/loading/analyzing initial tables - ~12m
> Cleanup - ~1m
> There is a huge overhead in running MiniMr tests compared to the actual test runtime.

> I ran the same test without the init script:
> Total time - 2m17s
> Junit reported time for testcase - 52s
> I also noticed that some tests do not have to run on MiniMr. For example, udf_using.q does not require MiniMr; it just reads from and writes to HDFS, which we can do in MiniTez/MiniLlap, which are much faster. Most tests access only a few of the initial tables and read only a few rows from them. We can fix those tests to load just the tables that the test requires instead of all initial tables. We can also remove the q_init_script.sql initialization for MiniMr after rewriting and moving over the unwanted tests, which should cut down the runtime a lot.
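
The overhead described above can be checked with a little arithmetic on the reported timings. The following is a minimal sketch, not part of the patch: the helper names `to_seconds` and `overhead_fraction` are hypothetical, and the only inputs are the durations quoted in the description.

```python
import re


def to_seconds(text):
    """Convert a duration such as '13m59s' or '50s' to whole seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(?:(\d+)s)?", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return minutes * 60 + seconds


def overhead_fraction(total, testcase):
    """Fraction of the total wall time spent outside the JUnit-reported test."""
    total_s = to_seconds(total)
    return (total_s - to_seconds(testcase)) / total_s


# Timings quoted in the issue description (hypothetical variable names).
with_init = overhead_fraction("13m59s", "50s")
without_init = overhead_fraction("2m17s", "52s")
print(f"overhead with init script:    {with_init:.0%}")
print(f"overhead without init script: {without_init:.0%}")
```

With the numbers above, roughly 94% of the run with the init script is spent outside the test case itself, against roughly 62% without it.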

This message was sent by Atlassian JIRA
