hive-user mailing list archives

From Stephen Sprague <sprag...@gmail.com>
Subject Re: Slow performance on queries with aggregation function
Date Sat, 22 Feb 2014 04:24:45 GMT
Hi Jone,
Um. I can say for sure something is wrong. :)

I would _start_ by going to the tasktracker. It is your friend. Find
your job and look for failed reducers. That's the starting point anyway,
IMHO.
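
The progress lines quoted below already hint at where to look: there is a jump of almost two hours between two consecutive lines, and the reduce percentage drops from 100% back to 0%, which may mean a reduce attempt failed and was retried. Here is a minimal Python sketch for flagging both patterns in pasted console output. The function name `scan_progress` and the five-minute threshold are made up for illustration, and the regex assumes the line format shown in the quoted output; it is not part of Hive or Hadoop.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
#   2014-02-21 17:31:42,544 Stage-1 map = 0%,  reduce = 0%
# (format assumed from the quoted console output; a leading "> " is tolerated)
LINE_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+ Stage-\d+ "
    r"map = (\d+)%,\s+reduce = (\d+)%"
)


def scan_progress(lines, max_gap=timedelta(minutes=5)):
    """Return (kind, timestamp) tuples for suspicious progress lines.

    kind is "gap" when more than max_gap passed since the previous line,
    and "regressed" when the reduce percentage went down.
    """
    findings = []
    prev = None  # (timestamp, reduce percentage) of the previous matching line
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip anything that is not a progress line
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        reduce_pct = int(m.group(3))
        if prev is not None:
            prev_ts, prev_reduce = prev
            if ts - prev_ts > max_gap:
                findings.append(("gap", m.group(1)))
            if reduce_pct < prev_reduce:
                findings.append(("regressed", m.group(1)))
        prev = (ts, reduce_pct)
    return findings
```

Feeding it the quoted output should flag the line at 19:27:48 as a gap and the line at 19:29:48 as a regression. Those timestamps are a reasonable place to start when reading the reducer attempt logs.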



On Fri, Feb 21, 2014 at 11:35 AM, Jone Lura <jone.lura@ecc.no> wrote:

> Hi,
>
> I have tried some variations of queries with aggregation functions, such as
> the following:
>
> select max(total) from my_table;
>
> and
>
> select id, sum(total) from my_table group by id
>
> In my JUnit tests, I only have two rows of data, but the queries are
> extremely slow.
>
> The job detail output shows me the following:
>
> Hadoop job information for Stage-1: number of mappers: 1; number of
> reducers: 1
> 2014-02-21 17:31:42,544 Stage-1 map = 0%,  reduce = 0%
> 2014-02-21 17:31:45,548 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:31:46,899 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:31:55,446 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:32:34,358 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:32:40,040 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:32:45,653 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:32:46,999 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:32:55,544 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:33:34,454 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:33:40,130 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:33:45,742 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:33:47,093 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 17:33:55,632 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:27:48,005 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:27:48,461 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:27:48,311 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:27:48,574 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:27:48,932 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:28:48,915 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:28:48,915 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:28:48,933 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:28:48,933 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:28:49,727 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:29:47,995 Stage-1 map = 100%,  reduce = 100%
> 2014-02-21 19:29:48,997 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:29:49,018 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:29:49,019 Stage-1 map = 100%,  reduce = 0%
> 2014-02-21 19:29:49,824 Stage-1 map = 100%,  reduce = 0%
>
> I am relatively new to Hadoop and Hive and I do not know if this is
> normal, or if I have missed some configuration details.
>
> In my application I am expecting to have 500M or more rows.
>
> Best regards,
>
> Jone
>
