hive-user mailing list archives

From Koert Kuipers <ko...@tresata.com>
Subject Re: why 1 reducer on simple join?
Date Fri, 13 Jan 2012 02:24:37 GMT
that query without the create table turns into a map-join and runs fast
without any reducers.
if i turn map-join off then it goes back to map-reduce with 1 reducer and
ignores mapred.reduce.tasks again.
i am using hive 0.7
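
one thing that may be worth checking (an assumption on my part, not something confirmed in this thread): the query puts the equality conditions in WHERE rather than in an ON clause. hive may then plan the join without any join keys, in which case every row would be sent to the same reducer no matter what mapred.reduce.tasks says. a minimal sketch of the same query with the conditions moved into ON, using the table and column names from this thread:

```sql
-- Sketch only: same join as in the thread, with the equality conditions
-- moved from WHERE into ON so Hive can treat them as join keys.
set hive.auto.convert.join = false;  -- keep it a reduce-side join for this test
set mapred.reduce.tasks = 3;         -- same value as in the group-by test

-- EXPLAIN prints the plan without running the job.
explain
select x.* from table1 x join table2 y on (
  x.col1 = y.col1 and
  x.col2 = y.col2 and
  x.col3 = y.col3 and
  x.col4 = y.col4 and
  x.col5 = y.col5
);
```

if the plan lists the five columns as keys for the join, running the select without the explain should show whether the reducer count now follows mapred.reduce.tasks.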

On Thu, Jan 12, 2012 at 6:28 PM, Wojciech Langiewicz
<wlangiewicz@gmail.com> wrote:

> I meant this query (without the create table):
>
> select x.* from table1 x join table2 y where (
> x.col1 = y.col1 and
> x.col2 = y.col2 and
> x.col3 = y.col3 and
> x.col4 = y.col4 and
> x.col5 = y.col5
> );
>
> this document might be useful:
> https://cwiki.apache.org/Hive/joinoptimization.html
>
> Especially try this setting:
> set hive.auto.convert.join = true; (or false)
>
> Which version of Hive are you using?
>
>
> On 13.01.2012 00:24, Koert Kuipers wrote:
>
>> hive>  set mapred.reduce.tasks = 3;
>> hive>  select count(*) from table1 group by column1 limit 10;
>> query runs with 38 mappers and 3 reducers
>>
>> hive>  select count(*) from table2 group by column1 limit 10;
>> query runs with 6 mappers and 3 reducers
>>
>> On Thu, Jan 12, 2012 at 6:09 PM, Wojciech Langiewicz
>> <wlangiewicz@gmail.com> wrote:
>>
>>> What do you mean by "Select runs fine"? Is it using the number of reducers
>>> that you set?
>>> It might help if you could show the actual query.
>>>
>>>
>>> On 13.01.2012 00:03, Koert Kuipers wrote:
>>>
>>>> I tried set mapred.reduce.tasks = xyz; hive ignored it.
>>>> Selects run fine. The query uses 44 mappers.
>>>>
>>>> On Thu, Jan 12, 2012 at 6:00 PM, Wojciech Langiewicz
>>>> <wlangiewicz@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> Have you tried running only the select, without creating the table? What are
>>>>> the results?
>>>>> How did you try to set the number of reducers? Have you used this:
>>>>> set mapred.reduce.tasks = xyz;
>>>>> How many mappers does this query use?
>>>>>
>>>>>
>>>>> On 12.01.2012 23:53, Koert Kuipers wrote:
>>>>>
>>>>>> I am running a basic join of 2 tables and it will only run with 1
>>>>>> reducer.
>>>>>> why is that? i tried to set the number of reducers and it didn't work.
>>>>>> hive just ignored it.
>>>>>>
>>>>>> create table z as select x.* from table1 x join table2 y where (
>>>>>> x.col1 = y.col1 and
>>>>>> x.col2 = y.col2 and
>>>>>> x.col3 = y.col3 and
>>>>>> x.col4 = y.col4 and
>>>>>> x.col5 = y.col5
>>>>>> );
>>>>>>
>>>>>> both tables are backed by multiple files / blocks / chunks
>>>>>>
>>>>>>
>>>>> --
>>>>> Wojciech Langiewicz
