impala-user mailing list archives

From Tim Armstrong <tarmstr...@cloudera.com>
Subject Re: Debugging Impala query that consistently hangs
Date Sat, 10 Feb 2018 01:41:39 GMT
To be clearer, the main problem with that plan is that the join order is
bad. Broadcast vs. shuffle is a secondary issue. The query doesn't look that
complex, so with stats you should get a reasonable plan without hinting.
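
As a rough sketch of what computing stats and re-checking the plan could look like: the table and column names below (fact_events, dim_users, user_id, country) are hypothetical placeholders, since the tables in the actual query are not named in this thread.

```sql
-- Hypothetical table names; substitute the tables used in the real query.
COMPUTE STATS fact_events;
COMPUTE STATS dim_users;

-- Check that stats are populated. A #Rows value of -1 means
-- the planner still has no row count for that table or partition.
SHOW TABLE STATS fact_events;
SHOW COLUMN STATS fact_events;

-- Re-examine the plan with no join hints, to see which join order
-- and join mode the planner picks once stats are available.
EXPLAIN
SELECT e.user_id, u.country
FROM fact_events e
LEFT OUTER JOIN dim_users u ON e.user_id = u.user_id;
```

If the unhinted plan looks reasonable after this, the /* +SHUFFLE */ hint may no longer be needed.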

On 9 Feb. 2018 17:29, "Tim Armstrong" <tarmstrong@cloudera.com> wrote:

> Most of the intelligence in the planning process relies on having stats,
> including the BROADCAST/SHUFFLE join mode selection.
>
> If you compute stats you'll have a much better experience.
>
> On Fri, Feb 9, 2018 at 11:44 AM, Piyush Narang <p.narang@criteo.com>
> wrote:
>
>> Actually, looking at this again, the hash join that is consuming 179GB is
>> supposed to be partitioned, right? How would stats change that?
>>
>> I checked the query I kicked off and I have this there, “left outer join
>> /* +SHUFFLE */”. I think without it I end up with query failures.
>>
>> Is there something I’m missing?
>>
>> -- Piyush
>>
>> *From: *Tim Armstrong <tarmstrong@cloudera.com>
>> *Reply-To: *"user@impala.apache.org" <user@impala.apache.org>
>> *Date: *Friday, February 9, 2018 at 12:24 PM
>> *To: *"user@impala.apache.org" <user@impala.apache.org>
>> *Subject: *Re: Debugging Impala query that consistently hangs
>>
>> 07:HASH JOIN   1   0.000ns   0.000ns   0   -1   179.72 GB   2.00 GB   LEFT OUTER JOIN, PARTITIONED
>>
>
