Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 3084018D57 for ; Mon, 19 Oct 2015 09:37:59 +0000 (UTC) Received: (qmail 84994 invoked by uid 500); 19 Oct 2015 09:37:44 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 84934 invoked by uid 500); 19 Oct 2015 09:37:44 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 84924 invoked by uid 99); 19 Oct 2015 09:37:44 -0000 Received: from Unknown (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 19 Oct 2015 09:37:44 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 178BD180A88 for ; Mon, 19 Oct 2015 09:37:44 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.588 X-Spam-Level: X-Spam-Status: No, score=0.588 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=3, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_MSPIKE_H2=-0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=disabled Authentication-Results: spamd3-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=booking.com Received: from mx1-eu-west.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id wKURfGmlxucS for ; Mon, 19 Oct 2015 09:37:42 +0000 (UTC) Received: from mail-out8.booking.com (mail-out8.booking.com [5.57.21.182]) by mx1-eu-west.apache.org (ASF Mail Server at mx1-eu-west.apache.org) with ESMTPS id 65CF42074E for ; Mon, 19 Oct 2015 09:37:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=booking.com; s=bk; h=Content-Type:In-Reply-To:MIME-Version:Date:Message-ID:From:References:To:Subject; bh=eUSUeTeCUcuE3ddJ6Y29OtTlf/InzExSHQTPiTj16ko=; b=K7jPfypWK+uCHwvdMI//sMZjuB328YHIxWSUdZbMsmrzdjhy1E9Ulro3LcPamkTO3i3n6I8bbqAHI3xv6icob+JlzAOg2wcIqDZJbFT9RC0gAm8GDm+S6S0xI5J50u8ZYX33pEbWvikP6V4T5e0fl7SLfXZdlz0LBTRO+q0JlbI=; Received: from corpds-101.ams4.corp.booking.com ([10.196.68.22]:34399) by mtx-105.ams4.prod.booking.com with esmtps (TLSv1.2:DHE-RSA-AES256-GCM-SHA384:256) (Exim 4.84) (envelope-from ) id 1Zo6sq-0003qx-SU; Mon, 19 Oct 2015 11:37:36 +0200 Received: from [10.155.73.21] (port=51940) by corpds-101.ams4.corp.booking.com with esmtpsa (TLSv1.2:DHE-RSA-AES128-SHA:128) (Exim 4.84) (envelope-from ) id 1Zo6sq-000FrQ-OE; Mon, 19 Oct 2015 11:37:36 +0200 Subject: Re: Strict mode and joins To: user@hive.apache.org, edlinuxguru@gmail.com References: From: Bennie Schut Message-ID: <5624B9E0.2050601@booking.com> Date: Mon, 19 Oct 2015 11:37:36 +0200 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/alternative; boundary="------------000103010804020307030708" This is a multi-part message in MIME format. --------------000103010804020307030708 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Hi Edward, That's possibly due to using unix_timestamp (although the error message seems misleading if that proves true) . It's technically correct it shouldn't be flagged as deterministic because every time you call it you'll get a different answer as time progresses. However reality is I just want it called 1 time which is during planning and if I flag is as deterministic this is exactly what happens so you can do this: @UDFType(deterministic = true) public class UnixTimeStamp extends GenericUDFUnixTimeStamp { // Making the udf deterministic which is kind of cheating but makes partition pruning work. } And then register the udf like you normally would. If that's not helping do some creative sub-querying might help like FROM (select * from entry_hourly_v3 where dt=2015101517 ) entry_hourly_v3 INNER JOIN article_meta ON Bennie. On 15/10/15 23:06, Edward Capriolo wrote: > So I have strict mode on and I like to keep it that way. > > I am trying to do this query. > > INSERT OVERWRITE TABLE vertical_stats_recent PARTITION (dt=2015101517) > SELECT ... > FROM entry_hourly_v3 INNER JOIN article_meta ON > entry_hourly_v3.entry_id = article_meta.entry_id > INNER JOIN channel_meta ON > channel_meta.section_name = article_meta.channel > > WHERE entry_hourly_v3.dt=2015101517 > AND article_meta.dt=2015101517 > AND channel_meta.hitdate=20151015 > AND article_meta.publish_timestamp > ((unix_timestamp() * 1000) - > (1000 * 60 * 60 * 24 * 2)) > GROUP > > entry_hourly_v3, channel_meta and article_meta are partitioned tables. > > *Your query has the following error(s):* > > Error while compiling statement: FAILED: SemanticException [Error > 10041]: No partition predicate found for Alias "entry_hourly_v3" Table > "entry_hourly_v3" > > I also tried putting views on the table and I had no luck. > > Is there any way I can do this query without turning strict mode off? > > --------------000103010804020307030708 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 8bit Hi Edward,

That's possibly due to using unix_timestamp (although the error message seems misleading if that proves true) . It's technically correct it shouldn't be flagged as deterministic because every time you call it you'll get a different answer as time progresses. However reality is I just want it called 1 time which is during planning and if I flag is as deterministic this is exactly what happens so you can do this:

@UDFType(deterministic = true)
public class UnixTimeStamp extends GenericUDFUnixTimeStamp {
    // Making the udf deterministic which is kind of cheating but makes partition pruning work.
}

And then register the udf like you normally would.

If that's not helping do some creative sub-querying might help like
FROM (select * from entry_hourly_v3 where dt=2015101517 ) entry_hourly_v3 INNER JOIN article_meta ON

Bennie.

On 15/10/15 23:06, Edward Capriolo wrote:
So I have strict mode on and I like to keep it that way.

I am trying to do this query.

INSERT OVERWRITE TABLE vertical_stats_recent PARTITION (dt=2015101517)
SELECT ...
  
FROM entry_hourly_v3 INNER JOIN article_meta ON 
entry_hourly_v3.entry_id = article_meta.entry_id 
INNER JOIN channel_meta ON
channel_meta.section_name = article_meta.channel

WHERE entry_hourly_v3.dt=2015101517
AND article_meta.dt=2015101517
AND channel_meta.hitdate=20151015 
AND article_meta.publish_timestamp > ((unix_timestamp() * 1000) - (1000 * 60 * 60 * 24 * 2))
GROUP 

entry_hourly_v3, channel_meta and article_meta are partitioned tables. 

Your query has the following error(s):

Error while compiling statement: FAILED: SemanticException [Error 10041]: No partition predicate found for Alias "entry_hourly_v3" Table "entry_hourly_v3"

I also tried putting views on the table and I had no luck.

Is there any way I can do this query without turning strict mode off?



--------------000103010804020307030708--