hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matthew Hooker <mwhoo...@disqus.com>
Subject Re: Need help running query over "yesterday". getting "No partition predicate found" errors
Date Tue, 05 Jun 2012 23:36:36 GMT
Thanks, Mark.

I was hoping the expressions could be evaluated at query time, but
generating the query works just as well.

On Mon, Jun 4, 2012 at 7:42 PM, Mark Grover <mgrover@oanda.com> wrote:

> Hi Matthew,
> unix_timestamp() is being executed on the Hive server in your case.
> Therefore, as your query progresses, the timestamp returned by your
> unix_timestamp() would change. Based on your usage, this would lead to
> unsuspecting problems when this queries runs across UTC midnight. In any
> case, I'd recommend that you get your Hive client code to get the present
> unix_timestamp and generate your partition pruning constant(s) on the
> client before you send your query to the server. When you do so, partition
> pruning will occur as expected.
>
> Mark
>
> ----- Original Message -----
> From: "Matthew Hooker" <mwhooker@disqus.com>
> To: user@hive.apache.org
> Sent: Monday, June 4, 2012 8:01:26 PM
> Subject: Need help running query over "yesterday". getting "No partition
> predicate found" errors
>
> Hello,
>
>
> I'm trying to construct a query which will do some simple counts over a
> table of events, and insert them in to a summary table.
>
>
> The query I came up with looks something like
>
>
>
> INSERT OVERWRITE TABLE activity
> PARTITION(date_utc)
> select count(1) views,
> from impressions
>
> where dt >= date_sub(to_date(from_unixtime(unix_timestamp())),1)
> and dt < to_date(from_unixtime(unix_timestamp()))
> group by d_theme;
>
>
> This seemed to work fine, but I realized it was doing a full table scan,
> and not using the partitions in "views" appropriately.
>
>
> I set hive.mapred.mode=strict and the above query fails to run.
> If I change the expressions in the where clause to literals, it works as I
> would expect.
>
>
> Can anyone help me figure out how to do what I want?
>
>
>
>
> I put together a test script to illustrate my problem:
>
>
>
> set hive.mapred.mode=strict;
> set hive.optimize.ppd=true;
> DROP TABLE IF EXISTS test_where_expr;
> CREATE TABLE test_where_expr (
> ts int
> )
> PARTITIONED BY ( dt STRING );
> explain SELECT * from test_where_expr WHERE dt =
> to_date(date_sub(from_unixtime(unix_timestamp()),1));
> explain SELECT * from test_where_expr WHERE dt = '2012-06-01';
>
>
>
>
> This is the output:
>
>
>
> hive> set hive.mapred.mode=strict;
> hive> set hive.optimize.ppd=true;
> hive>
> > DROP TABLE IF EXISTS test_where_expr;
> OK
> Time taken: 3.405 seconds
> hive>
> > CREATE TABLE test_where_expr (
> > ts int
> > )
> > PARTITIONED BY ( dt STRING );
> OK
> Time taken: 0.189 seconds
> hive>
> > explain SELECT * from test_where_expr
> > WHERE dt = to_date(date_sub(from_unixtime(unix_timestamp()),1));
> FAILED: Error in semantic analysis: No partition predicate found for Alias
> "test_where_expr" Table "test_where_expr"
> hive>
> > explain SELECT * from test_where_expr
> > WHERE dt = '2012-06-01';
> OK
> ABSTRACT SYNTAX TREE:
> (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME test_where_expr)))
> (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT
> (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (= (TOK_TABLE_OR_COL dt)
> '2012-06-01'))))
>
>
> STAGE DEPENDENCIES:
> Stage-0 is a root stage
>
>
> STAGE PLANS:
> Stage: Stage-0
> Fetch Operator
> limit: -1
>
>
>
>
> Time taken: 0.166 seconds
> hive> hadoop@ip-10-68-190-136:~$ hive --version
> Hive version 0.8.1.
> hadoop@ip-10-68-190-136:~$ hadoop -version
> java version "1.6.0_26"
> Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
> Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
>
>
>
>
>
> Thanks,
> --Matthew Hooker
>

Mime
View raw message