hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From saucam <...@git.apache.org>
Subject [GitHub] hive pull request: HIVE-10296: Remove cast since it might result i...
Date Fri, 10 Apr 2015 05:54:29 GMT
GitHub user saucam opened a pull request:

    https://github.com/apache/hive/pull/32

    HIVE-10296: Remove cast since it might result in exceptions when filter conditions are
pushed into joins in postgres

    Try to drop a partition from hive:
    ALTER TABLE f__edr_bin_source__900_sub_id DROP IF EXISTS PARTITION ( exporttimestamp=1427824800,
timestamp=1427824800)
    This triggers a query on the metastore like this :
    "select "PARTITIONS"."PART_ID" from "PARTITIONS" inner join "TBLS" on "PARTITIONS"."TBL_ID"
= "TBLS"."TBL_ID" and "TBLS"."TBL_NAME" = ? inner join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"
and "DBS"."NAME" = ? inner join "PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID"
and "FILTER0"."INTEGER_IDX" = 0 inner join "PARTITION_KEY_VALS" "FILTER1" on "FILTER1"."PART_ID"
= "PARTITIONS"."PART_ID" and "FILTER1"."INTEGER_IDX" = 1 where ( (((case when "TBLS"."TBL_NAME"
= ? and "DBS"."NAME" = ? then cast("FILTER0"."PART_KEY_VAL" as decimal(21,0)) else null end)
= ?) and ((case when "TBLS"."TBL_NAME" = ? and "DBS"."NAME" = ? then cast("FILTER1"."PART_KEY_VAL"
as decimal(21,0)) else null end) = ?)) )"
    In some cases, when the internal tables in postgres (metastore) have some amount of data,
the query plan pushes the condition down into the join.
    Now because of DERBY-6358 , case when clause is used before the cast, but in this case
, cast is evaluated before condition being evaluated. So in case we have different tables
partitioned on string and integer columns, cast exception is observed!
    15/04/06 08:41:20 ERROR metastore.ObjectStore: Direct SQL failed, falling back to ORM

    javax.jdo.JDODataStoreException: Error executing SQL query "select "PARTITIONS"."PART_ID"
from "PARTITIONS" inner join "TBLS" on "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID" and "TBLS"."TBL_NAME"
= ? inner join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID" and "DBS"."NAME" = ? inner join "PARTITION_KEY_VALS"
"FILTER0" on "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0
inner join "PARTITION_KEY_VALS" "FILTER1" on "FILTER1"."PART_ID" = "PARTITIONS"."PART_ID"
and "FILTER1"."INTEGER_IDX" = 1 where ( (((case when "TBLS"."TBL_NAME" = ? and "DBS"."NAME"
= ? then cast("FILTER0"."PART_KEY_VAL" as decimal(21,0)) else null end) = ?) and ((case when
"TBLS"."TBL_NAME" = ? and "DBS"."NAME" = ? then cast("FILTER1"."PART_KEY_VAL" as decimal(21,0))
else null end) = ?)) )". 
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:451)

    at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321) 
    at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:300)

    at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:211)

    at org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1915)

    at org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1909)

    at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2208) 
    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:1909)

    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:1882)
    org.postgresql.util.PSQLException: ERROR: invalid input syntax for type numeric: "_DEFAULT_BINSRC_"
    15/04/06 08:41:20 INFO metastore.ObjectStore: JDO filter pushdown cannot be used: Filtering
is supported only on partition keys of type string 
    15/04/06 08:41:20 ERROR metastore.ObjectStore: 
    javax.jdo.JDOException: Exception thrown when executing query 
    at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)

    at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:275) 
    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesNoTxn(ObjectStore.java:1700)

    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionNamesPrunedByExprNoTxn(ObjectStore.java:2003)

    at org.apache.hadoop.hive.metastore.ObjectStore.access$400(ObjectStore.java:146) 
    at org.apache.hadoop.hive.metastore.ObjectStore$3.getJdoResult(ObjectStore.java:1937)

    at org.apache.hadoop.hive.metastore.ObjectStore$3.getJdoResult(ObjectStore.java:1909)

    at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2214) 
    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:1909)

    at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:1882)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/saucam/hive HIVE-10296

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/32.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #32
    
----
commit 35ceabecf27c644d7c91051be95b620a2b71ec63
Author: Yash Datta <yash.datta@guavus.com>
Date:   2015-04-10T05:33:50Z

    HIVE-10296: Remove cast since it might result in exceptions when filter conditions are
pushed into joins in postgres

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

Mime
View raw message