hive-issues mailing list archives

From Sergio Peña (JIRA) <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-11440) Create Parquet predicate push down (PPD) unit tests and q-tests
Date Wed, 05 Aug 2015 20:37:06 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14658350#comment-14658350
] 

Sergio Peña edited comment on HIVE-11440 at 8/5/15 8:37 PM:
------------------------------------------------------------

Thanks [~Ferd]
Here are some things we should add to the tests.

- Can we set 'hive.optimize.ppd=true' in all .q tests? We found many issues with that flag
enabled.

- predicate_ppd_boolean.q
	* I see some expressions are not tested with PPD, such as b=true, b!=true, b<true, b<=true.
	  Can we run the same query twice, once with filter=true and once with filter=false?
	  
- I see the double/string/tinyint/smallint types are tested in parquet_predicate_pushdown.q.
  What about other types like float/int/bigint? Should we create new parquet_ppd_*.q files
  for each type that is not covered?

Also, can you add JUnit tests for the classes and methods that create the PPD?
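
One way to read the .q suggestions above is: for each type that is not yet covered, produce a parquet_ppd_<type>.q file that runs every comparison twice, once with the filter off and once with it on. The Python sketch below only builds the text of such files. It assumes that "filter" refers to the hive.optimize.index.filter property (the comment does not name the property), and the table names, column name, and literals are hypothetical. Loading test data and the expected .q.out files are left out.

```python
# Hypothetical sketch: build the text of parquet_ppd_<type>.q test files.
# Assumption: "filter" in the comment means hive.optimize.index.filter.

# Comparison operators to exercise for each type.
OPERATORS = ["=", "!=", "<", "<=", ">", ">="]

# Types the comment lists as not yet covered, each with an illustrative literal.
UNCOVERED_TYPES = {"float": "1.5", "int": "100", "bigint": "10000000000"}


def build_q_test(hive_type, literal):
    """Return .q text that runs each predicate with the filter off, then on."""
    table = f"tbl_ppd_{hive_type}"  # hypothetical table name
    lines = [
        "SET hive.optimize.ppd=true;",
        f"CREATE TABLE {table}(c {hive_type}) STORED AS PARQUET;",
    ]
    for op in OPERATORS:
        query = f"SELECT * FROM {table} WHERE c{op}{literal};"
        for flag in ("false", "true"):
            lines.append(f"SET hive.optimize.index.filter={flag};")
            lines.append(query)
    lines.append(f"DROP TABLE {table};")
    return "\n".join(lines) + "\n"


def build_all():
    """Map each proposed file name to its generated contents."""
    return {
        f"parquet_ppd_{hive_type}.q": build_q_test(hive_type, literal)
        for hive_type, literal in UNCOVERED_TYPES.items()
    }


if __name__ == "__main__":
    for name, text in build_all().items():
        print(f"-- {name}")
        print(text)
```

Running each query under both settings means the two result sets can be compared directly, so a wrong answer caused by pushdown shows up as a difference between the two runs.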


was (Author: spena):
Thanks [~Ferd]
Here are some things we should add to the tests.

- Can we set 'hive.optimize.ppd=true' in all .q tests? We found many issues with that flag
enabled.

- predicate_ppd_boolean.q
	* I see some expressions are not tested with PPD, such as b=true, b!=true, b<true, b<=true.
	  Can we run the same query twice, once with filter=true and once with filter=false?
	  
- I see the double/string/tinyint/smallint types are tested in parquet_predicate_pushdown.q.
  What about other types like float/int/bigint? Should we create new parquet_ppd_*.q files
  for each type that is not covered?

> Create Parquet predicate push down (PPD) unit tests and q-tests
> ---------------------------------------------------------------
>
>                 Key: HIVE-11440
>                 URL: https://issues.apache.org/jira/browse/HIVE-11440
>             Project: Hive
>          Issue Type: Test
>    Affects Versions: 2.0.0
>            Reporter: Sergio Peña
>            Assignee: Ferdinand Xu
>         Attachments: HIVE-11440.patch
>
>
> The current Hive and Parquet code does not have enough unit and integration tests to
> validate and verify the correct behavior of Parquet PPD.
> We should add more tests to cover all Parquet PPD functionality. This includes:
> - Predicates on all simple types that Parquet supports.
> - Predicates on nested types, if Hive/Parquet supports them.
> - Predicates on partitioned columns.
>   (parquet_predicate_pushdown.q has just one test for this).
> If bugs are found during testing, create separate JIRAs to track each bug
> individually.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
