hive-dev mailing list archives

From "Szehon Ho (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7286) Parameterize HCatMapReduceTest for testing against all Hive storage formats
Date Fri, 27 Jun 2014 18:52:26 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046268#comment-14046268 ]

Szehon Ho commented on HIVE-7286:
---------------------------------

Yes, it seems worthwhile to get HIVE-5976 in; thanks for your help with that.

Sorry for the late response.  The SerDes listed in SERDEUSINGMETASTOREFORSCHEMA use native Hive
metadata to determine their schemas, unlike ones such as Avro that specify the schema externally.
Hence they can easily plug into HCatMapReduceTest via parameters, since the test creates its
tables using native Hive metadata.  But I'm personally not that eager to force other SerDes to
plug into the test: you had to write lengthy schema-conversion code for Avro to do that, and it
is test-only code that is a burden to maintain, as there's no real use case for it elsewhere.
I think it's wonderful if a test framework can automatically generate tests for new SerDes, but
I don't think it should impose this unnecessary work on new-SerDe developers, as test coverage
can be achieved in more natural ways.

Hence, would it make sense to just automate/enforce parameterization of the test for the SerDes
in SERDEUSINGMETASTOREFORSCHEMA, and handle other SerDes like Avro as one-offs?
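
To make the proposal concrete, here is a rough sketch in plain Java of how the split could look. It assumes the SERDEUSINGMETASTOREFORSCHEMA value is available as a comma-separated string of SerDe class names; the class SerdeTestPlan and its methods are hypothetical names for illustration, not existing Hive or HCatalog code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: decides which SerDes get auto-parameterized tests. */
final class SerdeTestPlan {
    /** SerDes that take their schema from the metastore; parameterized automatically. */
    final List<String> parameterized = new ArrayList<>();
    /** SerDes that define their schema elsewhere (e.g. Avro); tested as one-offs. */
    final List<String> oneOff = new ArrayList<>();

    /** Parses a comma-separated SerDe list, trimming whitespace and skipping blanks. */
    static Set<String> parseSerdeList(String csv) {
        Set<String> out = new LinkedHashSet<>();
        for (String entry : csv.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    /**
     * Splits the candidate SerDes into the two groups.
     * metastoreSchemaSerdesCsv is assumed to hold the configured list value.
     */
    static SerdeTestPlan plan(String metastoreSchemaSerdesCsv, List<String> candidates) {
        Set<String> metastoreBacked = parseSerdeList(metastoreSchemaSerdesCsv);
        SerdeTestPlan plan = new SerdeTestPlan();
        for (String serde : candidates) {
            if (metastoreBacked.contains(serde)) {
                plan.parameterized.add(serde);
            } else {
                plan.oneOff.add(serde);
            }
        }
        return plan;
    }
}
```

With something like this, only the entries in `parameterized` would be fed to the test fixture as parameters, and anything in `oneOff` would keep its own hand-written test.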

> Parameterize HCatMapReduceTest for testing against all Hive storage formats
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-7286
>                 URL: https://issues.apache.org/jira/browse/HIVE-7286
>             Project: Hive
>          Issue Type: Test
>          Components: HCatalog
>            Reporter: David Chen
>            Assignee: David Chen
>         Attachments: HIVE-7286.1.patch
>
>
> Currently, HCatMapReduceTest is extended by the following test suites:
>  * TestHCatDynamicPartitioned
>  * TestHCatNonPartitioned
>  * TestHCatPartitioned
>  * TestHCatExternalDynamicPartitioned
>  * TestHCatExternalNonPartitioned
>  * TestHCatExternalPartitioned
>  * TestHCatMutableDynamicPartitioned
>  * TestHCatMutableNonPartitioned
>  * TestHCatMutablePartitioned
> These tests run against RCFile. Currently, only TestHCatDynamicPartitioned is run against
> any other storage format (ORC).
> Ideally, HCatalog should be tested against all storage formats supported by Hive. The
> easiest way to accomplish this is to turn HCatMapReduceTest into a parameterized test fixture
> that enumerates all Hive storage formats. Until HIVE-5976 is implemented, we would need to
> manually create the mapping of SerDe to InputFormat and OutputFormat. This way, we can explicitly
> keep track of which storage formats currently work with HCatalog and which ones are untested
> or have test failures. The test fixture should also use reflection to find all classes on
> the classpath that implement the SerDe interface and raise a failure if any of them are not
> enumerated.
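
The manual mapping and the "fail if not enumerated" check described above might be sketched as follows. This is an illustration in plain Java, not the attached patch: StorageFormatRegistry and its methods are hypothetical, class names are held as strings so the sketch compiles without Hive on the classpath, and the set of discovered SerDes is passed in rather than found by a real classpath scan.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical registry mapping a SerDe class name to its input/output formats. */
final class StorageFormatRegistry {
    /** InputFormat and OutputFormat class names for one storage format. */
    static final class Formats {
        final String inputFormat;
        final String outputFormat;

        Formats(String inputFormat, String outputFormat) {
            this.inputFormat = inputFormat;
            this.outputFormat = outputFormat;
        }
    }

    private final Map<String, Formats> bySerde = new LinkedHashMap<>();

    StorageFormatRegistry register(String serde, String inputFormat, String outputFormat) {
        bySerde.put(serde, new Formats(inputFormat, outputFormat));
        return this;
    }

    /** Registers the two formats the issue says are tested today: RCFile and ORC. */
    static StorageFormatRegistry defaults() {
        return new StorageFormatRegistry()
            .register("org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe",
                      "org.apache.hadoop.hive.ql.io.RCFileInputFormat",
                      "org.apache.hadoop.hive.ql.io.RCFileOutputFormat")
            .register("org.apache.hadoop.hive.ql.io.orc.OrcSerde",
                      "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat",
                      "org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat");
    }

    /** Returns the discovered SerDes that have no entry in the registry, sorted. */
    Set<String> findUnenumerated(Collection<String> discoveredSerdes) {
        Set<String> missing = new TreeSet<>(discoveredSerdes);
        missing.removeAll(bySerde.keySet());
        return missing;
    }

    /** Fails if any discovered SerDe is not enumerated in the registry. */
    void assertAllEnumerated(Collection<String> discoveredSerdes) {
        Set<String> missing = findUnenumerated(discoveredSerdes);
        if (!missing.isEmpty()) {
            throw new AssertionError("SerDes not enumerated for testing: " + missing);
        }
    }
}
```

In a real fixture, each registered entry would become one parameter set for HCatMapReduceTest, and the collection passed to `assertAllEnumerated` would come from whatever reflection-based scan is used to find SerDe implementations.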



--
This message was sent by Atlassian JIRA
(v6.2#6252)
