hive-dev mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-6198) ORC file and struct column names are case sensitive
Date Wed, 09 Jul 2014 10:12:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056060#comment-14056060 ]

Hive QA commented on HIVE-6198:
-------------------------------



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654773/HIVE-6198.2.patch.txt

{color:red}ERROR:{color} -1 due to 98 failed/errored test(s), 5701 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_analyze_table_null_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_limit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_union
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_decimal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_error_message
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_schema_literal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_case_sensitivity
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnarserde_create_shortcut
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl_dp
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_binary
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_boolean
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_decimal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_double
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_empty_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_long
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constant_prop
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_convert_enum_to_string
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_xpath
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_display_colstats_tbllvl
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_distinct_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_columnarserde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_dynamicserde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_lazyserde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_testxpath
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_testxpath2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_testxpath3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_testxpath4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_thrift
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadata_only_queries
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadata_only_queries_with_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_create
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_create
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_serde_reported_schema
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_invalidation
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_noscan_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_only_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_statsfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_case_thrift
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_isnull_isnotnull
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_size
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union21
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_varchar_udf1
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver_udaf_example_avg
org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver_udf_example_arraymapstruct
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadata_only_queries
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter_partitioned
org.apache.hadoop.hive.ql.parse.TestParse.testParse_case_sensitivity
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input1
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input5
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath
org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath2
org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeMapWithNullablePrimitiveValues
org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeMapsWithPrimitiveKeys
org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeRecords
org.apache.hadoop.hive.serde2.avro.TestAvroDeserializer.canDeserializeUnions
org.apache.hadoop.hive.serde2.avro.TestAvroObjectInspectorGenerator.primitiveTypesWorkCorrectly
org.apache.hadoop.hive.serde2.avro.TestAvroSerde.badSchemaURLProvidedReturnsErrorSchema
org.apache.hadoop.hive.serde2.avro.TestAvroSerde.bothPropertiesSetToNoneReturnsErrorSchema
org.apache.hadoop.hive.serde2.avro.TestAvroSerde.emptySchemaProvidedReturnsErrorSchema
org.apache.hadoop.hive.serde2.avro.TestAvroSerde.emptySchemaURLProvidedReturnsErrorSchema
org.apache.hadoop.hive.serde2.avro.TestAvroSerde.gibberishSchemaProvidedReturnsErrorSchema
org.apache.hadoop.hive.serde2.avro.TestAvroSerde.noSchemaProvidedReturnsErrorSchema
org.apache.hadoop.hive.serde2.objectinspector.TestStandardObjectInspectors.testStandardStructObjectInspector
org.apache.hadoop.hive.serde2.objectinspector.TestStandardObjectInspectors.testStandardUnionObjectInspector
org.apache.hadoop.hive.serde2.objectinspector.TestUnionStructObjectInspector.testUnionStructObjectInspector
org.apache.hive.hcatalog.common.TestHCatUtil.testGetTableSchemaWithPtnColsSerDeReportedFields
org.apache.hive.hcatalog.mapreduce.TestHCatHiveThriftCompatibility.testDynamicCols
org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/721/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/721/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-721/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 98 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654773

> ORC file and struct column names are case sensitive
> ---------------------------------------------------
>
>                 Key: HIVE-6198
>                 URL: https://issues.apache.org/jira/browse/HIVE-6198
>             Project: Hive
>          Issue Type: Bug
>          Components: CLI, File Formats
>    Affects Versions: 0.11.0, 0.12.0
>            Reporter: Viraj Bhat
>            Assignee: Navis
>         Attachments: HIVE-6198.1.patch.txt, HIVE-6198.2.patch.txt
>
>
> The HiveQL documentation states that "Table names and column names are case insensitive", but struct fields in ORC files behave differently.
> Consider a sample text file:
> {code}
> $ cat data.txt
> line1|key11:value11,key12:value12,key13:value13|a,b,c|one,two
> line2|key21:value21,key22:value22,key23:value23|d,e,f|three,four
> line3|key31:value31,key32:value32,key33:value33|g,h,i|five,six
> {code}
> Create a table stored as text, then use it to populate a table stored as ORC:
> {code}
> CREATE TABLE orig (
>   str STRING,
>   mp  MAP<STRING,STRING>,
>   lst ARRAY<STRING>,
>   strct STRUCT<A:STRING,B:STRING>
> ) ROW FORMAT DELIMITED
>     FIELDS TERMINATED BY '|'
>     COLLECTION ITEMS TERMINATED BY ','
>     MAP KEYS TERMINATED BY ':';
> LOAD DATA LOCAL INPATH 'data.txt' INTO TABLE orig;
> CREATE TABLE tableorc (
>   str STRING,
>   mp  MAP<STRING,STRING>,
>   lst ARRAY<STRING>,
>   strct STRUCT<A:STRING,B:STRING>
> ) STORED AS ORC;
> INSERT OVERWRITE TABLE tableorc SELECT * FROM orig;
> {code}
> Suppose we project columns or read the *strct* column for both table types. Here are the results. I have also tested the same with *RC*; the behavior is similar to *txt* files.
> {code}
> hive> SELECT * FROM orig;
> line1   {"key11":"value11","key12":"value12","key13":"value13"} ["a","b","c"]   {"a":"one","b":"two"}
> line2   {"key21":"value21","key22":"value22","key23":"value23"} ["d","e","f"]   {"a":"three","b":"four"}
> line3   {"key31":"value31","key32":"value32","key33":"value33"} ["g","h","i"]   {"a":"five","b":"six"}
> Time taken: 0.126 seconds, Fetched: 3 row(s)
> hive> SELECT * FROM tableorc;
> line1   {"key12":"value12","key11":"value11","key13":"value13"} ["a","b","c"]   {"A":"one","B":"two"}
> line2   {"key21":"value21","key23":"value23","key22":"value22"} ["d","e","f"]   {"A":"three","B":"four"}
> line3   {"key33":"value33","key31":"value31","key32":"value32"} ["g","h","i"]   {"A":"five","B":"six"}
> Time taken: 0.178 seconds, Fetched: 3 row(s)
> hive> SELECT strct FROM tableorc;
> {"a":"one","b":"two"}
> {"a":"three","b":"four"}
> {"a":"five","b":"six"}
> hive> SELECT strct.A FROM orig;
> one
> three
> five
> hive> SELECT strct.a FROM orig;
> one
> three
> five
> hive> SELECT strct.A FROM tableorc;
> one
> three
> five
> hive> SELECT strct.a FROM tableorc;
> FAILED: Execution Error, return code 2 from
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched: 
> Job 0: Map: 1   HDFS Read: 0 HDFS Write: 0 FAIL
> {code}
> So it seems that ORC behaves differently for struct columns. Also, why are the struct field names stored as CASE SENSITIVE for the other types? What is the standard for HiveQL with respect to structs?
> Regards
> Viraj
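
Illustrative note (not part of the report or of the attached patch): the mismatch described in the quoted issue can be modeled as the difference between an exact-match field lookup and one that normalizes case before comparing. The Python sketch below is only a toy model under that assumption. The function names and the dict-based row are hypothetical; this is not Hive's object inspector code, and it says nothing about how HIVE-6198.2.patch.txt addresses the problem.

```python
from typing import Any, Dict, Optional


def get_field_case_sensitive(struct: Dict[str, Any], name: str) -> Optional[Any]:
    """Exact-match lookup: 'a' does not find a field stored as 'A'."""
    return struct.get(name)


def get_field_case_insensitive(struct: Dict[str, Any], name: str) -> Optional[Any]:
    """Compare lowercased names, so 'a' and 'A' resolve to the same field."""
    wanted = name.lower()
    for field_name, value in struct.items():
        if field_name.lower() == wanted:
            return value
    return None


# A row shaped like the `SELECT * FROM tableorc` output in the report,
# where the struct field names come back in upper case.
orc_row = {"A": "one", "B": "two"}

exact = get_field_case_sensitive(orc_row, "a")      # no match in this model
relaxed = get_field_case_insensitive(orc_row, "a")  # matches field "A"
```

In this model the exact lookup simply finds nothing for `a`, whereas the report shows Hive failing the query at execution time. The sketch is meant only to show why `strct.A` and `strct.a` can diverge when stored names keep their original case.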



--
This message was sent by Atlassian JIRA
(v6.2#6252)
