hive-dev mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7239) Fix bug in HiveIndexedInputFormat implementation that causes incorrect query result when input backed by Sequence/RC files
Date Tue, 17 Jun 2014 04:49:02 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033432#comment-14033432 ]

Hive QA commented on HIVE-7239:
-------------------------------



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12650622/HIVE-7239.patch

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 5611 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_columnar
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_ctas
org.apache.hadoop.hive.ql.exec.tez.TestTezTask.testSubmit
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/487/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/487/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-487/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12650622

> Fix bug in HiveIndexedInputFormat implementation that causes incorrect query result when
input backed by Sequence/RC files
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7239
>                 URL: https://issues.apache.org/jira/browse/HIVE-7239
>             Project: Hive
>          Issue Type: Bug
>          Components: Indexing
>    Affects Versions: 0.13.1
>            Reporter: Sumit Kumar
>            Assignee: Sumit Kumar
>         Attachments: HIVE-7239.patch
>
>
> For sequence files, it is crucial that splits are calculated around the boundaries
enforced by the input sequence file. However, by default Hadoop creates input splits based
on configuration parameters, which may not match the boundaries of the input sequence
file. Hive provides HiveIndexedInputFormat, which adds extra logic and recalculates the
split boundaries for each split based on the sequence file's boundaries.
> However, we noticed "over" reporting of results from data backed by sequence files.
We have sample data on which we experimented and fixed this bug, and we have verified the fix
by comparing the query output for input in sequence file format, RC file format, and regular format.
However, we have not been able to find the right place to include this as a unit test that would
execute as part of the Hive tests. We tried writing a "clientpositive" test as part of the ql module,
but the output seems quite verbose and I couldn't interpret it well. Can someone please
review this change and advise on how to write a test that will execute as part of Hive testing?
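
To illustrate the boundary issue described above, here is a minimal sketch of snapping byte-range splits to sequence-file boundaries. It is a simplified model, not the code in HiveIndexedInputFormat or in HIVE-7239.patch: the function name `align_split` and the list of sync offsets are hypothetical, and the "first boundary at or after the position" rule is an assumption made for illustration. In Hadoop itself, a reader would typically locate such a boundary with SequenceFile.Reader.sync(long) rather than from a precomputed list.

```python
from bisect import bisect_left


def align_split(sync_offsets, start, end, file_len):
    """Snap a raw byte range [start, end) to sync boundaries.

    sync_offsets is a sorted list of byte offsets at which a block of
    records begins (hypothetical input, assumed to include 0).
    Each edge moves forward to the first boundary at or after it, so
    adjacent raw splits become adjacent, non-overlapping aligned splits.
    """
    def snap(pos):
        i = bisect_left(sync_offsets, pos)
        # Past the last boundary: the range runs to the end of the file.
        return sync_offsets[i] if i < len(sync_offsets) else file_len

    return snap(start), snap(end)


if __name__ == "__main__":
    sync_offsets = [0, 100, 250, 400]   # made-up boundaries
    file_len = 520
    split_size = 200                    # made-up configured split size

    for s in range(0, file_len, split_size):
        e = min(s + split_size, file_len)
        print((s, e), "->", align_split(sync_offsets, s, e, file_len))
    # (0, 200)   -> (0, 250)
    # (200, 400) -> (250, 400)
    # (400, 520) -> (400, 520)
```

Because the end of one aligned split is computed by the same rule as the start of the next, every record block falls into exactly one split. If the two edges were snapped by different rules, a block could be read by two splits, which is one way results could be "over" reported.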



--
This message was sent by Atlassian JIRA
(v6.2#6252)
