hive-dev mailing list archives

From "Hive QA (JIRA)" <>
Subject [jira] [Commented] (HIVE-8720) Update orc_merge tests to make it consistent across OS'es
Date Tue, 04 Nov 2014 06:59:34 GMT


Hive QA commented on HIVE-8720:

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:

{color:green}SUCCESS:{color} +1 6668 tests passed

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase

This message is automatically generated.

ATTACHMENT ID: 12679137 - PreCommit-HIVE-TRUNK-Build

> Update orc_merge tests to make it consistent across OS'es
> ---------------------------------------------------------
>                 Key: HIVE-8720
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>         Attachments: HIVE-8720.1.patch, orc_merge5_filedump_macosx.txt, orc_merge5_filedump_opensuse.txt
> The orc_merge*.q test cases fail with qfile diffs related to file size on different OSes.
> I have seen failures on openSUSE and CentOS. The order in which rows are inserted into an ORC
> table affects the file size because of run-length encoding. Since the order of rows is not
> guaranteed during insertion into a table, we may get different file sizes. We cannot add ORDER BY
> to the insert queries, as that would force insertion through a single reducer, which would disable
> the ORC merge file optimization. Since these test cases check whether the files are merged, it is
> sufficient to know the number of files after merging. Instead of DESCRIBE FORMATTED (which shows
> both numFiles and fileSize), we can use "dfs -ls" to count the files.
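
To illustrate why row order can change the encoded size, here is a minimal sketch using a toy run-length encoder. This is not ORC's actual encoding (which is considerably more involved); the function name `rle_encode` and the sample values are made up for illustration. It only shows that the same set of values can compress to a different number of runs depending on the order in which they arrive.

```python
def rle_encode(values):
    """Toy run-length encoder: collapse consecutive equal values
    into (value, run_length) pairs. Illustrative only, not ORC's RLE."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            # Same value as the previous one: extend the current run.
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            # Value changed: start a new run.
            runs.append((v, 1))
    return runs


# The same nine values, arriving in two different orders.
grouped = [1, 1, 1, 2, 2, 2, 3, 3, 3]
interleaved = [1, 2, 3, 1, 2, 3, 1, 2, 3]

grouped_runs = rle_encode(grouped)
interleaved_runs = rle_encode(interleaved)

# Fewer runs means a smaller encoded representation.
print(len(grouped_runs), len(interleaved_runs))
```

With the grouped order the encoder emits 3 runs; with the interleaved order it emits 9. The data is identical, so a test that compares file sizes can differ between runs while a test that compares only the file count stays stable.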

This message was sent by Atlassian JIRA
