hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
Date Wed, 30 Mar 2016 14:46:25 GMT

    [ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218069#comment-15218069 ]

Hive QA commented on HIVE-9660:
-------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12795693/HIVE-9660.01.patch

{color:green}SUCCESS:{color} +1 due to 8 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 121 failed/errored test(s), 9891 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-parallel_join0.q-union_remove_9.q-smb_mapjoin_21.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_globallimit
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnStatsUpdateForStatsOptimizer_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_full
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_extrapolate_part_stats_partial_ndv
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_analyze
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_file_dump
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_merge12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_fast_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_llap_nullscan
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llap_nullscan
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_analyze
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge10
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge11
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_merge12
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union_fast_stats
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_varchar_simple
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.ql.TestTxnCommands2.writeBetweenWorkerAndCleaner
org.apache.hadoop.hive.ql.io.orc.TestColumnStatistics.testHasNull
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testBloomFilter2
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDataDump
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold
org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testCombinationInputFormatWithAcid
org.apache.hadoop.hive.ql.io.orc.TestJsonFileDump.testJsonDump
org.apache.hadoop.hive.ql.io.orc.TestNewIntegerEncoding.testBasicRow[0]
org.apache.hadoop.hive.ql.io.orc.TestNewIntegerEncoding.testBasicRow[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.columnProjection[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.columnProjection[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.metaData[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.metaData[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.test1[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.test1[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testBitPack64Large[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testBitPack64Large[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testDate1900[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testDate2038[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testHiveDecimalAllNulls[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testHiveDecimalAllNulls[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testHiveDecimalIsNullReset[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testPredicatePushdown[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testPredicatePushdown[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testSeek[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testSnappy[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testSnappy[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testStringAndBinaryStatistics[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testStringAndBinaryStatistics[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testStripeLevelStats[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testStripeLevelStats[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testTimestamp[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testTimestamp[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testUnionAndTimestamp[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testUnionAndTimestamp[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testWithoutIndex[0]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testWithoutIndex[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcFile.testZeroCopySeek[1]
org.apache.hadoop.hive.ql.io.orc.TestOrcNullOptimization.testColumnsWithNullAndCompression
org.apache.hadoop.hive.ql.io.orc.TestOrcNullOptimization.testMultiStripeWithNull
org.apache.hadoop.hive.ql.io.orc.TestOrcNullOptimization.testMultiStripeWithoutNull
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta
org.apache.hadoop.hive.ql.io.orc.TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta
org.apache.hadoop.hive.ql.io.orc.TestOrcSplitElimination.testExternalFooterCache
org.apache.hadoop.hive.ql.io.orc.TestOrcSplitElimination.testExternalFooterCachePpd
org.apache.hadoop.hive.ql.io.orc.TestOrcSplitElimination.testSplitEliminationComplexExpr
org.apache.hadoop.hive.ql.io.orc.TestOrcSplitElimination.testSplitEliminationLargeMaxSplit
org.apache.hadoop.hive.ql.io.orc.TestOrcSplitElimination.testSplitEliminationSmallMaxSplit
org.apache.hadoop.hive.ql.io.orc.TestStringDictionary.testTooManyDistinctCheckDisabled
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.metaData
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.test1
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testColumnProjection
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testDate1900
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testDate2038
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testLists
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testMaps
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testNonDictionaryRepeatingString
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testPredicatePushdown
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testRepeating
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testSeek
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testSnappy
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testStringAndBinaryStatistics
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testStringPadding
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testStripeLevelStats
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testStructs
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testTimestamp
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testUnionAndTimestamp
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testUnions
org.apache.hadoop.hive.ql.io.orc.TestVectorOrcFile.testWithoutIndex
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7411/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7411/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7411/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 121 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12795693 - PreCommit-HIVE-TRUNK-Build

> store end offset of compressed data for RG in RowIndex in ORC
> -------------------------------------------------------------
>
>                 Key: HIVE-9660
>                 URL: https://issues.apache.org/jira/browse/HIVE-9660
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-9660.01.patch, HIVE-9660.patch, HIVE-9660.patch
>
>
> Right now the end offset is estimated, which in some cases results in tons of extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores number of compressed buffers for each RG, or end offset, or something, to remove this estimation magic



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
