hive-dev mailing list archives

From "Eugene Koifman (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-17296) Acid tests with multiple splits
Date Thu, 10 Aug 2017 23:29:00 GMT
Eugene Koifman created HIVE-17296:
-------------------------------------

             Summary: Acid tests with multiple splits
                 Key: HIVE-17296
                 URL: https://issues.apache.org/jira/browse/HIVE-17296
             Project: Hive
          Issue Type: Test
          Components: Transactions
    Affects Versions: 3.0.0
            Reporter: Eugene Koifman
            Assignee: Eugene Koifman
            Priority: Critical


Data files in an Acid table are ORC files, which may have multiple stripes.
Such files in base/ or delta/ (and original files, in non-acid to acid conversion) are
split by OrcInputFormat into multiple (stripe-sized) chunks.
There is additional logic in OrcRawRecordMerger (discoverKeyBounds/discoverOriginalKeyBounds)
that is not tested by any E2E tests, since none of them have enough data to generate multiple
stripes in a single file.
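
To illustrate the kind of logic that goes untested, here is a rough sketch of how key
bounds for a stripe-aligned split could be derived. It is not the OrcRawRecordMerger
code: the class name, the method signature and the use of plain long keys are all
assumptions made for the example. It assumes the last row key of every stripe is known,
treats the lower bound as exclusive and the upper bound as inclusive, and uses null to
mean "unbounded".

```java
public class StripeKeyBoundsSketch {

    /**
     * Hypothetical helper, not Hive's API.
     *
     * @param stripeOffsets  byte offset of each stripe in the file, ascending
     * @param stripeLastKeys last row key stored in each stripe
     * @param splitStart     first byte of the split
     * @param splitLength    number of bytes in the split
     * @return {minKey, maxKey}; minKey is exclusive, maxKey is inclusive,
     *         null means unbounded; returns null if the split owns no stripe
     */
    static Long[] discoverBounds(long[] stripeOffsets, long[] stripeLastKeys,
                                 long splitStart, long splitLength) {
        long splitEnd = splitStart + splitLength;
        int first = -1;
        int count = 0;
        // A stripe belongs to the split if its starting offset falls inside it.
        for (int i = 0; i < stripeOffsets.length; i++) {
            if (stripeOffsets[i] >= splitStart && stripeOffsets[i] < splitEnd) {
                if (first < 0) {
                    first = i;
                }
                count++;
            }
        }
        if (first < 0) {
            return null; // no stripe starts inside this split
        }
        int last = first + count - 1;
        // Everything up to the last key of the preceding stripe belongs to an earlier split.
        Long minKey = (first == 0) ? null : Long.valueOf(stripeLastKeys[first - 1]);
        // The final stripe of the file has no upper bound.
        Long maxKey = (last == stripeOffsets.length - 1) ? null : Long.valueOf(stripeLastKeys[last]);
        return new Long[] {minKey, maxKey};
    }
}
```

With a single stripe per file both bounds are always null, which is why data that fits
in one stripe never exercises this path.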

testRecordReaderOldBaseAndDelta/testRecordReaderNewBaseAndDelta/testOriginalReaderPair
in TestOrcRawRecordMerger have some logic to test this, but it really needs E2E tests.

With ORC-228 it will be possible to write such tests.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
