orc-dev mailing list archives

From alanfgates <...@git.apache.org>
Subject [GitHub] orc pull request #179: ORC-255
Date Mon, 16 Oct 2017 22:40:12 GMT
GitHub user alanfgates opened a pull request:



    This is not ready for commit.  I'm just putting it up so people can start looking at it
    and giving feedback.
    As noted in the JIRA, this only deals with ACID2 and the vector batch interface.
    This depends on an unreleased version of Hive's storage-api.  It also fails when running
    TestRecordReaderImpl due to changes in storage-api's DiskRangeList.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/alanfgates/orc orc255

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #179
commit 96026a342bf531c9c12b3cc8a127f33026cba6b9
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-09-15T18:40:01Z

    WIP Ported parsing parts of Hive's AcidUtils into AcidDirectoryParser and supporting classes.
    Haven't finished the testing yet.

commit 12477e216caee814fd3c6545a3a7c938d54369b8
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-09-26T22:57:50Z

    Finished testing AcidDirectoryParser.

commit 096072c6c6628f1bcee4ec931ec785e136e11e23
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-09-27T00:15:28Z

    Changed AcidVersionedDirectory to track txn information for files in addition to just

commit df66d52047938c239948bd04559c95d4fcac2227
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-09-27T22:45:46Z

    Moved AcidVersionedDirectory to ParsedAcidDirectory to better fit with terminology of
    AcidDirectoryParser and ParsedAcidFile.  Added ability to determine whether a given input
    file from the directory should be read and to determine which delete deltas to use for a given
    input file.  Fixed a number of bugs I found along the way.

commit 111e1308a0cbf79862e80c97f5c6ca9c78b38273
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-09-28T20:00:12Z

    Added ability to read insert files (base and normal delta).  Haven't yet done delete files.

commit b8a7e6d7da40e83d193140d565463caf83379ee1
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-09-30T00:59:09Z

    WIP, wrote the initial code for handling the deletes.  Haven't tested it yet.

commit 9146dd6020d63694e0b5773b2f092c102e78b0da
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-10-03T19:54:09Z

    Fixed a bunch of errors in delete handling.  Added unit tests for delete testing.

commit 90ff039b83c2a198b5b7117b8c554c989a374af7
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-10-04T23:37:25Z

    Went overboard on caching delete sets.  I'm going to simplify this a bunch and remove
    the caching.  But checking in now in case I change my mind and decide to go back to the caching.

commit acaabe6272e57e2bce0c9af5f74d61a2e1510709
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-10-05T00:53:30Z

    Simplified delete sets to be attached to a ParsedAcidDirectory instead of trying to cache
    them.  That leaves it up to the user to make sure there aren't too many ParsedAcidDirectories
    live in a process, each with its own DeleteSet.

commit ed77b1e89a390c2c451b821a84f4a76595ad3cda
Author: Alan Gates <gates@hortonworks.com>
Date:   2017-10-12T00:04:23Z

    Most likely useless changes.  I don't think I need the MergingAcidRecordReader.  But keeping
    it for now in case I turn out to be wrong.  It has happened before.
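
For readers unfamiliar with the directory layout these commits refer to, here is a minimal
sketch of parsing Hive-style ACID directory names (base_N, delta_MIN_MAX, delete_delta_MIN_MAX)
and picking the delete deltas that might apply to an insert directory. It is not the code in
this pull request: the class AcidDirSketch, the holder ParsedName, and the methods parse and
deleteDeltasFor are hypothetical names invented for illustration, and the selection rule (keep
any delete delta whose max txn is at or above the insert directory's min txn) is a deliberately
conservative assumption rather than the logic in ParsedAcidDirectory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only; every name in this class is hypothetical, not part of ORC or Hive. */
public class AcidDirSketch {
  public enum Kind { BASE, DELTA, DELETE_DELTA }

  /** Hypothetical holder for one parsed directory name. */
  public static final class ParsedName {
    public final String name;
    public final Kind kind;
    public final long minTxn;
    public final long maxTxn;

    ParsedName(String name, Kind kind, long minTxn, long maxTxn) {
      this.name = name;
      this.kind = kind;
      this.minTxn = minTxn;
      this.maxTxn = maxTxn;
    }
  }

  // base_<txn>
  private static final Pattern BASE = Pattern.compile("base_(\\d+)");
  // delta_<min>_<max> or delete_delta_<min>_<max>, with an optional _<stmtId> suffix
  private static final Pattern DELTA =
      Pattern.compile("(delete_)?delta_(\\d+)_(\\d+)(?:_\\d+)?");

  /** Returns the parsed name, or null if the directory is not a recognized ACID directory. */
  public static ParsedName parse(String dirName) {
    Matcher base = BASE.matcher(dirName);
    if (base.matches()) {
      // A base covers everything up to its txn, so the sketch uses 0 as its min.
      return new ParsedName(dirName, Kind.BASE, 0L, Long.parseLong(base.group(1)));
    }
    Matcher delta = DELTA.matcher(dirName);
    if (delta.matches()) {
      Kind kind = delta.group(1) == null ? Kind.DELTA : Kind.DELETE_DELTA;
      return new ParsedName(dirName, kind,
          Long.parseLong(delta.group(2)), Long.parseLong(delta.group(3)));
    }
    return null;
  }

  /**
   * Assumed rule: a delete delta may hold deletes for an insert directory when its
   * max txn is at or above the insert directory's min txn. This over-selects on purpose.
   */
  public static List<ParsedName> deleteDeltasFor(ParsedName insert, List<ParsedName> dirs) {
    List<ParsedName> result = new ArrayList<>();
    for (ParsedName d : dirs) {
      if (d.kind == Kind.DELETE_DELTA && d.maxTxn >= insert.minTxn) {
        result.add(d);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<ParsedName> dirs = new ArrayList<>();
    for (String n : Arrays.asList("base_10", "delta_11_12", "delete_delta_9_9",
        "delete_delta_13_13", "_tmp")) {
      ParsedName p = parse(n);
      if (p != null) {
        dirs.add(p);  // unrecognized names such as "_tmp" are skipped
      }
    }
    ParsedName insert = parse("delta_11_12");
    for (ParsedName d : deleteDeltasFor(insert, dirs)) {
      System.out.println(insert.name + " <- " + d.name);
    }
  }
}
```

The sketch handles only the three name shapes above. Keeping the parsed names as immutable
values separate from the selection step mirrors the split the commits describe between parsing
a directory and deciding what to read from it, but the details here are assumptions.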


