orc-dev mailing list archives

From Gopal Vijayaraghavan <gop...@apache.org>
Subject Re: Thoughts on Acid reader
Date Thu, 14 Sep 2017 03:47:43 GMT
> The first thing that strikes me is that createReader takes a file.
> But for acid, you need to pass the directory because it needs to look for any relevant
> delta files.

In the ACID 2.x impl, the InputFormat gets a directory - but a Reader should still be getting
an individual file.

In fact, it should be getting something smaller than a File (ideally an HDFS block) which
is encoded as FileSplit (path, offset, len) and a ValidTxnList.

In the original ACID impl, multiple delta files get a single Reader, while in the new ACID
2.x impl the concept of a "base file + deltas" is irrelevant. 

All insert deltas are equivalent to base files - the base concept is (1 Stripe) + (Relevant deletes)
== 1 vector reader.

There's no need to know which delete deltas were already read, unlike with the UPDATE ops (i.e. Split
#1 and Split #2 can load their deletes independently, without worrying about double row outputs).

If a delete delta which was loaded matches no rows in the input split, it has no effect on the
reader's output correctness.
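
A minimal sketch of the point above, assuming a row identity can be modeled as a (txnId, bucket, rowId) triple. RowKey and readSplit are hypothetical names used only for illustration; they are not part of the ORC or Hive APIs. Each split filters its own rows against the delete set, with no state shared between splits:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SplitDeleteSketch {
    /** Hypothetical stand-in for a row identity; not the real ROW__ID type. */
    record RowKey(long txnId, int bucket, long rowId) {}

    /**
     * Returns the rows of one split that are not in the delete set.
     * Nothing is shared between calls, so each split can load and
     * apply its deletes on its own.
     */
    static List<RowKey> readSplit(List<RowKey> splitRows, Set<RowKey> deletes) {
        List<RowKey> out = new ArrayList<>();
        for (RowKey row : splitRows) {
            if (!deletes.contains(row)) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The second delete key matches no row in either split,
        // so loading it changes nothing in the output.
        Set<RowKey> deletes = Set.of(new RowKey(1, 0, 1), new RowKey(9, 0, 0));
        List<RowKey> split1 = List.of(new RowKey(1, 0, 0), new RowKey(1, 0, 1));
        List<RowKey> split2 = List.of(new RowKey(1, 0, 2), new RowKey(1, 0, 3));
        System.out.println(readSplit(split1, deletes));
        System.out.println(readSplit(split2, deletes));
    }
}
```

Because a row lives in exactly one split, applying the same delete set to both splits cannot emit a row twice or drop a row that was not deleted.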

> I don’t like that the user has to make a different call in the acid case.
 
You need to identify within ORC whether the file provided is a base file or a delta insert
file.

If the file is a base_xx, then all deletes exist in ./delete_deltas_*/

If the file is named delta_xx, then all deletes exist in ../delete_deltas*/

Those are strictly enforced by the ACID implementation & can serve as easy assumptions.

Other than that, a non-null ValidTxnList is all it should take.
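
As a rough illustration of the naming rule above, the sketch below classifies a name by its prefix and returns the relative pattern quoted in this message. DeleteDeltaLocator, Kind, classify and deleteGlobFor are hypothetical names, and the patterns are copied verbatim from the text rather than taken from the ACID implementation:

```java
import java.util.Optional;

public class DeleteDeltaLocator {
    enum Kind { BASE, INSERT_DELTA, DELETE_DELTA, OTHER }

    /** Classifies a name purely by prefix (an assumption for this sketch). */
    static Kind classify(String name) {
        if (name.startsWith("base_")) {
            return Kind.BASE;
        }
        if (name.startsWith("delete_delta")) {
            return Kind.DELETE_DELTA;
        }
        if (name.startsWith("delta_")) {
            return Kind.INSERT_DELTA;
        }
        return Kind.OTHER;
    }

    /** Relative pattern for the deletes, exactly as written in the message. */
    static Optional<String> deleteGlobFor(String name) {
        switch (classify(name)) {
            case BASE:
                return Optional.of("./delete_deltas_*/");
            case INSERT_DELTA:
                return Optional.of("../delete_deltas*/");
            default:
                // Delete deltas and non-ACID names have no deletes to look up.
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(deleteGlobFor("base_xx"));
        System.out.println(deleteGlobFor("delta_xx"));
        System.out.println(deleteGlobFor("plain.orc"));
    }
}
```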

>   the user will already have to have split logic.

The part where the logic splits off is in the InputFormat - detecting compaction during split
generation is strictly the InputFormat's problem.

There's a bit of magic there which is in plain sight, like how the INSERT OVERWRITE works
transactionally (HIVE-14988).

For me, the clear division is to look at this problem as "Details about file names" (includes
HIVE-14535) and "Details about a Stripe" (Reader + valid-txns + deletes application).

Everything in the middle is just the same as regular ORC, like PPD.

From the other side of the mirror, the flat ORC API is pretty much an ACID read with a null ROW__ID
pruned already, with 0 deletes and a Long.MAX watermark in the ReaderOptions.
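
A small sketch of that equivalence, under simplifying assumptions: AcidOptions and isVisible are hypothetical and are not the real ORC ReaderOptions, and a real ValidTxnList also tracks open and aborted transactions, which this ignores. With an empty delete set and a Long.MAX_VALUE watermark, the visibility check passes every row, which is the flat (non-ACID) behaviour:

```java
import java.util.Set;

public class VisibilitySketch {
    /** Hypothetical options holder; NOT the real ORC ReaderOptions. */
    record AcidOptions(long highWatermark, Set<Long> deletedRowIds) {
        /** Flat read: nothing deleted, every transaction visible. */
        static AcidOptions flat() {
            return new AcidOptions(Long.MAX_VALUE, Set.of());
        }
    }

    /** A row is visible if it was written at or below the watermark and not deleted. */
    static boolean isVisible(long writeTxn, long rowId, AcidOptions opts) {
        return writeTxn <= opts.highWatermark()
                && !opts.deletedRowIds().contains(rowId);
    }

    public static void main(String[] args) {
        AcidOptions flat = AcidOptions.flat();
        AcidOptions acid = new AcidOptions(10L, Set.of(7L));
        System.out.println(isVisible(5L, 7L, flat));  // flat read keeps the row
        System.out.println(isVisible(5L, 7L, acid));  // row 7 is deleted here
        System.out.println(isVisible(11L, 8L, acid)); // written above the watermark
    }
}
```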

> implementation of Reader and RecordReader that understand acid

There's an "*" to most of the above - a reader which intends to modify the data might need
a different API, to be explicit that the ROW__ID is projected out to the user.

Cheers,
Gopal



On 9/13/17, 3:48 PM, "Alan Gates" <alanfgates@gmail.com> wrote:

    I’ve been looking at the OrcFile.createReader method and thinking about
    what I will need to do to read acid files.  The first thing that strikes me
    is that createReader takes a file.  But for acid, you need to pass the
    directory because it needs to look for any relevant delta files.  Acid also
    requires a ValidTxnList.  We can add that to the ReaderOptions.
    
    It seems the best way to do this is to add a new method
    OrcFile.createAcidReader that takes a directory.  I don’t like that the
    user has to make a different call in the acid case.  But the user will have
    to set the ValidTxnList in the reader options anyway, so the user will
    already have to have split logic.
    
    Every way I could think of for createReader to decide if it was dealing
    with an acid directory or a non-acid file seemed to create jumbled
    semantics.
    Does the user pass a directory for the acid case but a file for non-acid?
    Yuck.
    Does the user pass a base file in the acid case and the code walks up the
    path to find the relevant directory?  Seems error prone and slow.
    
    Related to this is my assumption that I will need to write a new
    implementation of Reader and RecordReader that understand acid.  This seems
    better than putting a bunch of branches into the existing code to try to
    handle both cases.
    
    Thoughts?
    
    Alan.
    


