impala-dev mailing list archives

From Tim Armstrong <>
Subject Re: [Ready for Review] IMPALA-5717: Support reading from ORC format files
Date Fri, 26 Jan 2018 17:02:03 GMT
Thank you!

I had a few higher-level questions and thoughts:

* Assuming we end up using the ORC C++ library, we probably want to manage
it in the same way that we do Avro by building it externally and then
linking against it (we use the native-toolchain project for convenience).
Importing the code seems OK at least for the initial review though.
* It sucks that the ORC library can't handle allocation failures
gracefully. I need to think about it more. One option is to contribute
fixes back to the ORC project.
* This is definitely going to conflict with (use reservations for scans). I don't
know how deterministic the ORC reader's memory requirements are, but if it
assumes it can allocate an arbitrary amount of memory, then it may be
problematic. I'll have to think about this. I don't necessarily think you
should have to do the work to switch the ORC scanner to the new paradigm.
One option would be to merge the ORC reader to a branch post-code-review, and
then I could do the necessary work to switch it to the new paradigm (since
I've already done it for all the other scanners).

On Fri, Jan 26, 2018 at 3:46 AM, Quanlong Huang <> wrote:

> Hi friends,
> I'm very excited that our ORC-support patch has passed all the tests and
> is ready for review! To ease your work, we wrote a brief document about our
> solution:
> ZbmMf6cD8YJq4x2tM0UXYPyzf0AYqe6Gc
> In short, we integrated the mature orc-reader (
> orc/tree/master/c%2B%2B) into Impala. As a first step, we only support
> reading primitive types.
> Finally, here is the review link:
> Hope this feature can be accepted! Any feedback is welcome!
> Thanks,
> Quanlong Huang
> Hulu
