impala-dev mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: [Ready for Review] IMPALA-5717: Support reading from ORC format files
Date Fri, 26 Jan 2018 18:25:48 GMT
On Fri, Jan 26, 2018 at 12:02 PM, Tim Armstrong <tarmstrong@cloudera.com>
wrote:

> Thank you!
>
> I had a few higher-level questions or thoughts:
>
> * Assuming we end up using the ORC C++ library, we probably want to manage
> it in the same way that we do Avro by building it externally and then
> linking against it (we use the native-toolchain project for convenience).
> Importing the code seems OK at least for the initial review though.
> * It sucks that the ORC library can't handle allocation failures
> gracefully. I need to think about it more. One option is to contribute
> fixes back to the ORC project.
> * This is definitely going to conflict with
> https://gerrit.cloudera.org/#/c/8966/ (use reservations for scans). I don't
> know how deterministic the ORC reader's memory requirements are, but if it
> assumes it can allocate an arbitrary amount of memory, then it may be
> problematic. I'll have to think about this. I don't necessarily think you
> should have to do the work to switch the ORC scanner to the new paradigm.
> One option would be to merge the ORC reader to a branch post-code-review and
> then I could do the necessary work to switch it to the new paradigm (since
> I've already done it for all the other scanners).
>
> On Fri, Jan 26, 2018 at 3:46 AM, Quanlong Huang <huang_quanlong@126.com>
> wrote:
>
> > Hi friends,
> >
> > I'm very excited that our ORC-support patch has passed all the tests and
> > is ready for review! To ease your work, we wrote a brief document about
> > our solution:
> > https://docs.google.com/document/d/1Lg-MmZIis-ZbmMf6cD8YJq4x2tM0UXYPyzf0AYqe6Gc
> >
> > In short, we integrated the mature orc-reader
> > (https://github.com/apache/orc/tree/master/c%2B%2B) into Impala. As a first
> > step, we only support reading primitive types.
> >
> > Finally, here is the review link: https://gerrit.cloudera.org/#/c/9134/
> >
> > Hope this feature can be accepted! Any feedback is welcome!
> >
> > Thanks,
> > Quanlong Huang
> > Hulu
> >
> >
> >
> >
>

This is really awesome and important work. It will help tear down the huge
wall wedged between Hive and Impala. That wall helps no one, and in the end
it only drives users to seek out other solutions (i.e., cloud solutions like
BigQuery, Athena, and Redshift).
