hive-dev mailing list archives

From Thejas Nair <thejas.n...@gmail.com>
Subject Re: Implementing Security for the Thrift Hive Client
Date Tue, 12 Jan 2016 18:12:00 GMT
Regarding complex types, the Hive JDBC driver currently returns them
as JSON strings.
There is a JIRA to use the complex type support in the JDBC spec:
https://issues.apache.org/jira/browse/HIVE-4253. However, that has not
been worked on. Contributions are welcome!
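
Until that JIRA is addressed, one workaround is to decode the JSON strings on the client side. The following is only a minimal sketch: the sample values and the helper name decode_complex are hypothetical, and it assumes the driver's strings are well-formed JSON, which is worth verifying for your own column types.

```python
import json


def decode_complex(value):
    """Decode a complex-typed column value that arrived as a JSON string.

    Returns None for SQL NULL (assumed to arrive as None).
    """
    if value is None:
        return None
    return json.loads(value)


# Hypothetical row, shaped like what might come back for columns of type
# ARRAY<INT>, MAP<STRING,INT> and STRUCT<name:STRING,age:INT>.
row = ('[1,2,3]', '{"a":1,"b":2}', '{"name":"x","age":30}')

tags, counts, person = (decode_complex(v) for v in row)
```

After decoding, arrays are Python lists, and maps and structs are dicts, so fields can be read with ordinary indexing such as person["name"].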


On Fri, Jan 8, 2016 at 8:14 PM, Srinivas M <smudigonda@gmail.com> wrote:
> Thanks for the inputs. It turned out that the scalability and performance
> issues that I ran into are due to the driver. It was a JDBC driver from a
> vendor.
>
> I am reasonably convinced that JDBC might be a better approach. How about
> support for the complex data types? I do not see any JDBC drivers that
> support any of the complex data types such as maps, arrays, structs, etc.
> Is this planned for the Apache JDBC drivers?
>
> Regards
>
> On 5 Jan 2016 02:42, "Prasad Mujumdar" <prasadm@apache.org> wrote:
>>
>>   I would strongly second Thejas's points. A custom thrift client
>> application won't help you with SQL compatibility, and perf/scalability
>> would also be doubtful. It would be a bit of work to build the client and,
>> more importantly, to maintain it. Note that the RPC and authentication
>> mechanisms could change in the future, and you'll have to handle those
>> changes yourself for the next upgrade.
>>
>> thanks
>> Prasad
>>
>>
>> On Mon, Jan 4, 2016 at 11:32 AM, Thejas Nair <thejas.nair@gmail.com>
>> wrote:
>>
>> > Can you please elaborate on the scalability/performance issues you
>> > faced ? What kind of scale do you need and what did you see in your
>> > results ?
>> > Why do you feel switching to direct thrift api  is likely to help with
>> > performance ?  The jdbc driver is also using that api. You seem to be
>> > effectively reinventing the wheel by writing another stripped down
>> > jdbc/odbc driver. It would be better to channel your efforts in fixing
>> > the existing jdbc driver, so that you don't end up duplicating a lot
>> > of work.
>> >
>> >
>> > Regarding SQL standards compliance, Hive 1.2.1 has made a lot of
>> > progress in that regard. Just like any other SQL database,
>> > Hive happens to have many extensions to the standard.
>> >
>> >
>> > On Wed, Dec 30, 2015 at 6:52 PM, Srinivas M <smudigonda@gmail.com>
>> > wrote:
>> > > Thanks, Thejas, for the prompt response.
>> > >
>> > > I should have elaborated on the question a bit more. When I said ODBC /
>> > > JDBC is not an option, what I meant was that we have existing ODBC /
>> > > JDBC applications, and they generate SQL statements adhering to the
>> > > SQL92 standard. Since HiveQL differs from it and supports syntax for
>> > > Load etc., reuse of the existing ODBC / JDBC applications may not be
>> > > feasible. But for writing a new module specific to Hive, we can still
>> > > use any interface (ODBC / JDBC / Hive Thrift), and we are currently
>> > > weighing the best options.
>> > >
>> > > Here are the reasons for considering Thrift Hive.
>> > >
>> > > 1. While evaluating different JDBC drivers, we ran into scalability
>> > > and performance issues with the JDBC drivers.
>> > > 2. During the evaluation of the JDBC drivers, we ran into situations
>> > > where no single driver supported all the features we were looking for.
>> > > 3. The expectation was that the thrift API would provide better
>> > > performance than JDBC / ODBC.
>> > >
>> > > Need to implement the following capabilities
>> > > 1. Kerberos authentication
>> > > 2. Support to generate native HiveQL and partitioning capabilities.
>> > > 3. Performance is one of the key criteria.
>> > >
>> > > Based on the above, after evaluating some of the ODBC / JDBC drivers,
>> > > and having seen some performance and functionality issues, I am
>> > > exploring the Thrift Hive client API on the assumption that some of
>> > > the current shortfalls of the ODBC / JDBC drivers can be addressed by
>> > > using the client API. The initial hiccup that I ran into is with
>> > > authentication.
>> > >
>> > > Hope this gives enough background.
>> > >
>> > > If I can get better throughput using the JDBC option itself, then I
>> > > can still consider using JDBC as opposed to Thrift Hive. Right now, I
>> > > am evaluating the available options.
>> > >
>> > > Regards
>> > >
>> > >
>> > >
>> > >
>> > > On Wed, Dec 30, 2015 at 11:45 PM, Thejas Nair <thejas.nair@gmail.com> wrote:
>> > >
>> > >> Srinivas,
>> > >> Can you please elaborate on why ODBC/JDBC is not an option ?
>> > >> I didn't understand what you meant by "unable to generate the HiveQL"
>> > >> with those options. How does using thrift api directly help in that
>> > >> case ?
>> > >>
>> > >> ODBC/JDBC is the preferred API for users. There are many features
>> > >> implemented in those layers, including the security and also high
>> > >> availability features.
>> > >> Incorrect use of the thrift api can potentially lead to other issues
>> > >> like memory leaks in HiveServer2.
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> On Wed, Dec 30, 2015 at 8:37 AM, Srinivas M <smudigonda@gmail.com> wrote:
>> > >> > Hi
>> > >> >
>> > >> > I am trying to develop a custom application using the Thrift Hive
>> > >> > client interface to access Hive and read and write Hive tables.
>> > >> > ODBC and JDBC are not an option because of the inherent limitations
>> > >> > of those interfaces (i.e., unable to generate the HiveQL, etc.).
>> > >> >
>> > >> > While using the Thrift Hive interface, the first challenge that I
>> > >> > ran into is with authentication. I was able to connect to Hive only
>> > >> > when I set the authentication of HiveServer2 to NOSASL.
>> > >> > My application should be able to provide Kerberos authentication
>> > >> > and all other authentication mechanisms supported by Hive.
>> > >> >
>> > >> > I tried to look for samples of implementing authentication in the
>> > >> > Thrift Hive client, but I could not find much detail on that.
>> > >> > Can someone help me understand the options available for
>> > >> > implementing authentication while using the Thrift Hive client
>> > >> > interfaces?
>> > >> >
>> > >> > --
>> > >> > Srinivas
>> > >> > (*-*)
>> > >> >
>> > >> > ------------------------------------------------------------------
>> > >> > You have to grow from the inside out. None can teach you, none can
>> > >> > make you spiritual.
>> > >> >                       -Narendra Nath Dutta(Swamy Vivekananda)
>> > >> > ------------------------------------------------------------------
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Srinivas
>> > > (*-*)
>> > >
>> >
