hadoop-common-dev mailing list archives

From "Raghotham Murthy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4101) Support JDBC connections for interoperability between Hive and RDBMS
Date Mon, 03 Nov 2008 22:20:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644829#action_12644829 ]

Raghotham Murthy commented on HADOOP-4101:
------------------------------------------

Michi and I were discussing this over the weekend. Here's our current thinking about the design.
Michi, pls confirm.

1. implement a thrift client/server for hive. for now, the interface consists only of execute
and fetch_row. we were able to set up the framework with a thrift server and a java client
which talks to the server. next step is to get the server to run the queries. 
notes: we looked at the metastore code and thought it might be simpler to first implement
a separate thrift client/server before merging it with the metastore. some installations might
want to have separate instances of metastore and hive server. and, it's easier to test a smaller
interface where we understand the code. also, metastore code seems to have classes which aren't
being used at all and the scripts to start/stop metastore don't really work in non-facebook
installations (need to file separate jiras for those).
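
As a rough illustration of how small the two-call interface in item 1 is, here is a minimal Java sketch. It does not use Thrift at all: HiveServiceSketch and CannedHiveService are hypothetical names, the server is replaced by canned in-memory results, and the convention that fetchRow returns null once the rows run out is an assumption (the comment above does not say how the end of a result set is signalled).

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical Java view of the two-call interface (execute, fetch_row).
interface HiveServiceSketch {
    // Run a query on the server side.
    void execute(String query);

    // Return the next result row, or null when no rows remain (assumption).
    String fetchRow();
}

// In-memory stand-in for the server: it looks up canned rows instead of
// actually running the query.
class CannedHiveService implements HiveServiceSketch {
    private final Map<String, List<String>> canned;
    private final Queue<String> pending = new ArrayDeque<>();

    CannedHiveService(Map<String, List<String>> canned) {
        this.canned = canned;
    }

    @Override
    public void execute(String query) {
        // A new execute discards rows left over from the previous query.
        pending.clear();
        pending.addAll(canned.getOrDefault(query, List.of()));
    }

    @Override
    public String fetchRow() {
        // poll() returns null when the queue is empty.
        return pending.poll();
    }
}
```

A client would call execute once and then call fetchRow in a loop until it gets null. A stand-in like this also makes it possible to test client code without a running server.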

2. build a jdbc interface which makes calls to the generated java thrift client. we could
also have python and perl dbi interfaces which can make calls to the generated thrift client
code in those languages. so, the thrift interface is a generic interface which is not specific
to any particular standard (jdbc/dbi etc).
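
To make the layering in item 2 concrete, here is a minimal Java sketch of a JDBC-flavored cursor sitting on top of an execute/fetch_row client. Everything in it is an assumption for illustration: RowFetcher stands in for the generated Thrift client, FixedRowFetcher is a test double, rows are assumed to be tab-delimited strings, and HiveRowCursor only mirrors the next()/getString(int) behaviour of java.sql.ResultSet without implementing that interface.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Stand-in for the generated Thrift client (hypothetical).
interface RowFetcher {
    void execute(String query);

    // Next row as a tab-delimited string, or null when done (assumption).
    String fetchRow();
}

// Test double that returns a fixed list of rows for any query.
class FixedRowFetcher implements RowFetcher {
    private final List<String> rows;
    private Iterator<String> it = Collections.emptyIterator();

    FixedRowFetcher(List<String> rows) {
        this.rows = rows;
    }

    @Override
    public void execute(String query) {
        it = rows.iterator();
    }

    @Override
    public String fetchRow() {
        return it.hasNext() ? it.next() : null;
    }
}

// JDBC-style cursor: next() advances to the following row, getString(int)
// reads a column of the current row using 1-based indexes as JDBC does.
class HiveRowCursor {
    private final RowFetcher client;
    private String[] current;

    HiveRowCursor(RowFetcher client, String query) {
        this.client = client;
        client.execute(query);
    }

    boolean next() {
        String row = client.fetchRow();
        // A limit of -1 keeps trailing empty columns.
        current = (row == null) ? null : row.split("\t", -1);
        return current != null;
    }

    String getString(int columnIndex) {
        if (current == null) {
            throw new IllegalStateException("no current row");
        }
        return current[columnIndex - 1];
    }
}
```

The cursor only depends on the two generic calls, so the same client could equally back a perl or python dbi wrapper, which is the point of keeping the thrift interface independent of any one standard.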

3. the directory structure in the code would be as follows in src/contrib/hive. it follows
a similar model to metastore.

service/if/hive_service.thrift
service/include/<headers from thrift>
service/fb303/<scripts for service_ctrl to manage server>
service/src/gen-javabean/<generated java code>
service/src/gen-php/<generated php>
service/src/gen-py/<generated python>
service/src/gen-perl/<generated perl>
service/src/scripts/<ctrl scripts for server>
service/src/java/org/apache/hadoop/hive/service/HiveServer.java
service/src/java/org/apache/hadoop/hive/service/HiveClient.java
jdbc/src/java/org/apache/hadoop/hive/jdbc/<whatever is in current jdbc patch>
dbi/<perl dbi interface calling service/src/gen-perl>
cli/<changed to use HiveClient or HiveJdbc>

4. next steps
a. get server to run queries and return results to client.
b. move ql/Driver.java to service since the actual running of the query is not really part
of the query language.
c. change cli to use the service
d. verify which parts of the metastore interface are needed by jdbc and move/copy over parts
to hive_service - i don't think it makes sense to do it the other way around i.e. put the hive
service into metastore since metastore is not the right abstraction to actually run queries.
e. there is common thrift code in metastore and service. we should either move it to a separate
thrift directory or make metastore use stuff from service.

It will be good to meet up to discuss them in more detail. I'll let Michi provide a patch
for the hive server/client and jdbc wrappers for the hive client.

> Support JDBC connections for interoperability between Hive and RDBMS
> --------------------------------------------------------------------
>
>                 Key: HADOOP-4101
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4101
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/hive
>            Reporter: YoungWoo Kim
>            Priority: Minor
>         Attachments: hadoop-4101.1.patch
>
>
> In many DW and BI systems, the data are currently stored in an RDBMS such as oracle, mysql,
> postgresql ... for reporting, charting, etc.
> It would be useful to be able to import data from RDBMS and export data to RDBMS using
> JDBC connections.
> If Hive supports JDBC connections, it will be much easier to use 3rd party DW/BI tools.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

