hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dmitry Zagorulkin <zagorulki...@me.com>
Subject HIVE-1555 discussion
Date Tue, 22 Nov 2016 11:16:02 GMT
Hello!

I've implemented simple solution with some hard code by now.
It's tested with oracle database.

{code:sql}
beeline> !connect jdbc:hive2://localhost:10000
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000:
Enter password for jdbc:hive2://localhost:10000:
Connected to: Apache Hive (version 2.2.0-SNAPSHOT)
Driver: Hive JDBC (version 2.2.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> SET hive.metastore.warehouse.dir=${env:HOME}/Documents/hive-warehouse;
No rows affected (0.158 seconds)
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> CREATE EXTERNAL TABLE books3 (
. . . . . . . . . . . . . . . .>  book_id            INT,
. . . . . . . . . . . . . . . .>  book_name          STRING,
. . . . . . . . . . . . . . . .>  author_name        STRING,
. . . . . . . . . . . . . . . .>  book_isbn          STRING
. . . . . . . . . . . . . . . .> )
. . . . . . . . . . . . . . . .> STORED BY "org.apache.hive.storagehandler.JDBCStorageHandler"
. . . . . . . . . . . . . . . .> TBLPROPERTIES (
. . . . . . . . . . . . . . . .>  "mapred.jdbc.driver.class" = "oracle.jdbc.OracleDriver",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.url" = "jdbc:oracle:thin:@//localhost:49161/XE",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.username" = "*",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.password" = "*",
. . . . . . . . . . . . . . . .>  "hive.jdbc.update.on.duplicate" = "true",
. . . . . . . . . . . . . . . .>  "mapreduce.jdbc.input.table.name" = "books"
. . . . . . . . . . . . . . . .> );
No rows affected (2.297 seconds)
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> select * from books3;
+-----------------+-------------------+---------------------+-------------------+
| books3.book_id  | books3.book_name  | books3.author_name  | books3.book_isbn  |
+-----------------+-------------------+---------------------+-------------------+
| 124123          | name              | author              | 132321adsaf31     |
| 13              | name2             | author2             | asd213fadsf       |
| 2345236         | name3             | author3             | asdfds1234123     |
+-----------------+-------------------+---------------------+-------------------+
3 rows selected (2.146 seconds)
0: jdbc:hive2://localhost:10000> explain select * from books3;
+----------------------------------------------------+
|                      Explain                       |
+----------------------------------------------------+
| STAGE DEPENDENCIES:                                |
|   Stage-0 is a root stage                          |
|                                                    |
| STAGE PLANS:                                       |
|   Stage: Stage-0                                   |
|     Fetch Operator                                 |
|       limit: -1                                    |
|       Processor Tree:                              |
|         TableScan                                  |
|           alias: books3                            |
|           Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE |
|           Select Operator                          |
|             expressions: book_id (type: string), book_name (type: string), author_name (type:
string), book_isbn (type: string) |
|             outputColumnNames: _col0, _col1, _col2, _col3 |
|             Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
|
|             ListSink                               |
|                                                    |
+----------------------------------------------------+
17 rows selected (0.508 seconds)
{code}

This solution works with two steps:
	1. First grab all meta info from external table 
	2. Configure DBInputFormat, DBOutputFormat with table meta

What do you think about to ask user specify all needed information about columns and types
inside serde properties section?

Smth like that:

0: jdbc:hive2://localhost:10000> CREATE EXTERNAL TABLE books3 (
. . . . . . . . . . . . . . . .>  book_id            INT,
. . . . . . . . . . . . . . . .>  book_name          STRING,
. . . . . . . . . . . . . . . .>  author_name        STRING,
. . . . . . . . . . . . . . . .>  book_isbn          STRING
. . . . . . . . . . . . . . . .> )
. . . . . . . . . . . . . . . .> STORED BY “org.apache.hive.storagehandler.JDBCStorageHandler"
		WITH SERDEPROPERTIES (
					"hive.jdbc.columns.mapping" = “book_id:int(32), book_name:varchar(20), author_name:varchar(20),
book_isbn:varchar(20)")		
. . . . . . . . . . . . . . . .> TBLPROPERTIES (
. . . . . . . . . . . . . . . .>  "mapred.jdbc.driver.class" = "oracle.jdbc.OracleDriver",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.url" = "jdbc:oracle:thin:@//localhost:49161/XE",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.username" = "*",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.password" = "*",
. . . . . . . . . . . . . . . .>  "hive.jdbc.update.on.duplicate" = "true",
. . . . . . . . . . . . . . . .>  "mapreduce.jdbc.input.table.name" = "books"
. . . . . . . . . . . . . . . .> );


 


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message