From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/HiveODBC" by EricHwang
Date Fri, 07 Aug 2009 02:04:01 GMT
The following page has been changed by EricHwang:

New page:
== Hive ODBC Driver ==
The Hive ODBC Driver is a software library that implements the Open Database Connectivity
(ODBC) API standard for the Hive database management system, enabling ODBC-compliant applications
to interact with Hive through a standard interface.

=== Suggested Reading ===
This guide assumes you are already familiar with the following:
 * [wiki:Hive Hive]
 * [wiki:Hive/HiveServer Hive Server]
 * [http://wiki.apache.org/thrift/ Thrift]
 * [http://msdn.microsoft.com/en-us/library/ms714177(VS.85).aspx ODBC API]
 * [http://www.unixodbc.org/ unixODBC]

=== Software Requirements ===
The following software components are needed for the successful compilation and operation
of the Hive ODBC driver:
 * '''Hive Server''' - a service through which clients may remotely issue Hive commands and
requests. The Hive ODBC driver depends on Hive Server to perform the core set of database
interactions. Hive Server is built as part of the Hive build process. More information regarding
Hive Server usage can be found [wiki:Hive/HiveServer here].
 * '''Apache Thrift''' - a scalable cross-language software framework that enables the Hive
ODBC driver (specifically the Hive client) to communicate with the Hive Server. See here for
the details on [http://wiki.apache.org/thrift/ThriftInstallation Thrift Installation]. The
Hive ODBC driver was developed against Thrift trunk revision r790732, but later revisions
should also work. Make sure to note the Thrift install path during the Thrift build process,
as it will be needed when building the Hive client. The Thrift install path will be referred
to as THRIFT_HOME.
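As a convenience, the Thrift install path can be captured in an environment variable so it is at hand during the later build steps. A minimal sketch, assuming a hypothetical prefix of {{{/usr/local/thrift}}}:

```shell
# Hypothetical path -- substitute the --prefix used when configuring Thrift.
THRIFT_HOME=/usr/local/thrift
export THRIFT_HOME
echo "Thrift install path: $THRIFT_HOME"
```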

=== Driver Architecture ===
Internally, the Hive ODBC Driver contains two separate components: the Hive client and the
unixODBC API wrapper.
 * '''Hive client''' - provides a set of C-compatible library functions for interacting with
Hive Server in a manner similar to that dictated by the ODBC specification. However, the Hive
client was designed to be independent of unixODBC and any ODBC-specific headers, allowing it
to be used in any number of generic cases beyond ODBC.
 * '''unixODBC API wrapper''' - provides a layer on top of the Hive client that directly implements
the ODBC API standard. The unixODBC API wrapper is compiled into a shared object library,
which is the final form of the Hive ODBC driver. This portion will remain a file attachment
on the associated JIRA ([https://issues.apache.org/jira/browse/HIVE-187 HIVE-187]) until it
can be checked into the unixODBC code repository.

NOTE: The Hive client needs to be built and installed before the unixODBC API wrapper can be compiled.

==== Hive Client Build/Setup ====
In order to build the Hive client:
 1. Check out and set up the latest version of Apache Hive. For more details, see [wiki:Hive/GettingStarted
Getting Started with Hive]. From this point onwards, the path to the Hive root directory will
be referred to as HIVE_HOME.
 1. Build the Hive client by running the following command from HIVE_HOME. This will compile
and copy the libraries and header files to {{{HIVE_HOME/build/odbc/}}}. Please keep in mind
that all paths should be fully specified (no relative paths).
 $ ant compile-cpp -Dthrift.home=<THRIFT_HOME>
 You can optionally force Hive client to compile into a non-native bit architecture by specifying
the additional parameter (assuming you have the proper compilation libraries):
 $ ant compile-cpp -Dthrift.home=<THRIFT_HOME> -Dword.size=<32 or 64>
 You can verify the compilation by running the Hive client test suite with the following command
from {{{HIVE_HOME/odbc/}}}. NOTE: the Hive client tests require a local Hive Server running
on port 10000.
 $ ant test
 1.#3 To install the Hive client libraries onto your machine, run the following command from
{{{HIVE_HOME/odbc/}}}. NOTE: The install path defaults to {{{/usr/local}}}, but this can be
changed by setting the {{{INSTALL_PATH}}} environment variable to a desired alternative.
 $ ant install -Dthrift.home=<THRIFT_HOME>

==== unixODBC API Wrapper Build/Setup ====
After you have built and installed the Hive client, you can now install the unixODBC API wrapper:
 1. In the unixODBC root directory, run the following command:
 $ ./configure --enable-gui=no --prefix=<unixODBC_INSTALL_DIR>
 If you encounter the errors "{{{redefinition of 'struct _hist_entry'}}}" or "{{{previous
declaration of 'add_history' was here}}}", then re-run configure with the following:
 $ ./configure --enable-gui=no --enable-readline=no --prefix=<unixODBC_INSTALL_DIR>
 1.#2 Compile the unixODBC API wrapper with the following:
 $ make
 To force the compilation of the unixODBC API wrapper into a non-native bit architecture,
modify the CC and CXX environment variables to include the appropriate flags. For example:
 $ CC="gcc -m32" CXX="g++ -m32" make
 1.#3 If you want to completely install unixODBC and all related drivers:
  a. Run the following from the unixODBC root directory:
  $ make install
  a.#2 If your system complains about {{{undefined symbols}}} during unixODBC testing (such
as with {{{isql}}} or {{{odbcinst}}}) after installation, try running {{{ldconfig}}} to update
your library catalog.
 1.#4 If you only want to obtain the Hive ODBC driver shared object library:
  a. After compilation, the driver will be located at {{{<unixODBC_BUILD_DIR>/Drivers/hive/.libs/libodbchive.so.1.0.0}}}.
  a. This may be copied to any other location as desired. Keep in mind that the Hive ODBC
driver has a dependency on the Hive client shared object library: {{{libhiveclient.so}}}.
  a. You can manually install the unixODBC API wrapper by doing the following:
  $ cp <unixODBC_BUILD_DIR>/Drivers/hive/.libs/libodbchive.so.1.0.0 <SYSTEM_INSTALL_DIR>
  $ ln -s libodbchive.so.1.0.0 libodbchive.so
  $ ldconfig
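The manual install steps above can be sketched as a script. To keep it safe to run anywhere, temporary directories stand in for {{{<unixODBC_BUILD_DIR>}}} and {{{<SYSTEM_INSTALL_DIR>}}}, and the driver file is an empty placeholder; on a real system you would finish with {{{ldconfig}}} (typically as root):

```shell
#!/bin/sh
# Simulated manual install of the Hive ODBC driver shared object.
# Temp dirs stand in for <unixODBC_BUILD_DIR> and <SYSTEM_INSTALL_DIR>.
set -e
BUILD_DIR=$(mktemp -d)
SYSTEM_INSTALL_DIR=$(mktemp -d)
mkdir -p "$BUILD_DIR/Drivers/hive/.libs"
touch "$BUILD_DIR/Drivers/hive/.libs/libodbchive.so.1.0.0"  # placeholder file
cp "$BUILD_DIR/Drivers/hive/.libs/libodbchive.so.1.0.0" "$SYSTEM_INSTALL_DIR"
cd "$SYSTEM_INSTALL_DIR"
ln -s libodbchive.so.1.0.0 libodbchive.so
# ldconfig   # run this step on a real system install (requires root)
ls libodbchive.so libodbchive.so.1.0.0
```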

=== Connecting the Driver to a Driver Manager ===
This portion assumes that you have already built and installed both the Hive client and the
unixODBC API wrapper shared libraries on the current machine. To connect the Hive ODBC driver
to a previously installed Driver Manager (such as the one provided by unixODBC, or one bundled
with an application):
 1. Locate the odbc.ini file associated with the Driver Manager (DM):
  a. If you are installing the driver on the system DM, then you can run the following command
to print the locations of DM configuration files.
  $ odbcinst -j
  unixODBC 2.2.14
  DRIVERS............: /usr/local/etc/odbcinst.ini
  SYSTEM DATA SOURCES: /usr/local/etc/odbc.ini
  FILE DATA SOURCES..: /usr/local/etc/ODBCDataSources
  USER DATA SOURCES..: /home/ehwang/.odbc.ini
  SQLULEN Size.......: 8
  SQLLEN Size........: 8
  a.#2 If you are installing the driver on an application DM, then you have to help yourself
on this one ;). Hint: try looking in the installation directory of your application.
   i. Keep in mind that an application's DM can exist simultaneously with the system DM and
will likely use its own configuration files, such as odbc.ini.
   i. Also, note that some applications do not have their own DMs and simply use the system DM.
 1. Add the following configuration entry to the DM's corresponding odbc.ini (the section
name, {{{[Hive]}}} here, becomes the data source name used when connecting):
 [Hive]
 Driver = <path_to_libodbchive.so>
 Description = Hive Driver v1
 DATABASE = default
 HOST = <Hive_server_address>
 PORT = <Hive_server_port>
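For example, assuming a driver installed at {{{/usr/local/lib/libodbchive.so}}} and a Hive Server running locally on the default port 10000 (both hypothetical values), a complete entry might look like:

{{{
[Hive]
Driver = /usr/local/lib/libodbchive.so
Description = Hive Driver v1
DATABASE = default
HOST = localhost
PORT = 10000
}}}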

=== Testing with ISQL ===
Once you have installed the necessary Hive ODBC libraries and added a Hive entry in your system's
default odbc.ini, you will be able to interactively test the driver with isql:
$ isql -v Hive
If your system does not have isql, you can obtain it by installing the entirety of unixODBC.

=== Current Status ===
 * Limitations:
  * No support for Unicode
  * Not thread safe
  * No support for asynchronous execution of queries
  * Does not check for memory allocation errors
 * ODBC API Function Support:
  * SQLAllocConnect - supported
  * SQLAllocEnv - supported
  * SQLAllocHandle - supported
  * SQLAllocStmt - supported
  * SQLBindCol - supported
  * SQLBindParameter - NOT supported
  * SQLCancel - NOT supported
  * SQLColAttribute - supported
  * SQLColumns - supported
  * SQLConnect - supported
  * SQLDescribeCol - supported
  * SQLDescribeParam - NOT supported
  * SQLDisconnect - supported
  * SQLDriverConnect - supported
  * SQLError - supported
  * SQLExecDirect - supported
  * SQLExecute - supported
  * SQLExtendedFetch - NOT supported
  * SQLFetch - supported
  * SQLFetchScroll - NOT supported
  * SQLFreeConnect - supported
  * SQLFreeEnv - supported
  * SQLFreeHandle - supported
  * SQLFreeStmt - supported
  * SQLGetConnectAttr - NOT supported
  * SQLGetData - supported (however, SQLSTATE not returning values)
  * SQLGetDiagField - NOT supported
  * SQLGetDiagRec - supported
  * SQLGetInfo - NOT supported; a limited version may be provided
  * SQLMoreResults - NOT supported
  * SQLNumParams - NOT supported
  * SQLNumResultCols - supported
  * SQLParamOptions - NOT supported
  * SQLPrepare - supported, but does not permit parameter markers
  * SQLRowCount - NOT supported
  * SQLSetConnectAttr - NOT supported
  * SQLSetConnectOption - NOT supported
  * SQLSetEnvAttr - limited support
  * SQLSetStmtAttr - NOT supported
  * SQLSetStmtOption - NOT supported
  * SQLTables - supported
  * SQLTransact - NOT supported
