hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/AdminManual/MetastoreAdmin" by PrasadChakka
Date Thu, 22 Jan 2009 01:35:19 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by PrasadChakka:
http://wiki.apache.org/hadoop/Hive/AdminManual/MetastoreAdmin

New page:
== Hive Metastore ==

=== Introduction ===

All the metadata for Hive tables and partitions is stored in the Hive Metastore. Metadata is
persisted using the [http://www.datanucleus.org/ JPOX] ORM solution, so any datastore that JPOX
supports can be used. Most commercial relational databases and many open source datastores
are supported. Any datastore that has a JDBC driver can probably be used.

There are 3 different ways to set up a metastore server using different Hive configurations.
The relevant configuration parameters are:

|| Config Param || Description ||
|| javax.jdo.option.ConnectionURL || JDBC connection string for the data store which contains metadata ||
|| javax.jdo.option.ConnectionDriverName || JDBC Driver class name for the data store which contains metadata ||
|| hive.metastore.uris || Hive connects to this URI to make metadata requests for a remote metastore ||
|| hive.metastore.local || whether the metastore is local (true) or remote (false) ||
|| hive.metastore.warehouse.dir || URI of the default location for native tables ||

The default configuration sets up an embedded metastore, which is used in unit tests and is
described in the next section. More practical options are described in the subsequent sections.

=== Embedded Metastore ===
The embedded metastore is mainly used for unit tests. Only one process can connect to the
metastore at a time, so it is not a practical solution for real deployments, but it works well
for unit tests.
|| Config Param || Config Value || Comment ||
|| javax.jdo.option.ConnectionURL || jdbc:derby:;databaseName=../build/test/junit_metastore_db;create=true || Derby database located at hive/trunk/build... ||
|| javax.jdo.option.ConnectionDriverName || org.apache.derby.jdbc.EmbeddedDriver || Derby embedded JDBC driver class ||
|| hive.metastore.uris |||| not needed since this is a local metastore ||
|| hive.metastore.local || true || embedded is local ||
|| hive.metastore.warehouse.dir || file://${user.dir}/../build/ql/test/data/warehouse || unit test data goes in here ||

If you want to run the metastore as a network server so that it can be accessed from multiple
nodes, try HiveDerbyServerMode.

=== Local Metastore ===
In a local metastore setup, each Hive Client opens a connection to the datastore and makes
SQL queries against it. The following config will set up a metastore in a MySQL server. Make
sure that the server is accessible from the machines where Hive queries are executed, since
this is a local store.

|| Config Param || Config Value || Comment ||
|| javax.jdo.option.ConnectionURL || jdbc:mysql://<host name>/<database name>?createDatabaseIfNotExist=true || metadata is stored in a MySQL server ||
|| javax.jdo.option.ConnectionDriverName || com.mysql.jdbc.Driver || MySQL JDBC driver class ||
|| javax.jdo.option.ConnectionUserName || <user name> || user name for connecting to mysql server ||
|| javax.jdo.option.ConnectionPassword || <password> || password for connecting to mysql server ||
|| hive.metastore.uris |||| not needed because this is local store ||
|| hive.metastore.local || true || this is local store ||
|| hive.metastore.warehouse.dir || <base hdfs path> || default location for Hive tables. ||
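
As an illustration, the table above could be written as the following fragment, assuming the
properties are set in a Hadoop-style XML configuration file such as conf/hive-site.xml. The host
name, database name, user name, password, and warehouse path shown here are hypothetical
placeholders, not values from this page; substitute your own.

{{{
<configuration>
  <!-- Hypothetical host (db.example.com) and database (hive_metastore). -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://db.example.com/hive_metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <!-- Hypothetical credentials. -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>changeme</value>
  </property>
  <!-- Local store: each Hive Client talks to the datastore directly. -->
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <!-- Hypothetical base HDFS path for Hive tables. -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
}}}

hive.metastore.uris is left out because, as noted in the table, it is not needed for a local store.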

=== Remote Metastore ===
In a remote metastore setup, all Hive Clients make a connection to a metastore server, which
in turn queries the datastore (MySQL in this example) for metadata. The metastore server and
client communicate using the [http://incubator.apache.org/thrift Thrift] protocol. You can start
a Thrift server by executing the following command:

{{{
$JAVA_HOME/bin/java -Xmx1024m \
  -Dlog4j.configuration=file://$HIVE_HOME/conf/hms-log4j.properties \
  -Djava.library.path=$HADOOP_HOME/lib/native/Linux-amd64-64/ \
  -cp $CLASSPATH org.apache.hadoop.hive.metastore.HiveMetaStore
}}}

JAVA_HOME, HIVE_HOME, and HADOOP_HOME should be set correctly. CLASSPATH should contain the
Hadoop, Hive (lib and auxlib), and Java jars.

|| Config Param || Config Value || Comment ||
|| javax.jdo.option.ConnectionURL || jdbc:mysql://<host name>/<database name>?createDatabaseIfNotExist=true || metadata is stored in a MySQL server ||
|| javax.jdo.option.ConnectionDriverName || com.mysql.jdbc.Driver || MySQL JDBC driver class ||
|| javax.jdo.option.ConnectionUserName || <user name> || user name for connecting to mysql server ||
|| javax.jdo.option.ConnectionPassword || <password> || password for connecting to mysql server ||
|| hive.metastore.uris || thrift://hadoop5002.snc1.facebook.com:9083 || host and port for the thrift metastore server ||
|| hive.metastore.local || false || this is remote store ||
|| hive.metastore.warehouse.dir || <base hdfs path> || default location for Hive tables. ||
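
A minimal sketch of the client side of a remote setup, assuming the properties are set in a
Hadoop-style XML configuration file such as conf/hive-site.xml. It also assumes that a client
talking to a remote metastore server sets hive.metastore.local to false and needs only the
server URI, while the JDBC properties stay on the machine that runs the metastore server. The
host name below is hypothetical; port 9083 is the one used in the table above.

{{{
<configuration>
  <!-- Hypothetical metastore server host; replace with your own. -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore.example.com:9083</value>
  </property>
  <!-- Assumption: false tells the client to use the remote Thrift server. -->
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
</configuration>
}}}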

If you are using MySQL as the datastore for metadata, put the MySQL client libraries in
HIVE_HOME/lib before starting the Hive Client or the HiveMetastore Server.
