hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "HiveDerbyServerMode" by EdwardCapriolo
Date Thu, 02 Oct 2008 22:43:53 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by EdwardCapriolo:
http://wiki.apache.org/hadoop/HiveDerbyServerMode

New page:
== Hive using Derby in Server Mode ==

Hive in embedded mode is limited to one user at a time. If you run Derby as a Network Server
instead, multiple users can access it simultaneously from different systems.

=== Download Derby ===
It is suggested you download the version of Derby that ships with Hive. If you have already
run Hive in embedded mode, the first line of derby.log contains the version.
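The version can also be pulled out of the log from the command line. The following is a minimal sketch, not part of the original instructions: the function name `derby_log_version` is made up for illustration, and it assumes the version is written as a four-part dotted number (such as 10.4.1.3) within the first few lines of derby.log.

```shell
# Hypothetical helper: print the first four-part version number (e.g. 10.4.1.3)
# found near the top of a derby.log file.
# Assumption: the version is written as N.N.N.N within the first five lines.
derby_log_version() {
    log_file="$1"
    # Fail early if the log cannot be read.
    [ -r "$log_file" ] || return 1
    head -n 5 "$log_file" \
        | grep -o -E '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' \
        | head -n 1
}

# Usage (the path is an assumption; derby.log is written to the directory
# Hive was started from):
#   derby_log_version /opt/hadoop/hive/derby.log
```

If nothing is printed, open the log by hand and look for the version near the top.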

My structure looks like this:
{{{
/opt/hadoop/hadoop-0.17.2.1
/opt/hadoop/db-derby-10.4.1.3-bin
/opt/hadoop/hive
}}}

{{{
cd /opt/hadoop
<download>
tar -xzf db-derby-10.4.1.3-bin.tar.gz
mkdir db-derby-10.4.1.3-bin/data
}}}

=== Set environment ===
The name of the variable to set has changed over the years. DERBY_HOME is now the proper
name; I set both it and the legacy name, DERBY_INSTALL.

/etc/profile.d/derby.sh
{{{
DERBY_INSTALL=/opt/hadoop/db-derby-10.4.1.3-bin
DERBY_HOME=/opt/hadoop/db-derby-10.4.1.3-bin
export DERBY_INSTALL
export DERBY_HOME
}}}

Hive also likes to know where Hadoop is installed:

/etc/profile.d/hive.sh
{{{
HADOOP=/opt/hadoop/hadoop-0.17.2.1/bin/hadoop
export HADOOP
}}}

=== Starting Derby ===
You will likely want to start Derby when Hadoop starts up. Other than an LSB init script, an
interesting place for this might be alongside Hadoop scripts such as start-dfs. By default,
Derby creates databases in the directory it was started from, so change to the data directory first.
{{{
cd /opt/hadoop/db-derby-10.4.1.3-bin/data
nohup /opt/hadoop/db-derby-10.4.1.3-bin/bin/startNetworkServer -h 0.0.0.0 &
}}}
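Before launching, it can help to confirm that the launcher script is where you expect it, and afterwards that the server is answering. The sketch below is an illustration only: the function names are hypothetical, and it assumes the launcher scripts sit in the bin/ subdirectory of the unpacked Derby distribution. The second helper relies on the `ping` command of Derby's `NetworkServerControl` script.

```shell
# Hypothetical helpers around the Derby launcher scripts.
# Assumption: the first argument is the unpacked db-derby-*-bin directory and
# the launcher scripts live in its bin/ subdirectory.

# Return 0 if the launcher script is present and executable, non-zero otherwise.
derby_launcher_ok() {
    derby_home="$1"
    [ -x "$derby_home/bin/startNetworkServer" ]
}

# Ask a running Network Server whether it is accepting connections,
# using the "ping" command of Derby's NetworkServerControl script.
derby_ping() {
    derby_home="$1"
    host="$2"
    port="$3"
    "$derby_home/bin/NetworkServerControl" ping -h "$host" -p "$port"
}

# Usage (1527 is the port used in the connection URLs on this page):
#   derby_launcher_ok "$DERBY_HOME" && derby_ping "$DERBY_HOME" localhost 1527
```

Run the ping a few seconds after starting the server, since a ping sent immediately after the `nohup` line may arrive before Derby is listening.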

=== Configuring Hive to use Network Derby ===

/opt/hadoop/hive/conf/hive-site.xml
{{{
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
  <description>controls whether to connect to remote metastore server or open a new
metastore server in Hive Client JVM</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby://hadoop1:1527/metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.ClientDriver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
}}}

/opt/hadoop/hive/conf/jpox.properties
{{{
javax.jdo.PersistenceManagerFactoryClass=org.jpox.PersistenceManagerFactoryImpl
org.jpox.validateTables=false
org.jpox.validateColumns=false
org.jpox.validateConstraints=false
org.jpox.storeManagerType=rdbms
org.jpox.autoCreateSchema=true
org.jpox.autoStartMechanismMode=checked
org.jpox.transactionIsolation=read_committed
javax.jdo.option.DetachAllOnCommit=true
javax.jdo.option.NontransactionalRead=true
javax.jdo.option.ConnectionDriverName=org.apache.derby.jdbc.ClientDriver
javax.jdo.option.ConnectionURL=jdbc:derby://hadoop1:1527/metastore_db;create=true
javax.jdo.option.ConnectionUserName=APP
javax.jdo.option.ConnectionPassword=mine
}}}
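The connection URL appears in both files, and its host and port must match where the Derby Network Server is listening. As a rough sketch (both function names are hypothetical, and `sed -E` is assumed to be available), the URL can be read back from jpox.properties and split into its parts for a quick sanity check:

```shell
# Hypothetical helper: print "host port database" for a Derby client JDBC URL
# of the form jdbc:derby://host:port/database;attributes
# Prints nothing if the URL does not have that shape.
derby_url_parts() {
    printf '%s\n' "$1" \
        | sed -n -E 's#^jdbc:derby://([^:/]+):([0-9]+)/([^;]+).*$#\1 \2 \3#p'
}

# Hypothetical helper: read javax.jdo.option.ConnectionURL from a
# properties file (first occurrence only).
connection_url_from_properties() {
    sed -n 's/^javax\.jdo\.option\.ConnectionURL=//p' "$1" | head -n 1
}

# Usage:
#   url=$(connection_url_from_properties /opt/hadoop/hive/conf/jpox.properties)
#   derby_url_parts "$url"
```

With the values on this page, the host should be hadoop1, the port 1527, and the database metastore_db.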

=== Copy Derby Jar Files ===

Because Hive now connects through the Derby network client, you MUST make sure these jars are
in Hive's lib directory or otherwise on the classpath. The same would be true if you used
MySQL or some other database.

{{{
cp /opt/hadoop/db-derby-10.4.1.3-bin/lib/derbyclient.jar /opt/hadoop/hive/lib
cp /opt/hadoop/db-derby-10.4.1.3-bin/lib/derbytools.jar /opt/hadoop/hive/lib
}}}
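A missing client jar typically shows up only later, as a class-loading error when Hive first touches the metastore, so it may be worth checking the copy. The helper below is a small sketch with a made-up name, `missing_derby_jars`. It only checks that the two files named above exist in the given directory.

```shell
# Hypothetical helper: print the name of each Derby client jar that is missing
# from the given lib directory; prints nothing when both are present.
missing_derby_jars() {
    lib_dir="$1"
    for jar in derbyclient.jar derbytools.jar; do
        [ -f "$lib_dir/$jar" ] || printf '%s\n' "$jar"
    done
}

# Usage:
#   missing_derby_jars /opt/hadoop/hive/lib
```

Empty output means both jars are in place.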

=== Start Up Hive ===

The metastore will not be created until the first query hits it.
{{{
cd /opt/hadoop/hive
bin/hive
hive> show tables;
}}}

A directory should be created: /opt/hadoop/db-derby-10.4.1.3-bin/data/metastore_db

=== The Result ===
Now you can run multiple Hive instances working on the same data simultaneously and remotely.
