hbase-user mailing list archives

From "adelin.ghanayem" <fireball...@gmail.com>
Subject HBase slow data load
Date Thu, 26 Jun 2014 08:08:29 GMT
I have a problem loading a large amount of data from a MySQL database into
a small HBase cluster. The cluster configuration is as follows:

Machine(1): HDFS / primary HDFS node / YARN Resource Manager / YARN Node
Manager / MapReduce / History Server / ZooKeeper / Region Server

Machine(2): YARN Node Manager / secondary HDFS node

Machine(3): YARN Node Manager / ZooKeeper / Region Server

Machine(5): HBase Master / ZooKeeper / Region Server

Each machine has 62 GB of RAM and an Intel(R) Xeon(R) CPU E5-2670 0 @
2.60GHz.

The data is loaded as follows: a Java program connects to the MySQL
database through the JDBC driver, reads the records, maps each record to an
HBase row, and inserts the rows into HBase. Each record corresponds to a
single Java class with about 10 primitive-type fields.
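
To make the load path concrete, here is a minimal sketch of the read, map, insert loop with the writes grouped into batches. It is an illustration under assumptions, not the actual loader code: `SampleRecord` (with only three of the roughly 10 fields), `RowSink`, `BatchingLoader`, and the batch size of 1000 are all hypothetical names and values, and `RowSink` is a plain-Java stand-in for the HBase table client rather than a real HBase API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical record type; the real class has about 10 primitive fields. */
class SampleRecord {
    final long id;
    final int age;
    final double balance;

    SampleRecord(long id, int age, double balance) {
        this.id = id;
        this.age = age;
        this.balance = balance;
    }
}

/** Hypothetical stand-in for the HBase table client (not a real HBase API). */
interface RowSink {
    void writeBatch(List<Map<String, String>> rows);
}

/** Maps records to rows and hands them to the sink in batches. */
class BatchingLoader {
    private final RowSink sink;
    private final int batchSize;
    private final List<Map<String, String>> buffer = new ArrayList<>();
    private int flushes = 0;

    BatchingLoader(RowSink sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    /** Maps one record to a row: a row key plus one column per field. */
    static Map<String, String> toRow(SampleRecord r) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("rowkey", Long.toString(r.id));
        row.put("age", Integer.toString(r.age));
        row.put("balance", Double.toString(r.balance));
        return row;
    }

    /** Buffers the mapped row and flushes once the batch is full. */
    void add(SampleRecord r) {
        buffer.add(toRow(r));
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends any buffered rows to the sink in a single call. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.writeBatch(new ArrayList<>(buffer));
        buffer.clear();
        flushes++;
    }

    int flushCount() {
        return flushes;
    }
}
```

In a real loader, the sink would wrap the HBase client and the records would come from a JDBC `ResultSet`. Whether the current loader writes row by row or in batches is not stated here, so the sketch only shows what a batched variant of the loop would look like.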

THE PROBLEM: loading the data takes too long. Where could the problem be?
For example, about 10 million records take about 6 hours to load from MySQL
into HBase. Is this normal? Can it be improved? What are the possible
reasons that loading data from MySQL into HBase through the Java JDBC
driver is this slow?
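
To put the numbers above in perspective, the short calculation below converts them into a throughput figure. Both inputs are the approximate values quoted in this message (about 10 million records, about 6 hours), so the result is a rough estimate; the class and method names are only illustrative.

```java
import java.util.Locale;

/** Converts the approximate load figures into a per-second rate. */
class LoadRate {
    /** Average number of records loaded per second. */
    static double recordsPerSecond(long records, double hours) {
        return records / (hours * 3600.0);
    }

    /** Average wall-clock time spent per record, in milliseconds. */
    static double millisPerRecord(long records, double hours) {
        return (hours * 3600.0 * 1000.0) / records;
    }

    public static void main(String[] args) {
        long records = 10_000_000L; // "about 10 million records"
        double hours = 6.0;         // "about 6 hours"
        // Prints: 463 records/s, 2.16 ms per record
        System.out.printf(Locale.ROOT, "%.0f records/s, %.2f ms per record%n",
                recordsPerSecond(records, hours),
                millisPerRecord(records, hours));
    }
}
```

So the load averages roughly 463 records per second, or about 2.16 ms per record end to end (MySQL read, mapping, and HBase insert combined).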

View this message in context: http://apache-hbase.679495.n3.nabble.com/HBase-slow-data-load-tp4060750.html