hbase-user mailing list archives

From peterramesh <ramesh.ramas...@gmail.com>
Subject Map Reduce performance
Date Tue, 23 Jun 2009 13:38:39 GMT

Hi,

I am playing with a sample program that uses MapReduce (MR). All I have is a
text file (685 MB), which I am using to create an HTable.

The test environment is:
1. single node cluster
2. 2 MB RAM 
3. Hadoop and HBase versions: both are 0.19.1

Here is the attached program:
http://www.nabble.com/file/p24166190/MRTest.java MRTest.java 

and the hadoop-site.xml
http://www.nabble.com/file/p24166190/hadoop-site.xml hadoop-site.xml 

and the fair scheduler allocation file
http://www.nabble.com/file/p24166190/mapred_fairseheduler_allocation_file.xml
mapred_fairseheduler_allocation_file.xml 
(I used the FairScheduler because the mapred.map.tasks setting was not being
applied on the cluster when I used the default JobQueueTaskScheduler, which
always runs 2 tasks at a time.)
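
For reference, the limit of 2 concurrent tasks matches the Hadoop 0.19 default
of mapred.tasktracker.map.tasks.maximum, which caps the map slots per
TaskTracker; mapred.map.tasks is only a hint for how many splits to create. A
minimal sketch of the hadoop-site.xml entries, assuming the slot limit is the
cause here (the values are illustrative, not taken from the attached file, and
the TaskTracker needs a restart to pick them up):

```xml
<!-- hadoop-site.xml on the TaskTracker node; values below are illustrative only -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
</property>
```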

Running the above program with the given configuration takes 13 min 46 sec
and 15 min 3 sec (2 samples) to create the table.

If I do the same thing without MR, it takes 18 min 4 sec. So MR gives me a
substantial gain. But I would like to know whether there are further
optimizations to improve the performance, and also whether I am doing this
the right way.
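
To put the timings above in perspective, here is a small sketch that converts
them to seconds and computes the speedup of each MR run over the non-MR run,
plus the rough load throughput for the 685 MB input. The class and method names
are made up for illustration; only the timings and the file size come from this
message.

```java
// Hypothetical helper (not part of MRTest.java): compares the MR and non-MR timings.
class SpeedupEstimate {

    // Convert a "minutes, seconds" wall-clock reading to total seconds.
    static int toSeconds(int minutes, int seconds) {
        return minutes * 60 + seconds;
    }

    // Speedup = baseline time divided by improved time (values > 1 mean faster).
    static double speedup(int baselineSeconds, int improvedSeconds) {
        return (double) baselineSeconds / improvedSeconds;
    }

    // Throughput in MB per second for an input of the given size.
    static double throughputMbPerSec(double inputMb, int elapsedSeconds) {
        return inputMb / elapsedSeconds;
    }

    public static void main(String[] args) {
        double inputMb = 685.0;            // size of the text file
        int noMr = toSeconds(18, 4);       // run without MR
        int[] mrRuns = { toSeconds(13, 46), toSeconds(15, 3) }; // the 2 MR samples

        System.out.printf("no MR: %d s, %.2f MB/s%n",
                noMr, throughputMbPerSec(inputMb, noMr));
        for (int run : mrRuns) {
            System.out.printf("MR: %d s, speedup %.2fx, %.2f MB/s%n",
                    run, speedup(noMr, run), throughputMbPerSec(inputMb, run));
        }
    }
}
```

With only two samples on a single node, the spread between the two MR runs is
of the same order as the gain itself, so more runs would be needed before
drawing firm conclusions.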

TIA,
Ramesh


-- 
View this message in context: http://www.nabble.com/Map-Reduce-performance-tp24166190p24166190.html
Sent from the HBase User mailing list archive at Nabble.com.

