hbase-user mailing list archives

From Bryan Duxbury <br...@rapleaf.com>
Subject Re: Slow mapreduce using Hbase , regardless on number of machines
Date Wed, 09 Jul 2008 16:13:27 GMT
How many regions are there in your table? If your 200k rows fit
inside a single region, adding more region servers isn't going to
make anything faster because only one server will be participating.
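
To make the point concrete, here is a rough toy model in Python. It is not HBase code. It assumes a scan gets at most one busy server per region and that rows are spread evenly across regions. The function name, the throughput figure, and the region counts are all made up for illustration.

```python
# Toy model (not HBase code): assume at most one busy server per region,
# so servers beyond the region count sit idle during a scan.
def estimated_scan_seconds(rows, regions, servers, rows_per_second_per_server):
    """Estimate scan time, assuming rows are spread evenly across regions."""
    busy_servers = min(regions, servers)
    return rows / (busy_servers * rows_per_second_per_server)

# Hypothetical throughput, chosen so one server needs about 60 s for 200,000 rows.
RATE = 200_000 / 60

for regions, servers in [(1, 1), (1, 5), (5, 5)]:
    secs = estimated_scan_seconds(200_000, regions, servers, RATE)
    print(f"regions={regions} servers={servers} -> ~{secs:.0f} s")
```

With these made-up numbers, one region gives roughly the same time on one server or five, which is the pattern in the timings quoted below. The estimate only drops once the table has more than one region.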

-Bryan

On Jul 9, 2008, at 7:36 AM, yair even-zohar wrote:

> I am testing HBase 0.1.2 and am getting the following performance  
> using RowCounter class (I had to modify the main() method of the  
> original class because it contains some hardcoded  parameters :-)
>
> Single regionserver  - counting 200,000 lines in 60 or 61 seconds
> 5 regionservers - counting 200,000 lines in 55 or 58 seconds
>
> Clearly, one expects better performance, so I assume I'm doing  
> something wrong. By the way, I'm getting about the same performance  
> when I'm iterating through a scanner without the mapreduce.
>
> Here is my hadoop-site.xml
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://sb-centercluster01:9100</value>
>   </property>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>hdfs://sb-centercluster01:9101</value>
>   </property>
>   <property>
>     <name>mapred.map.tasks</name>
>     <value>13</value>
>   </property>
>   <property>
>     <name>mapred.reduce.tasks</name>
>     <value>5</value>
>   </property>
>   <property>
>     <name>dfs.replication</name>
>     <value>3</value>
>   </property>
>   <property>
>     <name>dfs.name.dir</name>
>     <value>/home/hadoop/dfs16,/tmp/hadoop/dfs16</value>
>   </property>
>   <property>
>     <name>dfs.data.dir</name>
>     <value>/state/partition1/hadoop/dfs16</value>
>   </property>
> </configuration>
>
> Increasing "io.bytes.per.checksum" and "io.file.buffer.size" didn't
> help. Neither did decreasing "dfs.replication".
>
> Here is my hbase-site.xml
>
> <configuration>
>   <property>
>     <name>hbase.master</name>
>     <value>sb-centercluster01:60002</value>
>     <description>The host and port that the HBase master runs at.
>     </description>
>   </property>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>hdfs://sb-centercluster01:9100/hbase</value>
>     <description>The directory shared by region servers.
>     </description>
>   </property>
>   <property>
>     <name>hbase.io.index.interval</name>
>     <value>8</value>
>   </property>
> </configuration>
>
>
> Any help will be appreciated.
>
> Thanks
> -Yair
>
>
>

