hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: why my test result on dfs short circuit read is slower?
Date Sat, 16 Feb 2013 05:44:28 GMT
If you want HBase to use short-circuit reads, the DataNode config
"dfs.block.local-path-access.user" should be set to the user running
HBase (for example, hbase), and hbase-site.xml should have
"dfs.client.read.shortcircuit" set to true on all its RegionServers.
Getting this wrong can result in a performance penalty and some WARN
logging, as local reads will be attempted but will fail.
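
A minimal sketch of the two settings described above. It assumes the RegionServers run as the user "hbase"; that value is only an example, so substitute whichever user actually runs HBase on your cluster. The file each property goes into is noted in the comments.

```xml
<!-- hdfs-site.xml on every DataNode:
     the user allowed to read block files directly.
     "hbase" is an assumed example value. -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase</value>
</property>

<!-- hbase-site.xml on every RegionServer:
     turns on short-circuit reads in the HDFS client that HBase uses. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
```

Both properties go inside the existing <configuration> element of their respective files, and the DataNodes and RegionServers need a restart to pick up the change.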

On Sat, Feb 16, 2013 at 8:40 AM, Liu, Raymond <raymond.liu@intel.com> wrote:
> Hi
>         I tried to use short-circuit reads to improve the MR scan performance of my HBase cluster.
>         I have the following settings in hdfs-site.xml:
>         dfs.client.read.shortcircuit set to true
>         dfs.block.local-path-access.user set to the MR job runner.
>         The cluster is 1+4 nodes, and each data node has 16 CPUs/4 HDDs. All HBase tables are major compacted, so all data is local.
>         I had hoped that short-circuit reads would improve performance.
>         However, the test result is that with short-circuit reads enabled, performance actually dropped 10-15%: scanning a 50G table takes around 100s instead of 90s.
>         My Hadoop version is 1.1.1. Any ideas on this? Thx!
> Best Regards,
> Raymond Liu

Harsh J
