hadoop-user mailing list archives

From "Liu, Raymond" <raymond....@intel.com>
Subject RE: why my test result on dfs short circuit read is slower?
Date Sat, 16 Feb 2013 05:53:05 GMT
Hi Harsh

Yes, I did set both of these, though in hdfs-site.xml rather than in hbase-site.xml.

I have also double-checked that local reads are actually being performed: there are no errors
in the datanode logs, and I watched the network IO on the lo interface.
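
For reference, the two settings look roughly like this in hdfs-site.xml. This is only a sketch of the relevant fragment: the property names are the ones discussed in this thread, but the user name "mapred" is a placeholder and should be whichever user actually runs the reading process (the MR job runner here, or the HBase user for RegionServers).

```xml
<!-- Fragment of hdfs-site.xml; goes inside the existing <configuration> element. -->
<!-- "mapred" is a placeholder user name, not the value used on this cluster. -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>mapred</value>
</property>
```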

> 
> If you want HBase to leverage the shortcircuit, the DN config
> "dfs.block.local-path-access.user" should be set to the user running HBase (i.e.
> hbase, for example), and the hbase-site.xml should have
> "dfs.client.read.shortcircuit" defined in all its RegionServers. Doing this wrong
> could result in performance penalty and some warn-logging, as local reads will
> be attempted but will begin to fail.
> 
> On Sat, Feb 16, 2013 at 8:40 AM, Liu, Raymond <raymond.liu@intel.com>
> wrote:
> > Hi
> >
> >         I tried to use short circuit read to improve my hbase cluster MR scan performance.
> >
> >         I have the following setting in hdfs-site.xml
> >
> >         dfs.client.read.shortcircuit set to true
> >         dfs.block.local-path-access.user set to MR job runner.
> >
> >         The cluster is 1+4 node and each data node have 16cpu/4HDD, with all hbase table major compact thus all data is local.
> >         I have hoped that the short circuit read will improve the performance.
> >
> >         While the test result is that with short circuit read enabled, the performance actually dropped 10-15%. Say scan a 50G table cost around 100s instead of 90s.
> >
> >         My hadoop version is 1.1.1, any idea on this? Thx!
> >
> > Best Regards,
> > Raymond Liu
> >
> >
> 
> 
> 
> --
> Harsh J
