hbase-user mailing list archives

From Asaf Mesika <asaf.mes...@gmail.com>
Subject Re: HBase Thrift inserts bottlenecked somewhere -- but where?
Date Sat, 02 Mar 2013 19:56:42 GMT
Make sure you are not sending many Puts for the same rowkey. This can
cause contention on the region server side. We fixed that in our project by
aggregating all the columns for the same rowkey into a single Put object,
so when sending a List of Puts we made sure each Put has a unique rowkey.
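
The grouping step could look roughly like the sketch below. It is plain
Python with no HBase or Thrift dependency; the function name
merge_by_rowkey and the (rowkey, column, value) tuple layout are
hypothetical, chosen only for illustration. Each entry of the result would
then be turned into one Put (or whatever per-row mutation structure your
client uses) before sending the batch.

```python
from collections import OrderedDict


def merge_by_rowkey(cells):
    """Group (rowkey, column, value) cells so each rowkey appears once.

    Hypothetical helper: the input layout is an assumption, not an HBase API.
    """
    merged = OrderedDict()
    for rowkey, column, value in cells:
        # If the same (rowkey, column) shows up twice, the later value wins.
        # Note this differs from sending two separate Puts, which could
        # leave two versions of the cell on the server.
        merged.setdefault(rowkey, OrderedDict())[column] = value
    # One (rowkey, [(column, value), ...]) entry per distinct rowkey,
    # in first-seen order.
    return [(rowkey, list(cols.items())) for rowkey, cols in merged.items()]


cells = [
    ("row1", "cf:a", "1"),
    ("row2", "cf:a", "2"),
    ("row1", "cf:b", "3"),
]
batch = merge_by_rowkey(cells)
```

With the sample input, row1 ends up in a single entry carrying both cf:a
and cf:b, so the batch contains two entries instead of three.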

On Saturday, March 2, 2013, Dan Crosta wrote:

> On Mar 2, 2013, at 12:38 PM, lars hofhansl wrote:
> > "That's only true from the HDFS perspective, right? Any given region is
> > "owned" by 1 of the 6 regionservers at any given time, and writes are
> > buffered to memory before being persisted to HDFS, right?"
> >
> > Only if you disabled the WAL, otherwise each change is written to the
> > WAL first, and then committed to the memstore.
> > So in that sense it's even worse. Each edit is written twice to the FS,
> > replicated 3 times, and all that on only 6 data nodes.
> Are these writes synchronized somehow? Could there be a locking problem
> somewhere that wouldn't show up as utilization of disk or cpu?
> What is the upshot of disabling WAL -- I assume it means that if a
> RegionServer crashes, you lose any writes that it has in memory but not
> committed to HFiles?
> > 20k writes does seem a bit low.
> I adjusted dfs.datanode.handler.count from 3 to 10 and now we're up to
> about 22-23k writes per second, but still no apparent contention for any of
> the basic system resources.
> Any other suggestions on things to try?
> Thanks,
> - Dan
