hbase-dev mailing list archives

From Lars George <lars.geo...@gmail.com>
Subject Re: Coprocessor experiments
Date Fri, 27 May 2011 05:58:34 GMT
Awesome Himanshu,

I was also trying to test CPs to find the sweet spot between the number
of threads processing in parallel and overloading the servers, since you
are potentially sending a heavy, resource-bound task to already taxed
servers and could therefore take a huge hit everywhere. I was thinking
of running YCSB in parallel with a mostly-read workload and then
comparing the impact of a 1) linear, 2) MR-based, and 3) CP-based
full table scan.


On Fri, May 27, 2011 at 3:40 AM, Himanshu Vashishtha
<hvashish@cs.ualberta.ca> wrote:
> I did some experiments using coprocessors and compared the results with a
> vanilla scan and, in one case, with MapReduce. I wrote a blog post about
> these experiments, as it was getting a bit difficult to explain them by mail
> without figures etc. Please refer to
> http://hbase-coprocessor-experiments.blogspot.com/2011/05/extending.html
> The results seem to suggest that coprocessor endpoints are a useful feature
> when one needs to access a large number of rows (I can't quantify it as
> of now) and generate sparse results. The main advantage is that the
> processing is done in parallel (at region-level granularity), and it can be
> extended to provide parallel scanner functionality.
> Interestingly, the single-result coprocessor endpoint (i.e. the existing
> one) failed when I increased the table data. I tried to do a row count on
> 100m rows. I need to dig into it more, but I have mentioned my initial
> thoughts in the blog.
> I want to test them more rigorously and would really appreciate your feedback
> on the experiments. I have been on it for a while now and therefore need a new
> pair of eyes to do some review.
> Thanks a lot for your time.
> Cheers,
> Himanshu
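
To make the comparison discussed above concrete, here is a minimal sketch of a row count done two ways: a linear client-side scan, and an endpoint-style count where each region computes a partial count in parallel and the client only sums the partials. It is plain Python and does not use the HBase API. The in-memory `regions` list, the function names, and the `max_workers` knob (standing in for the number of parallel threads) are all hypothetical, chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a table: each inner list is one "region" of row keys.
regions = [
    ["row-%04d" % i for i in range(0, 400)],
    ["row-%04d" % i for i in range(400, 1000)],
    ["row-%04d" % i for i in range(1000, 1500)],
]


def count_region(rows):
    """Plays the role of the per-region work: scan one region, return only its count."""
    return sum(1 for _ in rows)


def linear_count(regions):
    """Vanilla scan: the client walks every row of every region in turn."""
    total = 0
    for rows in regions:
        for _ in rows:
            total += 1
    return total


def parallel_count(regions, max_workers=4):
    """Endpoint style: one task per region, the client sums the partial counts."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(count_region, regions))


if __name__ == "__main__":
    print(linear_count(regions), parallel_count(regions))
```

Only one small number per region travels back to the caller in the parallel version, which is the "sparse results" case mentioned in the quoted message. Varying `max_workers` is one simple way to explore the trade-off between parallelism and load, although a real cluster would behave very differently from this in-process sketch.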
