From Himanshu Vashishtha <hvash...@cs.ualberta.ca>
Subject Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)
Date Fri, 21 Jan 2011 22:43:17 GMT
Hello Mingjie,
This comes at a very apt time for me. I will be evaluating HBase on EC2
using YCSB, and will run MapReduce jobs over there. For instance, I
will evaluate some simple aggregation ones (1512) three ways: with MapReduce
jobs, with coprocessors, and with pure HBase APIs (Scan + client-side processing).

I have things running locally, and will move to EC2 pretty soon (by today).
Right now I have zero experience with setting up HBase on EC2, so I may be
bugging you guys in case I get stuck. :)


On Fri, Jan 21, 2011 at 1:40 PM, Mingjie Lai <mingjie_lai@trendmicro.com> wrote:

> Guys.
> There is a discussion regarding testing HBase with YCSB on Whirr or EC2.
> Sending it to @dev so more people can be involved.
> Lars.
> I have an automated YCSB test for HBase running on EC2. It was derived from
> Andy and Eugene's HBase EC2 script. What I added includes:
> - YCSB test support
> - building and uploading a new HBase jar, triggered by SCM (git) changes
> - emailing YCSB test results to configured recipients
> - running automatically as a daily cron job
> You can take a look at: https://github.com/mlai/hbase-ec2/tree/ycsb for
> more detail.
> We do want to move the script to support Whirr, but right now we lack the
> resources to do the job. Also, it seems there is a Whirr HBase bug reported,
> although I haven't checked the details. So there is no further
> progress toward Whirr support right now.
> >> Reporting back the results will be a bit more challenging as usually
> >> you spin down the cluster at end.
> I also struggled with what the best way would be to present the results
> of an automated test. I picked the simplest way, sending the results by
> email, so that I can avoid the problem of saving the data somewhere.
> But it could be extended to support Hudson. Right now it downloads the
> result files locally after the YCSB tests finish and parses them locally,
> and I use the extracted details as the email contents. I think Hudson can use
> the same files to present results.
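
The parse-then-email step described above could look roughly like the sketch below. It is only an illustration, not the code from the hbase-ec2 scripts: it assumes the result files contain YCSB's usual summary lines of the form `[SECTION], Metric, value`, and the function names and addresses are made up for the example. It builds the message but does not send it.

```python
from collections import defaultdict
from email.message import EmailMessage


def parse_ycsb_output(text):
    """Collect '[SECTION], Metric, value' summary lines into nested dicts."""
    results = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue  # not a YCSB result line (e.g. log output)
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue
        section, metric, raw = parts
        # Skip latency histogram buckets such as '[READ], 0, 480' or '>1000'.
        if metric.isdigit() or metric.startswith(">"):
            continue
        try:
            value = float(raw)
        except ValueError:
            continue
        results[section.strip("[]")][metric] = value
    return dict(results)


def format_report(results):
    """Render the parsed metrics as a plain-text report body."""
    lines = []
    for section in sorted(results):
        lines.append(section)
        for metric in sorted(results[section]):
            lines.append(f"  {metric}: {results[section][metric]:g}")
    return "\n".join(lines)


def build_email(results, sender, recipient, subject="YCSB test results"):
    """Build (but do not send) an email carrying the report."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(format_report(results))
    return msg
```

The same parsed dictionary could be written to a file for Hudson to pick up instead of being mailed; sending the message would be a separate step (for example via `smtplib`).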
> >> And we do
> >> not want to keep the cluster running unnecessarily for a built-in web
> >> interface to browse the results etc.
> Totally agree, we want to terminate the cluster as soon as the test
> finishes.
> Here is an example of a test result:
> http://pastebin.com/f08bRCkY
> What do you think, Lars?
> Thanks,
> Mingjie
> -------- Original Message --------
> Subject:        Re: Report to Apache board: first cut
> Date:   Fri, 21 Jan 2011 09:46:46 -0800
> From:   Stack <stack@duboce.net>
> +1 to Todd's suggestion (and change subject -- smile)
> St.Ack
> On Fri, Jan 21, 2011 at 8:19 AM, Todd Lipcon<todd@cloudera.com>  wrote:
>>  Should we move this discussion to the dev list at large?
>>  Our QA team is also starting to look at at least smoke testing HBase on a
>>  cluster. We should coordinate efforts!
>>  On Fri, Jan 21, 2011 at 12:56 AM, Lars George<lars.george@gmail.com>
>>  wrote:
>>>  Hi Andy,
>>>  I assumed as much from our previous conversations. I sent Eugene the
>>>  details on Whirr and using HBase with it. Unfortunately, JClouds
>>>  currently cannot ship the scripts from the local directory, but
>>>  that is coming soon. In the meantime we need to use a "public" S3
>>>  based repo that has a copy. He had that set up last time we got HBase
>>>  running together using Whirr. I think he is pretty much set, we simply
>>>  need to add a specific "test" role that allows us to start the cluster
>>>  and when "test" is part of the template we can not only start the
>>>  cluster but invoke whatever test we need. In effect we could have
>>>  "test-ycsb-basic", "test-ycsb-workload-5050", "test-mvn-test" (for the
>>>  built-in tests) and so on to start this. That has the advantage of
>>>  being able to use various templates to test different cluster setups
>>>  against equally different test scenarios.
>>>  Reporting back the results will be a bit more challenging as usually
>>>  you spin down the cluster at end. We could grab whatever the test
>>>  results are and upload them back to an S3 repo or so? I am not sure if
>>>  there is a common interface for that which would make sense given
>>>  YCSB! and the Surefire reports are different end results. And we do
>>>  not want to keep the cluster running unnecessarily for a built-in web
>>>  interface to browse the results etc. Nice would be some Hudson
>>>  integration which would spin up clusters and then retain the test
>>>  results? Sorry for not having a clear idea here, though I assume you
>>>  already have a much better plan, so just throwing it out there.
>>>  If this makes sense I could also add those tests into the Whirr HBase
>>>  service itself so that it gets shipped with Whirr for everyone to
>>>  execute. That way the test scripts would evolve with the project.
>>>  Eugene and Mingjie, what is your take on this? Looking forward to
>>>  hearing from you.
>>>  Regards,
>>>  Lars
>>>  On Fri, Jan 21, 2011 at 1:35 AM, Andrew Purtell<apurtell@apache.org>
>>>  wrote:
>>>  >  I've talked with our guys about doing exactly this Lars.
>>>  >
>>>  >  Best regards,
>>>  >
>>>  >      - Andy
>>>  >
>>>  >  Problems worthy of attack prove their worth by hitting back.
>>>  >    - Piet Hein (via Tom White)
>>>  >
>>>  >
>>>  >  --- On Tue, 1/18/11, Lars George<lars.george@gmail.com>  wrote:
>>>  >
>>>  >>  From: Lars George<lars.george@gmail.com>
>>>  >>  Subject: Re: Report to Apache board: first cut
>>>  >>  To: private@hbase.apache.org
>>>  >>  Date: Tuesday, January 18, 2011, 12:23 PM
>>>  >>  I would love to chime in and help but
>>>  >>  am in Israel on a customer stint
>>>  >>  working 12 hour days.
>>>  >>
>>>  >>  My plan is to use Whirr and a custom init script to automate testing
>>>  >>  of HBase on a dynamic, on-demand cluster. I need good tests though
>>>  >>  besides the junit ones. I would love to run something more useful,
>>>  >>  could be YCSB! or some such. Could you send me what you are usually
>>>  >>  using so I could put this all together so that others can do
>>>  >>  burn-ins as well?
>>>  >>
>>>  >>  Thanks,
>>>  >>  Lars
>>>  >
>>>  >
>>>  >
>>>  >
>>>  >
>>  --
>>  Todd Lipcon
>>  Software Engineer, Cloudera
