hbase-dev mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Performances Tests
Date Thu, 21 Mar 2013 13:12:47 GMT
You're welcome, Enis.

Here is the updated file with trunk results added:
http://www.spaggiari.org/media/blogs/hbase/pictures/performances_20130321.pdf

I have also added the Y-axis labels as requested. Everything is now in
rows/second, or rows/minute when a test is too slow.

The todo is now:
- Re-download all HBase versions to make sure PE is using the right one (WIP);
- Re-run the scanRange100 tests to validate the values already found (Next);
- Add 0.95;
- Add LoadTestTool;
- Add HFilePerformanceEvaluation.


2013/3/20 Enis Söztutar <enis.soz@gmail.com>:
> Thanks so much for doing this J-M.
>
> Enis
>
>
> On Wed, Mar 20, 2013 at 11:44 AM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
>> Hi Ted,
>>
>> I will try to build the trunk version and add it to the list.
>>
>> So I have to:
>> - Re-download all HBase versions to make sure PE is using the right one;
>> - Re-run the scanRange100 tests to validate the values already found;
>> - Add the trunk;
>> - Add LoadTestTool.
>>
>> That will keep my free time busy ;)
>>
>> I will keep you all posted as soon as it's done.
>>
>> JM
>>
>> 2013/3/20 Ted Yu <yuzhihong@gmail.com>:
>> > I am curious to know how trunk stands in the performance comparison.
>> > There have been many optimizations going into trunk. Getting a sense of
>> > the overall improvement would be nice.
>> >
>> > Cheers
>> >
>> > On Wed, Mar 20, 2013 at 5:02 AM, Jean-Marc Spaggiari <
>> > jean-marc@spaggiari.org> wrote:
>> >
>> >> Hi Lars,
>> >>
>> >> Can you share the code you are using so I can compare with PE? Also, I
>> >> will re-run all my scanRange100 tests today and update the
>> >> spreadsheet again to make sure it's correct. I will also re-download all
>> >> the HBase versions to make sure they are all clean. I'm not doing any
>> >> configuration on them, simply reducing the logging and pointing tmp to
>> >> an in-memory file system.
>> >>
>> >> I will keep you posted when it's done.
>> >>
>> >> Hi Jonathan,
>> >>
>> >> It's usually rows per second, but with a factor of 10. Sometimes I had to
>> >> divide by 100000, sometimes to multiply to get bigger numbers... I will
>> >> take a look at the formulas and add a legend to each of the charts.
>> >>
>> >> JM
>> >>
>> >> 2013/3/19 Jonathan Hsieh <jon@cloudera.com>:
>> >> > What is the y-axis unit? Seconds, operations per second, etc.? (nit:
>> >> > it would be nice to have it on the axis.)
>> >> >
>> >> > Based on the context, I believe it is ops/s.
>> >> >
>> >> > Jon.
>> >> >
>> >> > On Sat, Mar 16, 2013 at 7:03 PM, Jean-Marc Spaggiari <
>> >> > jean-marc@spaggiari.org> wrote:
>> >> >
>> >> >> Hi Enis,
>> >> >>
>> >> >> "interesting" in the positive way ;)
>> >> >>
>> >> >> Results are there:
>> >> >>
>> >> >> http://www.spaggiari.org/media/blogs/hbase/pictures/performances-1.pdf?mtime=1363484477
>> >> >>
>> >> >> The improvement on scans is impressive. sequentialRead and randomScan
>> >> >> went down.
>> >> >>
>> >> >> I ran the 0.94.6 tests with RC2. If we have an RC3 I will rerun them.
>> >> >>
>> >> >> I will add HFilePerformanceEvaluation soon, but I'm facing some issues
>> >> >> with it on previous HBase versions...
>> >> >>
>> >> >> JM
>> >> >>
>> >> >> 2013/3/12 Enis Söztutar <enis.soz@gmail.com>:
>> >> >> >> I just finished running all the PerformanceEvaluation tests on a
>> >> >> >> dedicated computer with all 0.9x.x HBase versions, and I found the
>> >> >> >> results interesting.
>> >> >> > Can you please provide your numbers? What is interesting in your
>> >> >> > findings?
>> >> >> >
>> >> >> > Enis
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Tue, Mar 12, 2013 at 5:41 PM, Jean-Marc Spaggiari <
>> >> >> > jean-marc@spaggiari.org> wrote:
>> >> >> >
>> >> >> >> If you run only 1 client with PerformanceEvaluation, it does not run
>> >> >> >> over MapReduce, so you don't have this overhead. But you can still
>> >> >> >> run it over MapReduce if you want something more distributed. It
>> >> >> >> might be useful to have both options. In the end, LoadTestTool or
>> >> >> >> PerformanceEvaluation, either one is good as long as we are adding
>> >> >> >> those tests.
>> >> >> >>
>> >> >> >> I just finished running all the PerformanceEvaluation tests on a
>> >> >> >> dedicated computer with all 0.9x.x HBase versions, and I found the
>> >> >> >> results interesting. That gives us a good baseline to see if new
>> >> >> >> HBase improvements are really improving performance.
>> >> >> >>
>> >> >> >> JM
>> >> >> >>
>> >> >> >> 2013/3/8 Andrew Purtell <apurtell@apache.org>:
>> >> >> >> > Tangentially: I think I prefer LoadTestTool over
>> >> >> >> > PerformanceEvaluation; it neither depends on nor is influenced by
>> >> >> >> > MapReduce job startup.
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Fri, Mar 8, 2013 at 10:05 PM, ramkrishna vasudevan <
>> >> >> >> > ramkrishna.s.vasudevan@gmail.com> wrote:
>> >> >> >> >
>> >> >> >> >> @JM
>> >> >> >> >> I agree with you. Perf improvement changes in particular need
>> >> >> >> >> some test cases.
>> >> >> >> >> But sometimes the scenarios in which the perf improvements happen
>> >> >> >> >> are a bit difficult to generate, and we will be able to do that in
>> >> >> >> >> a standalone case only.
>> >> >> >> >> Overall, to measure the result of such a perf improvement we may
>> >> >> >> >> need a real cluster with suitable data. That is what I have
>> >> >> >> >> experienced. Just saying.
>> >> >> >> >>
>> >> >> >> >> Regards
>> >> >> >> >> Ram
>> >> >> >> >>
>> >> >> >> >> On Fri, Mar 8, 2013 at 7:28 PM, Jean-Marc Spaggiari <
>> >> >> >> >> jean-marc@spaggiari.org> wrote:
>> >> >> >> >>
>> >> >> >> >> > Hi,
>> >> >> >> >> >
>> >> >> >> >> > In HBase we already have PerformanceEvaluation, which gives us a
>> >> >> >> >> > good way to validate that nothing broke HBase speed in recent
>> >> >> >> >> > updates.
>> >> >> >> >> >
>> >> >> >> >> > I can see many improvements coming in the JIRAs, such as lazy
>> >> >> >> >> > seeks, bloom filters, etc. However, there are no tests for those
>> >> >> >> >> > improvements.
>> >> >> >> >> >
>> >> >> >> >> > Would it not be good to ask people to add new tests to
>> >> >> >> >> > PerformanceEvaluation when they introduce an improvement that is
>> >> >> >> >> > not covered there?
>> >> >> >> >> >
>> >> >> >> >> > We should not touch existing tests, because we need a way to
>> >> >> >> >> > compare the baseline between the different versions, but we can
>> >> >> >> >> > still add new ones. For example, in addition to RandomSeekScanTest
>> >> >> >> >> > we could add RandomSeekScanBloomEnabledTest, and so on. It would
>> >> >> >> >> > be even better if we could backport those new tests to previous
>> >> >> >> >> > versions.
>> >> >> >> >> >
>> >> >> >> >> > The same way we add a test class when we introduce a new feature,
>> >> >> >> >> > should we add a performance test method to test it too?
>> >> >> >> >> >
>> >> >> >> >> > JM
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > --
>> >> >> >> > Best regards,
>> >> >> >> >
>> >> >> >> >    - Andy
>> >> >> >> >
>> >> >> >> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> >> >> >> > (via Tom White)
>> >> >> >>
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > // Jonathan Hsieh (shay)
>> >> > // Software Engineer, Cloudera
>> >> > // jon@cloudera.com
>> >>
>>
