hbase-dev mailing list archives

From Lars <lhofha...@yahoo.com>
Subject Re: Early comparisons between 0.90 and 0.92
Date Thu, 15 Dec 2011 19:35:28 GMT
Do you see the same slowdown with the default 64k block size?

Lars <lhofhansl@yahoo.com> wrote:

>I'll be busy today... I'll double check my scanning-related changes as soon as I can.
>
>Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
>>Yes and yes.
>>
>>J-D
>>On Dec 14, 2011 5:52 PM, "Matt Corgan" <mcorgan@hotpads.com> wrote:
>>
>>> Regions are major compacted and have empty memstores, so no merging of
>>> stores when reading?
>>>
>>>
>>> 2011/12/14 Jean-Daniel Cryans <jdcryans@apache.org>
>>>
>>> > Yes sorry 1.1M
>>> >
>>> > This is PE, the table is set to a block size of 4KB and block caching
>>> > is disabled. Nothing else special in there.
>>> >
>>> > J-D
>>> >
>>> > 2011/12/14  <yuzhihong@gmail.com>:
>>> > > Thanks for the info, J-D.
>>> > >
>>> > > I guess the 1.1 below is in millions.
>>> > >
>>> > > Can you tell us more about your tables - bloom filters, etc ?
>>> > >
>>> > >
>>> > >
>>> > > On Dec 14, 2011, at 5:26 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>> > >
>>> > >> Hey guys,
>>> > >>
>>> > >> I was doing some comparisons between 0.90.5 and 0.92.0, mainly
>>> > >> regarding reads. The numbers are kinda irrelevant but the differences
>>> > >> are. BTW this is on CDH3u3 with random reads.
>>> > >>
>>> > >> In 0.90.0, scanning 50M rows that are in the OS cache I go up to about
>>> > >> 1.7M rows scanned per second.
>>> > >>
>>> > >> In 0.92.0, scanning those same rows (meaning that I didn't run
>>> > >> compactions after migrating so it's picking the same data from the OS
>>> > >> cache), I scan about 1.1 rows per second.
>>> > >>
>>> > >> 0.92 is 50% slower when scanning.
>>> > >>
>>> > >> In 0.90.0 random reading 50M rows that are OS cached I can do about
>>> > >> 200k reads per second.
>>> > >>
>>> > >> In 0.92.0, again with those same rows, I can go up to 260k per second.
>>> > >>
>>> > >> 0.92 is 30% faster when random reading.
>>> > >>
>>> > >> I've been playing with that data set for a while and the numbers in
>>> > >> 0.92.0 when using HFileV1 or V2 are pretty much the same meaning that
>>> > >> something else changed or the code that's generic to both did.
>>> > >>
>>> > >>
>>> > >> I'd like to be able to associate those differences to code changes in
>>> > >> order to understand what's going on. I would really appreciate if
>>> > >> others also took some time to test it out or to think about what could
>>> > >> cause this.
>>> > >>
>>> > >> Thx,
>>> > >>
>>> > >> J-D
>>> >
>>>
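
As a quick sanity check on the percentages quoted in the thread, here is a minimal Python sketch that recomputes the relative differences from the reported throughput figures. It assumes the 0.92 scan rate is 1.1M rows per second, as J-D confirms in his follow-up; the function and variable names are illustrative only and do not come from HBase or PerformanceEvaluation.

```python
def relative_change(old, new):
    """Fractional change of `new` relative to `old` (negative means slower)."""
    return (new - old) / old


# Throughput figures as quoted in the thread (rows or reads per second).
scan_090, scan_092 = 1.7e6, 1.1e6   # sequential scan; 1.1M per the follow-up
read_090, read_092 = 200e3, 260e3   # random read

scan_change = relative_change(scan_090, scan_092)
read_change = relative_change(read_090, read_092)
scan_ratio = scan_090 / scan_092    # how many times faster 0.90 scans

print(f"scan:        {scan_change:+.0%} (0.90 is {scan_ratio:.2f}x the 0.92 rate)")
print(f"random read: {read_change:+.0%}")
```

With these inputs the random-read gain works out to the 30% stated in the thread. The scan figure depends on the baseline: measured against 0.90, the 0.92 rate is roughly 35% lower, while 0.90 is roughly 55% faster than 0.92, which is probably the sense behind the "50% slower" remark.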