hbase-dev mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Random I/O performance
Date Wed, 26 Oct 2011 23:30:58 GMT
On Wed, Oct 26, 2011 at 4:13 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>> Off-heap cache is experimental in 0.92 and TRUNK.
> As of now, TestSlabCache passes consistently in 0.92 and TRUNK.
>
> Li Pi's slides from Aug can be found here:
> https://docs.google.com/present/view?id=d23xkzr_55hgnvngf6
> Toward the end of it, you can find a performance chart.

Those were micro-benchmarks, though. I tried doing some tests on a
large cluster and wasn't able to get great performance out of it. But,
that was on an earlier version of the patch, and the later fixes could
have also fixed the perf issue. Hopefully some other folks can try it
out and provide feedback!

-Todd

>
> On Wed, Oct 26, 2011 at 3:49 PM, Stack <stack@duboce.net> wrote:
>
>> On Wed, Oct 26, 2011 at 2:50 PM, Vladimir Rodionov
>> <vrodionov@carrieriq.com> wrote:
>> >>> Are you hitting cache at all?
>> >>
>> >> It's totally random, due to the proposed key design, which favored fast
>> >> inserts. Keys are randomized values, which is why there is no data
>> >> locality in row lookups. The effect of the cache (LruBlockCache?) is
>> >> negligible in this case.
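
[Editorial sketch: why an LRU cache contributes little when keys are uniformly random. This is a toy simulation, not HBase's LruBlockCache; the function name, block counts, and read counts are all assumptions chosen for illustration. Under uniform access the hit rate tends toward the fraction of blocks that fit in the cache.]

```python
import random
from collections import OrderedDict


def lru_hit_rate(num_blocks, cache_blocks, num_reads, seed=0):
    """Simulate an LRU cache of block ids under uniformly random reads.

    All parameters are hypothetical; nothing here is measured on a cluster.
    """
    rng = random.Random(seed)
    cache = OrderedDict()  # insertion order doubles as recency order
    hits = 0
    for _ in range(num_reads):
        block = rng.randrange(num_blocks)
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / num_reads


# A cache holding 1% of the blocks: the hit rate should land near 1%.
rate = lru_hit_rate(num_blocks=100_000, cache_blocks=1_000, num_reads=200_000)
print(f"hit rate: {rate:.4f}")
```

A key design with some locality would skew the access distribution, and the same simulation with a non-uniform `rng` would show the hit rate climbing well above the cache-to-data ratio.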
>> >>
>> >
>> >>>So a different schema would get cache into the mix?
>> >
>> > You can't change the schema while the system is in production
>> >
>>
>> True but caveat Ted's note and FB fellas apparently did it three times
>> before they hit on the 'right' schema (Not sure whether they took the
>> portion being modified offline when changing schema)
>>
>> >
>> >>>Its going to keep growing without bound?
>> >
>> >
>> > No, we keep data for XX days, then purge stale data from the table.
>> >
>> >
>> > My question was: what else, besides the obvious (run everything in
>> > parallel), can help to improve random I/O?
>> >
>> > 1. Will Bloom filters help to optimize the HBase read path?
>>
>> Yes.  0.92 blooms will be less expensive than those in 0.90 (because
>> the blooms are tiered and live in the LRU in 0.92 so they are let go
>> if unused).
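
[Editorial sketch: how a Bloom filter can spare a random read. This is a generic Bloom filter, not HBase's implementation; the class, the bit and hash counts, and the file names are assumptions for illustration. A negative answer is always correct, so files whose filter says "no" can be skipped without touching disk; a positive answer may be a false positive and still requires the read.]

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; sizes are hypothetical, not tuned."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def files_to_read(row_key, store_files):
    """Return names of files whose filter cannot rule out row_key."""
    return [name for name, bloom in store_files if bloom.might_contain(row_key)]


# Two hypothetical store files, each with its own row-key filter.
bloom_a, bloom_b = BloomFilter(), BloomFilter()
bloom_a.add("row-1")
bloom_b.add("row-2")
store_files = [("file-a", bloom_a), ("file-b", bloom_b)]
candidates = files_to_read("row-1", store_files)
```

With randomized keys a lookup otherwise has to consult every store file in the region, so ruling files out cheaply is where the saving comes from.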
>>
>>
>> > 2. We use compression already.
>> > 3. Block size - does it really matter much?
>>
>> Not much in my experience.  Smaller blocks can help a little at the
>> cost of some bloat in index size (Again 0.92 is better here because
>> indices are partitioned and now also are in the LRU rather than pegged
>> in RAM as they are in 0.90).
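
[Editorial sketch: the block-size versus index-size trade-off as arithmetic. It assumes one index entry per data block and a hypothetical 1 GiB file; real index layouts and entry sizes differ, so treat the numbers as relative, not absolute.]

```python
import math

KIB = 1024
GIB = 1024 ** 3


def index_entries(file_bytes, block_bytes):
    """Number of index entries, assuming one entry per data block."""
    return math.ceil(file_bytes / block_bytes)


# Hypothetical 1 GiB file at two block sizes.
entries_64k = index_entries(GIB, 64 * KIB)
entries_8k = index_entries(GIB, 8 * KIB)
growth = entries_8k / entries_64k  # index grows 8x when blocks shrink 8x
```

Shrinking the block by some factor grows the index by the same factor, which matters more when the whole index is held in RAM than when it can be evicted.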
>>
>> > 4. Off-heap block cache? Is it in 0.92 and trunk? Has anybody performed
>> > real performance tests on the off-heap cache?
>> >
>>
>> Off-heap cache is experimental in 0.92 and TRUNK.
>>
>> St.Ack
>>
>



-- 
Todd Lipcon
Software Engineer, Cloudera
