hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Multiwal performance with HBase 1.x
Date Mon, 21 Sep 2015 11:35:23 GMT
Thanks for sharing, Yu. 

Images didn't go through. Can you use a third-party site for sharing?

Cheers

> On Sep 20, 2015, at 11:09 PM, Yu Li <carp84@gmail.com> wrote:
> 
> Hi Vlad,
> 
> >> the existing write performance is more than adequate (avg load per RS usually
> less than 1MB/sec)
> We have some different user scenarios that I'd like to share with you. We use HBase to
> store data for building a search index, recording features like the pv/uv of each online
> item, so the write load can reach as high as 10MB/s per RS (a screenshot of the Ganglia
> metrics data was attached here but did not go through). OTOH, as a database I think the
> online write performance of HBase is as important as read; bulk load is for offline use
> and cannot solve every problem.
> 
> 
> Another advantage of using multiple WALs is that we can do user/business-level isolation
> on the WAL. For example, you could use one namespace per business and one WAL group per
> namespace, and then replicate only the data for the business that needs it.
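> 
> For reference, a sketch of the basic multiwal knobs in hbase-site.xml (property
> names as in the HBase 1.x docs; the per-namespace grouping I describe needs a
> dedicated grouping strategy, so the bounded strategy and group count here are
> only an illustration):
> 
>   <!-- Use the RegionGroupingProvider so each RS writes multiple WALs -->
>   <property>
>     <name>hbase.wal.provider</name>
>     <value>multiwal</value>
>   </property>
>   <!-- Spread regions across a fixed number of WAL groups -->
>   <property>
>     <name>hbase.wal.regiongrouping.strategy</name>
>     <value>bounded</value>
>   </property>
>   <property>
>     <name>hbase.wal.regiongrouping.numgroups</name>
>     <value>4</value>
>   </property>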
> 
> Regarding compaction IO, as I mentioned before, we can use tiered storage to prevent
> compaction from affecting WAL sync. With this we've observed a clear improvement in the
> avg mutate RT, from 0.5ms to 0.3ms on our online cluster, FYI.
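> 
> As a sketch, the WAL side of that tiering can be expressed with the storage policy
> setting from HBASE-12848, assuming HDFS heterogeneous storage with SSD volumes
> tagged in dfs.datanode.data.dir:
> 
>   <!-- ONE_SSD keeps one replica of each WAL block on SSD, away from compaction I/O -->
>   <property>
>     <name>hbase.wal.storage.policy</name>
>     <value>ONE_SSD</value>
>   </property>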
> 
> Best Regards,
> Yu
> 
>> On 19 September 2015 at 00:55, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
>> Hi, Jingcheng
>> 
>> You postpone compaction until your test completes by setting the number of
>> blocking store files to 120. That is kind of cheating :)
>> As I said previously, in the long run compaction rules the world, not the
>> number of WAL files. In a real production setting, the existing write
>> performance is more than adequate (avg load per RS is usually less than
>> 1MB/sec). Multiwal probably has its value if someone needs to load a large
>> volume of data quickly, but ... why not use bulk load instead?
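>> 
>> The knob in question is hbase.hstore.blockingStoreFiles (default 10 in 1.x);
>> a minimal hbase-site.xml sketch of that test setting:
>> 
>>   <property>
>>     <name>hbase.hstore.blockingStoreFiles</name>
>>     <value>120</value>
>>   </property>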
>> 
>> Thanks for letting us know that beefy servers with 8 SSDs can sustain such a
>> huge load.
>> 
>> -Vlad
>> 
>> 
>> 
>> On Thu, Sep 17, 2015 at 10:30 PM, Jingcheng Du <jingcheng.du@intel.com>
>> wrote:
>> 
>> > More information on the test.
>> > I used ycsb 0.3.0 for the test.
>> > The command line was "./ycsb load hbase-10 -P ../workloads/workload -threads
>> > 200 -p columnfamily=family -p clientbuffering=true -s > workload.dat"
>> > The workload is as follows; the data size is slightly less than 1TB
>> > (1,000,000,000 records x 5 fields x 200 bytes/field = 10^12 bytes, about 0.91 TiB):
>> > fieldcount=5
>> > fieldlength=200
>> > recordcount=1000000000
>> > maxexecutiontime=86400
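>> >
>> > (To reproduce, note the file presumably also needs the workload class
>> > property at the top, e.g. workload=com.yahoo.ycsb.workloads.CoreWorkload,
>> > since YCSB will not run without a "workload" setting.)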
>> >
>> >
>> >
>> > --
>> > View this message in context:
>> > http://apache-hbase.679495.n3.nabble.com/Multiwal-performance-with-HBase-1-x-tp4074403p4074731.html
>> > Sent from the HBase User mailing list archive at Nabble.com.
>> >
> 
