hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Server-side write buffer configuration
Date Mon, 09 Aug 2010 19:51:01 GMT
HBASE-2066 has been committed and takes effect automatically whenever you
use the write buffer, starting with version 0.89. For example, this release
contains it:
http://hbase.apache.org/docs/r0.89.20100621/
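
For readers unfamiliar with the client write buffer mentioned in this thread, here is a toy model of the idea: puts are queued on the client until their total size passes a byte limit, then sent as one batch. This is only a sketch under assumptions, not HBase's implementation; the class WriteBufferModel and its methods are hypothetical names, and "sending" is reduced to counting flushes so the example runs with the JDK alone. The 12 MB limit is the value quoted further down in the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a size-bounded client write buffer (hypothetical, not HBase code).
class WriteBufferModel {
    private final long limitBytes;
    private final List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;
    private int flushes = 0;

    WriteBufferModel(long limitBytes) {
        this.limitBytes = limitBytes;
    }

    // Queue one record; flush once the buffered size exceeds the limit.
    void put(byte[] record) {
        pending.add(record);
        pendingBytes += record.length;
        if (pendingBytes > limitBytes) {
            flush();
        }
    }

    // "Send" everything queued so far as one batch (here we only count it).
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flushes++;
        pending.clear();
        pendingBytes = 0;
    }

    int flushCount() {
        return flushes;
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        // 12 MB limit, matching the "1024*1024*12" setting quoted in the thread.
        WriteBufferModel buf = new WriteBufferModel(1024L * 1024 * 12);
        byte[] record = new byte[1024 * 1024]; // 1 MB dummy record
        for (int i = 0; i < 30; i++) {
            buf.put(record);
        }
        buf.flush(); // explicit final flush so nothing stays queued
        System.out.println("batches sent: " + buf.flushCount());
    }
}
```

The point of the model is that with auto-flushing off, the number of round trips depends on the buffer size rather than on the number of puts, and a final explicit flush is needed for whatever is still queued.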

Using more than one client basically means starting more of them, the same
way you started the first one. Your input data can then be split
between the clients, using either MapReduce or your own homegrown solution
(we imported the stumbles with an MR job).
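
As a rough illustration of the homegrown route, the sketch below splits the input records round-robin into one slice per client and runs a worker per slice. Everything here is an assumption for illustration: the class MultiClientSplit, the partition and runClients helpers, and the round-robin strategy are hypothetical, and threads stand in for what would more likely be separate client processes. The actual HBase write is left as a comment so the example runs with the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: splits input across N independent client workers.
class MultiClientSplit {

    // Round-robin the records into numClients slices (one slice per client).
    static <T> List<List<T>> partition(List<T> records, int numClients) {
        if (numClients < 1) {
            throw new IllegalArgumentException("numClients must be >= 1");
        }
        List<List<T>> slices = new ArrayList<>();
        for (int i = 0; i < numClients; i++) {
            slices.add(new ArrayList<>());
        }
        for (int i = 0; i < records.size(); i++) {
            slices.get(i % numClients).add(records.get(i));
        }
        return slices;
    }

    // Runs one worker per slice and returns how many records each one handled.
    static List<Integer> runClients(List<String> records, int numClients) {
        ExecutorService pool = Executors.newFixedThreadPool(numClients);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<String> slice : partition(records, numClients)) {
                results.add(pool.submit(() -> {
                    int written = 0;
                    for (String row : slice) {
                        // A real client would build a Put for `row` here and
                        // hand it to its own buffered table instance.
                        written++;
                    }
                    return written;
                }));
            }
            List<Integer> counts = new ArrayList<>();
            for (Future<Integer> f : results) {
                counts.add(f.get());
            }
            return counts;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("client worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add("row-" + i);
        }
        System.out.println("records per client: " + runClients(rows, 3));
    }
}
```

Round-robin is the simplest split; if row keys are sorted, it also spreads neighboring keys across clients, though whether that helps depends on how the table's regions are laid out.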

J-D

On Mon, Aug 9, 2010 at 12:44 PM, Han Liu <hanl1@andrew.cmu.edu> wrote:
> Thanks for your reply J-D!
>
> Could you explain more about the HBASE-2066 schema? For example, how did you do the first 3 steps described on that page?
>
> Also, is there any documentation that describes using multiple clients in HBase?
>
>
> On Aug 9, 2010, at 2:37 PM, Jean-Daniel Cryans wrote:
>
>> Those are pretty powerful machines; I would expect more performance. You
>> could try using the same settings that we do here. Check out Ryan's
>> presentation, page 16:
>> http://people.apache.org/~jdcryans/HUG8/HUG8-rawson.pdf
>>
>> Google "IO wait" to learn about it.
>>
>> Multiple clients will be faster unless you are already maxing out the
>> machines (betting $100 you're not). It's like asking whether parallel
>> processing will be faster than sequential processing.
>>
>> J-D
>>
>> On Mon, Aug 9, 2010 at 11:21 AM, Han Liu <hanl1@andrew.cmu.edu> wrote:
>>> Thanks for the reply J-D.
>>> In-lined.  :)
>>>
>>>
>>> On Aug 9, 2010, at 1:57 PM, Jean-Daniel Cryans wrote:
>>>
>>>> Hard to tell if it's decent performance. How do you define "decent"?
>>> I consider it decent if it is roughly the best performance one can get using my schema on my machines.
>>>> What kind of hardware are we talking about?
>>> One machine for HBase master and 6 regionservers. Specs of each of these machines:
>>> 16 GB Ram
>>> 4 1TB 7200RPM SATA Drives
>>> 10 Gb Network: 1x Qlogic QLE3142-Cu-CK
>>> CPU: 2x quad-core E5440 (2.83GHz, 12MB L2 cache, 1333 MHz FSB)
>>>> Which version are you
>>>> using?
>>> 0.20.4
>>>> How much memory was given to HBase?
>>> 6 GB
>>>>
>>>> Also did you set the write buffer on the client side on HTable?
>>> Yes I set it to be "1024*1024*12" bytes
>>>> Did
>>>> you also turn off auto-flushing?
>>> Yes it's turned off
>>>> Do you monitor your cluster? If so,
>>>> do you see lots of IO wait?
>>> I didn't. What does IO wait indicate?
>>>>
>>>> And finally, do you use a single client or multiple ones?
>>>>
>>> Single client. Will multiple clients boost performance?
>>>> :)
>>>>
>>> :) :)
>>>
>>> Thanks a lot.
>>>> J-D
>>>>
>>>> On Mon, Aug 9, 2010 at 10:46 AM, Han Liu <hanl1@andrew.cmu.edu> wrote:
>>>>> Hi Guys,
>>>>>
>>>>> I know that on the client side of HBase there's a configuration "hbase.client.write.buffer". I wonder if there's a similar configuration on the region server side that I can tweak to adjust performance?
>>>>>
>>>>> Also, as of right now I have managed to insert 15 GB of data into a 6-regionserver HBase database in roughly 26 minutes using the "table.put(Put p)" schema. Generally, is this decent performance?
>>>>>
>>>>> Any advice would be appreciated. Thanks a lot in advance.
>>>>> --
>>>>> Han Liu
>>>>> SCS & HCI Institute
>>>>> Undergrad. Class of 2012
>>>>> Carnegie Mellon University
>>>>
>>>
>>
>
