hbase-user mailing list archives

From Bradford Stephens <bradfordsteph...@gmail.com>
Subject Re: Slow Inserts on EC2 Cluster
Date Thu, 02 Sep 2010 00:37:28 GMT
Yeah, those families are all needed -- but I didn't realize the files
were so small. That's odd -- and you're right, that'd certainly throw
it off. I'll merge them all and see if that helps.
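
A rough back-of-the-envelope sketch of why many families produce small files. It assumes a region flushes all of its families together once their combined memstore reaches the 64MB flush size, and that the families fill at equal rates. The family count of 13 is hypothetical, picked only because it gives roughly the ~5MB average J-D mentions; it is not taken from the RS log.

```python
# Assumption: one flush is triggered per region when the combined memstore
# of all families reaches the flush size, and families fill at equal rates.
FLUSH_SIZE_MB = 64  # default flush size quoted in the thread


def flush_file_size_mb(num_families: int, flush_size_mb: float = FLUSH_SIZE_MB) -> float:
    """Approximate size of each store file written by a single flush."""
    if num_families < 1:
        raise ValueError("num_families must be at least 1")
    return flush_size_mb / num_families


def files_per_flush(num_families: int) -> int:
    """Each family has its own store, so a flush writes one file per family."""
    if num_families < 1:
        raise ValueError("num_families must be at least 1")
    return num_families


if __name__ == "__main__":
    # 13 is a hypothetical family count, used only for illustration.
    for families in (1, 4, 13):
        print(
            f"{families:>2} families -> {files_per_flush(families)} files of "
            f"~{flush_file_size_mb(families):.1f} MB per flush"
        )
```

Under these assumptions, merging down to one family would write a single ~64MB file per flush instead of many small ones, which should also mean fewer compactions.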

On Wed, Sep 1, 2010 at 5:24 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> Took a quick look at your RS log, it looks like you are using a lot of
> families and loading them pretty much at the same rate. Look at lines
> that start with:
>
> INFO org.apache.hadoop.hbase.regionserver.Store: Added ...
>
> And you will see that you are dumping very small files on the
> filesystem, on average 5MB, that together account for ~64MB which is
> the default flush size (and then it generates tons of compactions
> which makes it even worse). Do you really need all those families? Try
> merging them and see the difference.
>
> J-D
>
> On Wed, Sep 1, 2010 at 5:03 PM, Bradford Stephens
> <bradfordstephens@gmail.com> wrote:
>> 'allo,
>>
>> I changed the cluster from m1.large to c1.xlarge -- we're getting
>> about 4k inserts/node/minute instead of 2k. A small improvement,
>> but nowhere near what I'm used to, even from vague memories of old
>> clusters on EC2.
>>
>> I also stripped all the Cascading from my code and have a very basic
>> raw MR job -- we're basically reading raw text, splitting it into
>> fields, and adding those rows to HBase. About the simplest task you
>> could do.
>>
>> Ideas for next steps? What other info could I share?
>>
>> Cheers,
>> B
>>
>> On Wed, Sep 1, 2010 at 10:55 AM, Andrew Purtell <apurtell@apache.org> wrote:
>>>> From: Gary Helmling
>>>>
>>>> If you're using AMIs based on the latest Ubuntu (10.04),
>>>> there's a known kernel issue that seems to be causing
>>>> high loads while idle.  More info here:
>>>>
>>>> https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/574910
>>>
>>> Seems best to avoid using Lucid on EC2 for now, then.
>>>
>>> FYI, the EC2 scripts that I use build AMIs based on Amazon's old FC8 AMI (with
>>> updates). See http://github.com/apurtell/hbase-ec2
>>>
>>>  - Andy
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> Bradford Stephens,
>> Founder, Drawn to Scale
>> drawntoscalehq.com
>> 727.697.7528
>>
>> http://www.drawntoscalehq.com --  The intuitive, cloud-scale data
>> solution. Process, store, query, search, and serve all your data.
>>
>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>> Media, and Computer Science
>>
>



