hbase-user mailing list archives

From Mark Tse <Mark....@D2L.com>
Subject RE: RegionServer - Insufficient Memory and Cascading Errors
Date Mon, 09 Mar 2015 16:54:36 GMT
How large are those 50 cells?
- I think the max size is 8000 bytes

Do you have multiple versions enabled and are requesting that data as well?
- Enabled, but only one version exists

Can you request only some columns or do you need the whole row every time?
- I can, but I'm concerned that all the RegionServers are failing when attempting to scan
the entire table. If this happens by accident on a live system, we'll be in trouble. 

Are you specifying a value for scanner caching?
- No

Do you have any suspicions about why the failure is occurring? I'm mostly interested in preventing
this from happening on a live system.

Thanks,
Mark

-----Original Message-----
From: Nick Dimiduk [mailto:ndimiduk@gmail.com] 
Sent: March-09-15 12:29 PM
To: user@hbase.apache.org
Subject: Re: RegionServer - Insufficient Memory and Cascading Errors

How large are those 50 cells? Do you have multiple versions enabled and are requesting that
data as well? Can you request only some columns or do you need the whole row every time? Are
you specifying a value for scanner caching?

All of these questions are about managing the amount of data materialized in a single scan RPC.
It's painful, I admit. We have an open issue to alleviate all this nonsense, but it remains a
work in progress (the JIRA number escapes me, however).
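
For a rough feel of how much data a single scan RPC can materialize, the figures quoted in
this thread (about 700 rows, about 50 cells per row, cells of up to roughly 8000 bytes) can be
plugged into a back-of-envelope estimate. The Java sketch below is only an illustration under
stated assumptions: the class and method names are hypothetical, the 100 rows per RPC is a
made-up value rather than a known setting of this cluster, and the estimate ignores row key,
qualifier, and timestamp overhead as well as any server-side copies.

```java
// Back-of-envelope estimate of the payload one scan RPC might materialize.
// Hypothetical helper for illustration; not part of the HBase API.
final class ScanRpcEstimate {
    private ScanRpcEstimate() {}

    /** Worst-case value bytes in one row: every cell at its maximum size. */
    static long worstCaseRowBytes(int cellsPerRow, int maxCellBytes) {
        return (long) cellsPerRow * maxCellBytes;
    }

    /** Worst-case value bytes in one RPC that returns up to rowsPerRpc rows. */
    static long worstCaseRpcBytes(int rowsPerRpc, int cellsPerRow, int maxCellBytes) {
        return (long) rowsPerRpc * worstCaseRowBytes(cellsPerRow, maxCellBytes);
    }

    public static void main(String[] args) {
        int cellsPerRow = 50;      // from the thread
        int maxCellBytes = 8000;   // from the thread ("I think")
        int totalRows = 700;       // from the thread
        int assumedRowsPerRpc = 100; // ASSUMPTION: illustrative value only

        System.out.println("row:   " + worstCaseRowBytes(cellsPerRow, maxCellBytes));
        System.out.println("rpc:   " + worstCaseRpcBytes(assumedRowsPerRpc, cellsPerRow, maxCellBytes));
        System.out.println("table: " + worstCaseRpcBytes(totalRows, cellsPerRow, maxCellBytes));
    }
}
```

Under these assumptions a single row is at most 400,000 bytes, 100 rows come to 40,000,000
bytes, and the whole table to 280,000,000 bytes, which is a large share of a sub-1G heap.
Fetching fewer rows per RPC, or only some columns, shrinks the per-RPC figure proportionally.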

On Friday, March 6, 2015, Mark Tse <Mark.Tse@d2l.com> wrote:

> Hi everyone,
>
> When I do a scan on a table with about 700 rows (about 50 columns 
> each), the RegionServers will systematically go offline one at a time 
> until all the RegionServers are offline. This is probably due to there 
> not being enough memory available for the RegionServer processes (we 
> are working with sub-1G for our max heap size on our test clusters atm).
>
> Increasing the max heap size for the RegionServers alleviates this 
> problem. However, my concern is that this kind of cascading failure 
> could still occur in production with large datasets, even with a 
> larger heap size.
>
> What steps can I take to prevent this kind of cascading error? Is 
> there a way to configure RegionServers to return an error instead of 
> just failing (and causing HBase Master to hand the task to the next 
> available RegionServer)?
>
> Thanks,
> Mark
>
>