hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: OOM when fetching all versions of single row
Date Fri, 31 Oct 2014 16:15:07 GMT
On Thu, Oct 30, 2014 at 8:20 AM, Andrejs Dubovskis <dubis.lv@gmail.com> wrote:

> Hi!
> We have a bunch of rows in HBase that store varying sizes of data
> (1-50MB). We use HBase versioning and keep up to 10000 column
> versions. Typically each column has only a few versions, but in rare
> cases it may have thousands of versions.
> The MapReduce job uses a full scan, and our algorithm requires all
> versions to produce the result. So we call scan.setMaxVersions().
> In the worst case the Region Server returns only one row, but a huge
> one. The size is unpredictable and cannot be controlled, because the
> scan parameters let us control only the row count. And the MR task can
> throw an OOME even if it has a 50GB heap.
> Is it possible to handle this situation? For example, the RS should
> not send the row to the client if the latter has no memory to handle
> it. In this case the client could handle the error and fetch each
> version of the row in a separate get request.

See HBASE-11544 "[Ergonomics] hbase.client.scanner.caching is dogged and
will try to return batch even if it means OOME".
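
For illustration only, here is a toy model of the general idea of
bounding memory by handing a wide row to the consumer in fixed-size
chunks of cells instead of all at once. It is not HBase code and not
the fix proposed in HBASE-11544; the class CellBatcher and the method
partition are hypothetical names made up for this sketch. In the HBase
client the analogous knob is, as far as I know, Scan.setBatch(int),
which caps the number of cells per Result; whether that is sufficient
with 1-50MB cells is not confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of cell batching: not HBase code, only the chunking idea. */
final class CellBatcher {

    /** Splits one wide row's cells into consecutive chunks of at most batchSize cells. */
    static <T> List<List<T>> partition(List<T> cells, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < cells.size(); start += batchSize) {
            int end = Math.min(start + batchSize, cells.size());
            // Copy the slice so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(cells.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Stand-in for the versions of one column in a single huge row.
        List<Integer> versions = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            versions.add(i);
        }
        // The consumer sees at most 4 "cells" at a time.
        for (List<Integer> chunk : partition(versions, 4)) {
            System.out.println(chunk.size() + " cells: " + chunk);
        }
    }
}
```

With real scans the consumer would have to stitch together the partial
results that belong to the same row key, which is the price paid for
never holding the whole row in memory at once.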
