hbase-user mailing list archives

From Bryan Beaudreault <bbeaudrea...@hubspot.com>
Subject Re: Scan.setMaxResultSize and Result.isPartial
Date Sat, 18 Jun 2016 14:59:45 GMT
Thanks Enis, I had forgotten about that post! All of this makes sense now.

On Fri, Jun 17, 2016 at 10:12 PM Enis Söztutar <enis@apache.org> wrote:

> You should probably read
> https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1 first.
>
> In HBase 1.1 and later code bases, you can call
> Scan.setAllowPartialResults(true) to instruct the ClientScanner to give you
> partial results. In this case, you can use Result.isPartial() to stitch
> together multiple Result objects into a single row. Unless you explicitly
> request it, the Results returned will never be partial. Why would you want
> to allow partial results in the first place? Because of client-side memory
> allocation. If you have a row with, say, millions of columns and GBs of
> data, you cannot afford to have the ClientScanner auto-stitch all the
> column values for you into a single Result object, because that will cause
> an OOM.
>
> Hope this helps.
> Enis
>
> On Fri, Jun 17, 2016 at 4:15 PM, Bryan Beaudreault <bbeaudreault@hubspot.com> wrote:
>
> > Hello,
> >
> > We are running 1.2.0-cdh5.7.0 on our server side, and 1.0.0-cdh5.4.5 on
> > the client side. We're in the process of upgrading the client, but aren't
> > there yet. I'm trying to figure out the relationship of Result.isPartial
> > and the user, when setMaxResultSize is used.
> >
> > I've done a little reading of the code, and it looks like isPartial is
> > mostly used by the internals of ClientScanner. From what I can tell the
> > user should never get a Result where isPartial == true, because the
> > ClientScanner will do multiple requests internally to flesh out
> > incomplete rows.
> >
> > However, the code is a bit complex so I'd like to verify. Is this correct
> > for either version of HBase above? Is it safe to use setMaxResultSize
> > without any more work, or should we be handling the potential isPartial()
> > Result ourselves in every scan request we make?
> >
> > I wonder if this should be added to the docs either way (I didn't see it),
> > or whether isPartial should be removed from the public API in future
> > versions?
> >
> > Thanks!
> >
>
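
A minimal sketch of the stitching Enis describes, for readers who do opt in
to partial results. It deliberately does not depend on the HBase client
library: Chunk is a hypothetical stand-in for Result, and its partial field
stands in for what Result.isPartial() would report. The assumption here is
that a row is complete once a non-partial piece arrives; the sketch also
treats a change of row key as a boundary, as a defensive measure. Each piece
is processed as it arrives, so the full row is never held in memory, which is
the reason given above for allowing partial results in the first place.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical stand-in types, not the HBase API.
final class PartialRowStitcher {

    /** Stand-in for an HBase Result: a row key, some cells, a partial flag. */
    static final class Chunk {
        final String row;
        final List<String> cells;
        final boolean partial; // mirrors what Result.isPartial() would report

        Chunk(String row, List<String> cells, boolean partial) {
            this.row = row;
            this.cells = cells;
            this.partial = partial;
        }
    }

    /**
     * Walks the chunks in scan order and emits one "row=cellCount" entry per
     * logical row, without buffering the cells of a row.
     */
    static List<String> summarize(List<Chunk> chunks) {
        List<String> out = new ArrayList<>();
        String currentRow = null;
        int cellCount = 0;

        for (Chunk c : chunks) {
            // Defensive boundary: the row key changed while a row was open.
            if (currentRow != null && !currentRow.equals(c.row)) {
                out.add(currentRow + "=" + cellCount);
                cellCount = 0;
            }
            currentRow = c.row;
            cellCount += c.cells.size(); // process this piece, then drop it

            // A non-partial piece is taken as the end of the row.
            if (!c.partial) {
                out.add(currentRow + "=" + cellCount);
                currentRow = null;
                cellCount = 0;
            }
        }
        // Flush a row left open if the scan ended on a partial piece.
        if (currentRow != null) {
            out.add(currentRow + "=" + cellCount);
        }
        return out;
    }
}
```

With a real scanner, the loop body would run once per Result returned by the
ResultScanner after partial results were enabled on the Scan. If partial
results are not enabled, none of this is needed, since the ClientScanner
returns only complete rows.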
