hbase-user mailing list archives

From Lin Ma <lin...@gmail.com>
Subject Re: client cache for all region server information?
Date Thu, 23 Aug 2012 14:26:15 GMT
Harsh, thanks for the detailed information.

Two more comments,

1. I want to confirm my understanding is correct. At the beginning the
client cache is empty. When the client issues a request for a table and
the region server location is not known, it will query the root META
region step by step to get the region server information, then cache it.
If the cache already contains the requested region information, the
client uses it directly from the cache. In this way, the cache grows only
on a cache miss for the requested region information;
2. "far outweighs the other items it caches (scan results, etc.)": do you
mean that the GET API of HBase caches results? Sorry, I was not aware of
this feature before. How are the results cached, and can we control it?
(Suppose a client has a random read pattern; we would not want to cache
results, since each read may access a unique row key.) I would appreciate
it if you could point me to some more detailed information.
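
To make point 1 concrete, here is a toy sketch of the flow as I understand
it. This is not HBase client code: MetaTable, ClientLocationCache and the
region boundaries are hypothetical names made up for illustration, and the
real lookup path may differ.

```python
import bisect


class MetaTable:
    """Hypothetical stand-in for the META lookup; counts how often it is hit."""

    def __init__(self, regions):
        # regions: list of (start_key, server); the first start key is "".
        self.regions = sorted(regions)
        self.lookups = 0

    def locate(self, row):
        self.lookups += 1
        starts = [start for start, _ in self.regions]
        # The owning region is the last one whose start key is <= row.
        i = bisect.bisect_right(starts, row) - 1
        start, server = self.regions[i]
        # end is None for the last region (open-ended key range).
        end = self.regions[i + 1][0] if i + 1 < len(self.regions) else None
        return start, end, server


class ClientLocationCache:
    """Starts empty and only grows when a row falls outside cached regions."""

    def __init__(self, meta):
        self.meta = meta
        self.cached = {}  # start_key -> (end_key, server)

    def locate(self, row):
        # Cache hit: some cached region covers this row.
        for start, (end, server) in self.cached.items():
            if start <= row and (end is None or row < end):
                return server
        # Cache miss: ask META, then remember only that one region.
        start, end, server = self.meta.locate(row)
        self.cached[start] = (end, server)
        return server


meta = MetaTable([("", "rs1"), ("m", "rs2")])
cache = ClientLocationCache(meta)
cache.locate("apple")   # miss: goes to META, caches the first region
cache.locate("banana")  # hit: same region, META is not consulted
cache.locate("zebra")   # miss: second region is looked up and cached
```

In this sketch the cache holds one entry per region the client has actually
touched, never the whole of META up front.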

regards,
Lin

On Thu, Aug 23, 2012 at 9:35 PM, Harsh J <harsh@cloudera.com> wrote:

> Hi Lin,
>
> On Thu, Aug 23, 2012 at 4:31 PM, Lin Ma <linlma@gmail.com> wrote:
> > Thank you Abhishek,
> >
> > Two more comments,
> >
> > -- "Client only caches information as needed for its queries and not
> > necessarily for 'all' region servers." -- how did client know which
> > region server information is necessary to be cached in current HBase
> > implementation?
>
> What Abhishek meant here is that it caches only the needed table's
> rows from META. It also only caches the specific region required for
> the row you're looking up/operating on, AFAICT.
>
> > -- When the client loads region server information for the first time?
> > Did client persistent cache information at client side about region
> > server information?
>
> The client loads up regionserver information for a table, when it is
> requested to perform an operation on that table (on a specific row or
> the whole). It does not immediately, upon initialization, cache the
> whole of META's contents.
>
> Your question makes sense though, that it does seem to be such that a
> client *may* use quite a bit of memory space in trying to cache the
> META entries locally, but practically we've not had this cause issues
> for users yet. The amount of memory cached for META far outweighs the
> other items it caches (scan results, etc.). At least I have not seen
> any reports of excessive client memory usage just due to region
> locations of tables being cached.
>
> I think there's more benefits storing/caching it than not doing so,
> and so far we've not needed the extra complexity of persisting the
> cache to a local or non-RAM storage than keeping it in memory.
>
> --
> Harsh J
>
