hbase-user mailing list archives

From Hari Sreekumar <hsreeku...@clickable.com>
Subject Re: Clarification regarding HBase reads
Date Mon, 21 Feb 2011 03:50:53 GMT
So if I have 10 tables, each with 2 families, I'd open up 20 stores whenever
I open a region for reading? Is it a problem to have too many tables, e.g.,
if I have 1 big table and 4 indexing tables for the big table? Are there any
potential issues with this?

Thanks,
Hari

On Sun, Feb 20, 2011 at 8:47 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> >> Does this mean that a store instance is opened for all tables present in
> >> HBase irrespective of which table we are querying and for all
> >> column families?
> No. The blog says there is a Store instance for each family.
>
> You should generally avoid multiple column families. But we can help you
> analyze your use case.
> If you read through https://issues.apache.org/jira/browse/HBASE-3149, you
> will better understand the current implementation.
>
> On Sun, Feb 20, 2011 at 6:38 AM, Hari Sreekumar <hsreekumar@clickable.com>
> wrote:
>
> > Hi,
> >
> > I was going through the HBase architecture blog by Lars George
> > (http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html)
> > and I just wanted a clarification regarding how HBase reads data. The blog
> > mentions that:
> >
> > Next the HRegionServer opens the region and creates a corresponding
> > HRegion object.
> > When the HRegion is "opened" it sets up a Store instance for each
> > HColumnFamily for every table as defined by the user beforehand. Each of
> > the Store instances can in turn have one or more StoreFile instances, which
> > are lightweight wrappers around the actual storage file called HFile. A
> > HRegion also has a MemStore and a HLog instance. We will now have a look at
> > how they work together but also where there are exceptions to the rule.
> >
> > Does this mean that a store instance is opened for all tables present in
> > HBase irrespective of which table we are querying and for all
> > column families? Is this why I generally see people avoiding a large
> > number of tables or a large number of column families? If not, what is
> > the reason for that?
> > Is it true at all that we should avoid too many tables/CFs?
> >
> > Thanks,
> > Hari
> >
>
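
To make the Store-per-family relationship discussed in this thread concrete, here is a minimal sketch in plain Python. It does not use any HBase API. The `Table` and `Region` classes and the helper functions are hypothetical names introduced only for illustration, and the model assumes that a region belongs to exactly one table and gets one Store per column family of that table, which is how the reply above ("Store instance is for each family") reads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    """Hypothetical stand-in for a table schema: a name plus its column families."""
    name: str
    families: tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for a region: it belongs to exactly one table."""
    table: Table
    start_key: str


def stores_for_region(region):
    # Assumption: opening a region sets up one Store per column family
    # of the table that region belongs to, and nothing for other tables.
    return [(region.table.name, region.start_key, family)
            for family in region.table.families]


def total_stores(open_regions):
    # Stores add up across all regions that are open, not across all tables
    # every time a single region is opened.
    return sum(len(stores_for_region(r)) for r in open_regions)


# The scenario from the question: 10 tables with 2 families each.
tables = [Table(f"t{i}", ("cf1", "cf2")) for i in range(10)]

one_region = Region(tables[0], "")
print(len(stores_for_region(one_region)))  # 2

all_regions = [Region(t, "") for t in tables]
print(total_stores(all_regions))  # 20
```

Under these assumptions, opening one region of one table sets up 2 stores. The figure of 20 only appears when one region of each of the 10 tables is open at the same time, and it grows with the number of open regions rather than with the number of tables alone.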
