hbase-user mailing list archives

From "Jonathan Gray" <jl...@streamy.com>
Subject RE: Many columns in 0.19
Date Wed, 11 Mar 2009 19:06:42 GMT
That is correct.  If you have 20 regions of a table which contains 10
families, you will have 200 HStores, 200 Memcaches, and some number of
HStoreFiles.
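
To make the arithmetic concrete, here is a minimal Python sketch. It is not HBase code: the function name `count_stores` and its parameters are hypothetical, and it only assumes what is stated above, namely one HStore per column family per region and one Memcache per HStore. The number of HStoreFiles is left out because it depends on flushes and compactions.

```python
def count_stores(num_regions: int, num_families: int) -> dict:
    """Count HStores and Memcaches for one table (illustrative only).

    Assumes one HStore per (region, column family) pair and exactly
    one Memcache per HStore, as described in this thread.
    """
    if num_regions < 0 or num_families < 0:
        raise ValueError("counts must be non-negative")
    hstores = num_regions * num_families
    return {"hstores": hstores, "memcaches": hstores}


# The example from this thread: 20 regions, 10 column families.
counts = count_stores(num_regions=20, num_families=10)
print(counts)
```

Adding one more family to this table would add 20 HStores (one per region), which is why families are costly compared with extra column qualifiers.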

Families are very expensive in that way, almost like creating another table
except it can be read at the same time (in the same request) with other
families and under the same row key.

JG

> -----Original Message-----
> From: Puri, Aseem [mailto:Aseem.Puri@Honeywell.com]
> Sent: Wednesday, March 11, 2009 6:03 AM
> To: hbase-user@hadoop.apache.org
> Subject: RE: Many columns in 0.19
> 
> 
> Thanks JG and Schubert for sharing your knowledge.
> 
> One thing I want to ask: an HRegionServer hosts many regions. Does that
> mean every region has its own HStores, and every HStore has one Memcache?
> As discussed earlier, if there are 20 regions on an HRegionServer and the
> table has 10 column families, each region has 10 HStores. So do we have
> 20*10 HStores in total?
> 
> Please explain exactly what is happening; I am a little confused.
> 
> -Aseem Puri
> 
> -----Original Message-----
> From: schubert zhang [mailto:zsongbo@gmail.com]
> Sent: Wednesday, March 11, 2009 8:36 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Many columns in 0.19
> 
> Cool, the HFile solution is what is described in the Bigtable paper; it
> will be more efficient than MapFile. We are looking forward to 0.20.0,
> including Bloom filters.
> Thanks.
> 
> On Wed, Mar 11, 2009 at 2:28 AM, Jonathan Gray <jlist@streamy.com> wrote:
> 
> > Aseem,
> >
> > Almost!
> >
> > You will have 10 HStores as you say.  Each of those HStores is made up
> > of a single Memcache instance and zero or many MapFiles on HDFS.  The
> > default block size in HDFS is 64MB, not 64k, so a MapFile could be a
> > single block or many.
> >
> > Writes are done into the Memcache.  That is periodically flushed to HDFS,
> > creating a single HStoreFile.  Multiple flushes will then yield multiple
> > HSFs.  Compactions and major compactions are run periodically to combine
> > these files into a single HStoreFile, for efficiency.
> >
> > In the upcoming 0.20 release we will move to a new HDFS file format called
> > HFile.  Within HFile, our data will be broken up into ~64k blocks
> > (configurable) but still stored in HDFS in 64M blocks (again,
> > configurable).
> >
> > JG
> >
> > > -----Original Message-----
> > > From: Puri, Aseem [mailto:Aseem.Puri@Honeywell.com]
> > > Sent: Monday, March 09, 2009 9:34 PM
> > > To: hbase-user@hadoop.apache.org
> > > Subject: RE: Many columns in 0.19
> > >
> > > Hi
> > >
> > > Thanks for help.
> > >
> > > So it means that if a table has 10 column families, then there are
> > > 10 HStores in a region and, corresponding to them, 10 MapFiles.
> > > A MapFile further has 64K blocks inside it, which are stored by HDFS.
> > >
> > > Am I right?
> > >
> > > -Aseem Puri
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Jonathan Gray [mailto:jlist@streamy.com]
> > > Sent: Monday, March 09, 2009 7:24 PM
> > > To: hbase-user@hadoop.apache.org
> > > Subject: RE: Many columns in 0.19
> > >
> > > A Table is made up of 1 to N HRegions and defined by its Column
> > > Families.
> > >
> > > Each HRegion is made up of an HStore per column family.  Each HStore is
> > > then made up of a single Memcache and 0 to M HStoreFiles.
> > >
> > > So, the HStore is one column family in one region.  It houses that
> > > family's Memcache and HStoreFiles for that particular region.
> > >
> > > And yes, Bigtable stores one family of a region in one SSTable.  The
> > > only caveat to that is that they offer "Locality Groups", as mentioned
> > > by Ryan, that group different families together in a single SSTable (or
> > > HStore in our case).  Changes in 0.20 leave the door open for HBase to
> > > also implement them, but it is not currently on the roadmap.
> > >
> > > Hope that helps.
> > >
> > > JG
> > >
> > > > -----Original Message-----
> > > > From: Puri, Aseem [mailto:Aseem.Puri@Honeywell.com]
> > > > Sent: Monday, March 09, 2009 3:22 AM
> > > > To: hbase-user@hadoop.apache.org
> > > > Subject: RE: Many columns in 0.19
> > > >
> > > >
> > > > Hi
> > > >
> > > > I was reading the Google Bigtable paper. Many things in HBase are
> > > > similar to Bigtable, but I can't understand the concept of HStore.
> > > > Does HStore mean one column family in one MapFile?
> > > >
> > > > Does Bigtable also store one column family in one SSTable?
> > > >
> > > > -Aseem
> > > >
> > > > -----Original Message-----
> > > > From: Ryan Rawson [mailto:ryanobjc@gmail.com]
> > > > Sent: Monday, March 09, 2009 3:20 PM
> > > > To: hbase-user@hadoop.apache.org
> > > > Subject: Re: Many columns in 0.19
> > > >
> > > > Don't forget, each column family is another file on disk, and another
> > > > open file.  Every column family is stored in its own MapFile, and that
> > > > increases the load on HDFS.
> > > >
> > > > This particular restriction won't ever really go away (unless we
> > > > introduce locality groups; even then, each locality group = N families
> > > > = 1 file), but in 0.20 it should be more feasible to have thousands of
> > > > columns per family, or more.
> > > >
> > > > -ryan
> > > >
> > > > On Mon, Mar 9, 2009 at 1:47 AM, Michael Dagaev <michael.dagaev@gmail.com> wrote:
> > > >
> > > > > Thank you, Ryan
> > > > >
> > > > > On Mon, Mar 9, 2009 at 10:28 AM, Ryan Rawson <ryanobjc@gmail.com> wrote:
> > > > > > Sadly this is still a limit.
> > > > > >
> > > > > > 0.20 should make things much better.
> > > > > >
> > > > > > -ryan
> > > > > >
> > > > > > On Mon, Mar 9, 2009 at 12:23 AM, Michael Dagaev <michael.dagaev@gmail.com> wrote:
> > > > > >
> > > > > >> Hi, all
> > > > > >>
> > > > > >>    I remember it was not recommended to add many columns (column
> > > > > >> qualifiers) in HBase 0.18.
> > > > > >> Does HBase 0.19.0 still have this limitation?
> > > > > >>
> > > > > >> Thank you for your cooperation,
> > > > > >> M.
> > > > > >>
> > > > > >
> > > > >

