hbase-user mailing list archives

From Jonathan Gray <jg...@facebook.com>
Subject RE: Data size
Date Thu, 01 Apr 2010 23:43:18 GMT
Matt,

Make your families a single character.  You get almost all of the space savings of not
duplicating the family name, without any HBase changes.
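To make that concrete, here is a back-of-the-envelope sketch (plain Python, not HBase code; the framing byte counts follow the approximate KeyValue on-disk layout, and the 39-byte key / 17-byte qualifier / 4-byte value / 26-column figures come from Matt's numbers further down the thread):

```python
# Approximate per-cell size using the HBase KeyValue layout:
# keylen(4) + vallen(4) + rowlen(2) + row + famlen(1) + family
# + qualifier + timestamp(8) + type(1) + value
def cell_size(row_len, fam_len, qual_len, val_len):
    fixed = 4 + 4 + 2 + 1 + 8 + 1  # framing bytes per cell
    return fixed + row_len + fam_len + qual_len + val_len

cols = 26  # one cell per daily statistic
long_fam = cols * cell_size(39, len("ManagerEventCounts"), 17, 4)
short_fam = cols * cell_size(39, 1, 17, 4)
print(long_fam, short_fam, long_fam - short_fam)
```

In this model an 18-character family puts the row at ~2548 bytes versus ~2106 with a single-character family, so the rename alone recovers roughly 440 bytes per row with no code changes anywhere.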

As for row keys, since they will be duplicated within each block, even standard LZO compression
(not prefix compression) should do a decent job.  You could see 2-3X compression with long
duplicated keys.
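To get a feel for this, here's a quick stand-in experiment using Python's zlib (HBase compresses at the block level with gzip or LZO rather than zlib, but the effect of one long key repeated throughout a block is similar):

```python
import zlib

# Simulate one block of cells: a long row key duplicated for each of
# 26 qualifiers, as it would appear repeated inside an HFile block.
row_key = b"20100331/EquityResidential/VA45588438"
block = b"".join(row_key + b"/stat%02d" % i + b"\x00\x00\x01\x2a"
                 for i in range(26))

compressed = zlib.compress(block)
ratio = len(block) / len(compressed)
print(f"{len(block)} -> {len(compressed)} bytes ({ratio:.1f}x)")
```

Purely synthetic repetition like this compresses optimistically; real blocks interleave timestamps and values, which is why 2-3X is the more realistic expectation.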

In the end, you have a use case with lots of duplication, and we're still only talking
about 2.2x.  It's not nothing, but I'm not sure it outweighs the costs associated with this
type of change.  By doing some fancier things in your application you could probably get
halfway there.

And even at the scale of 250GB/month, multiplied by 3X replication, you're talking about
less than one 1TB drive each month.  That's like $100 a month.

Make sure you aren't hyper-optimizing before actually loading up your data, seeing how effective
compression is for you, trying things in your application to reduce size and duplication,
etc.  If the changes in HBase were trivial to get the space savings, and/or they would not
have other negative impacts on performance, we would go for the space savings.  Unfortunately,
as I see it, this is not the case.

If you're interested in investigating prefix compression, let me know and we can try to
explore that... it could be a big win for your use case and many others.
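For anyone curious what prefix compression looks like, here is a minimal sketch (plain Python, not how HBase or BigTable actually implement it): since keys within a block are sorted, each key can be stored as a shared-prefix length plus the suffix that differs from its predecessor.

```python
def prefix_encode(sorted_keys):
    """Encode each key as (bytes shared with previous key, suffix)."""
    encoded, prev = [], b""
    for key in sorted_keys:
        shared = 0
        while (shared < min(len(prev), len(key))
               and prev[shared] == key[shared]):
            shared += 1
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded

def prefix_decode(encoded):
    """Rebuild the full keys from (shared, suffix) pairs."""
    keys, prev = [], b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys

# Keys shaped like the listing keys discussed in this thread.
keys = [b"20100331/EquityResidential/VA45588436",
        b"20100331/EquityResidential/VA45588437",
        b"20100331/EquityResidential/VA45588438"]
enc = prefix_encode(keys)
assert prefix_decode(enc) == keys
stored = sum(2 + len(suffix) for _, suffix in enc)  # 2 bytes per length field
print(stored, "vs", sum(len(k) for k in keys))
```

After the first key, each neighbor costs only its length field plus the one differing byte, which is exactly why long, heavily shared row keys benefit so much.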

JG

> -----Original Message-----
> From: Matt Corgan [mailto:mcorgan@hotpads.com]
> Sent: Thursday, April 01, 2010 3:01 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Data size
> 
> Jonathan - thanks for the detailed answer.  I'm sure implementing this
> stuff is a nightmare when trying to minimize object instantiations.  But,
> since you mentioned it had been discussed before, here's a concrete
> example to throw some support behind non-duplication and prefix
> compression in future releases.
> 
> At hotpads.com, we host about 4.5mm real estate listings and archive
> statistics about them so that the homeowners can see how many people have
> viewed their home.  Each listing has a compound string key like
> "EquityResidential/VA45588438", and there are 26 integer statistics for
> each listing that we aggregate on a daily basis.  The statistics are
> counts of things like displayed, previewed, viewed, mobileListingViewed,
> contactInfoViewed, emailed, etc.  They're in a table called
> DailyListingSummary.
> 
> - PK is something like "20100331/EquityResidential/VA45588438"
> - average column name length is 17 bytes
> - we generate 100,000,000 rows per month
> - on MySQL, a typical row is 39 byte key + 26 integers = ~143 bytes
> - 1 month of data is about 15 GB
> 
> We've been storing them in monthly partitioned MySQL tables, but schema
> changes are nearly impossible, and write speed is obviously not great.
> We're considering moving several things to HBase, but the size inflation
> on this style of data is brutal.  I could compress the tables for some
> big disk savings, but my main concern is how much data fits in memory to
> serve user queries quickly.  I don't mind wasting disk, but I'm pretty
> aggressive about keeping stuff in memory.  I assume all the data is
> expanded in memory... is that correct?
> 
> On HBase with a ColumnFamily name of "ManagerEventCounts", each cell
> would be ~100 bytes, and each row would be ~2553 bytes.  That's 18x
> inflation, which turns my 15gb monthly table into 255gb of raw data, and
> that's before replication.
> 
> a) If each cell didn't have to store the ColumnFamily name, we're down
> to 2085b (208gb/month or 15x)
> b) Stop duplicating the key and we're at 1110b (111gb/month or 8x)
> c) My application could map the column names to 1 byte and we're at 679b
> (67gb/month or 5x)
> d) Prefix compression would be a huge improvement on my bulky primary
> keys
> e) To shrink it further, I'd have to serialize rows into a single cell
> (handle schema changes in my app, as well as forfeit hbase increment
> functionality)
> 
> C and E need to be handled by my application, and D is probably
> difficult to implement, but A and B seem like they could have good bang
> for the buck.  In my case they shrink the data by 2.2x.  For workloads
> that judge the speed of HBase by random lookups on bigger-than-memory
> data, the cache effectiveness would be greatly improved.
> 
> 
> 
> 2010/3/31 Jonathan Gray <jgray@facebook.com>
> 
> > There are many implications related to this.  The core trade-off as
> > I see it is between storage and read performance.
> >
> > With the current setup, after we read blocks from HDFS into memory,
> > we can just usher KeyValues straight out of the on-disk format and to
> > the client without any further allocation or copies.  This is a highly
> > desirable property.
> >
> > If we were to only keep what was absolutely necessary (could not be
> > inferred or explicitly tracked in some way), then we would have to do
> > a lot of work at read time to regenerate client-friendly data.
> >
> > I'm not sure exactly what you mean by storing the row length at the
> > beginning of each row.  Families are certainly the easiest of these
> > optimizations to make but change read behavior significantly.  It has
> > been talked about and there's probably a jira hanging around somewhere.
> >
> > In the end, the HDFS/HBase philosophy is that disk/storage is cheap,
> > so we should do what we can (within reason) for read performance.
> >
> > Much of this is mitigated by the use of compression.  Currently we
> > only utilize block compression (gzip default, lzo preferred).  BigTable
> > uses a special prefix-compression which is ideal for this duplication
> > issue; maybe one day we could do that too.
> >
> > JG
> >
> > > -----Original Message-----
> > > From: Matt Corgan [mailto:mcorgan@hotpads.com]
> > > Sent: Wednesday, March 31, 2010 7:06 PM
> > > To: hbase-user@hadoop.apache.org
> > > Cc: alex@cloudera.com; jlhuang@cs.nctu.edu.tw; kevin_hung@tsmc.com
> > > Subject: Re: Data size
> > >
> > > Out of curiosity, why is it necessary to store the family and row
> > > with every cell?  Aren't all the contents of a family confined to the
> > > same file, and couldn't a row length be stored at the beginning of
> > > each row or in a block index?  Is this true for values in the caches
> > > and memstore as well?
> > >
> > > It could have drastic implications for storing rows with many small
> > > values but with long keys, long column names, and innocently verbose
> > > column family names.
> > >
> > > Matt
> > >
> > > 2010/3/31 alex kamil <alex.kamil@gmail.com>
> > >
> > > > i would also suggest to check the dfs.*replication* setting in
> > > > hdfs (in /conf/*hdfs*-site.xml)
> > > >
> > > > A-K
> > > >
> > > > 2010/3/31 Jean-Daniel Cryans <jdcryans@apache.org>
> > > >
> > > > > HBase is column-oriented; every cell is stored with the row,
> > > > > family, qualifier and timestamp, so every piece of data brings a
> > > > > larger disk usage.  Without any knowledge of your keys, I can't
> > > > > comment much more.
> > > > >
> > > > > Then HDFS keeps a trash, so every compacted file will end up
> > > > > there... if you just did the import, there will be a lot of these.
> > > > >
> > > > > Finally, if you imported the data more than once, HBase keeps 3
> > > > > versions by default.
> > > > >
> > > > > So in short, is it reasonable? Answer: it depends!
> > > > >
> > > > > J-D
> > > > >
> > > > > 2010/3/31  <y_823910@tsmc.com>:
> > > > > > Hi,
> > > > > >
> > > > > > We've dumped Oracle data to files, then put these files into
> > > > > > different HBase tables.
> > > > > > The size of these files is 35G; we saw the HDFS usage go up to
> > > > > > 562G after putting it into HBase.
> > > > > > Is that reasonable?
> > > > > > Thanks
> > > > > >
> > > > > >
> > > > > >
> > > > > > Fleming Chiu(邱宏明)
> > > > > > 707-6128
> > > > > > y_823910@tsmc.com
> > > > > > 週一無肉日吃素救地球(Meat Free Monday Taiwan)
> > > > > >
> > > > > >
> > > > > >
