hbase-user mailing list archives

From Derek Wollenstein <de...@klout.com>
Subject Re: Key Design Question for list data
Date Wed, 04 Apr 2012 18:20:35 GMT
Ian --
  Thanks again for the clarifications.  When I mentioned the pagination
idea, I was proposing to put my own encoding format inside the value, so
it would look like the following (using JSON just to make the format more
readable):

user123:0 -> {
    "nextPage": 1,
    "values": ["val1", "val2", "val3" /* , ... */]
}

So in that case, the "user123" prefix would occur only once, versus:

user123:val1 -> ""
user123:val2 -> ""
user123:val3 -> ""

In our case the number of values per user will range from the hundreds to
the millions with a power-law distribution, so I think the savings could be
significant if we batch these values into pages of 100.
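
To make that concrete, here is a minimal Java sketch of writing one such
page with the HBase client API; the table name "userdata", the column
family "cf", the qualifier "page", and the hand-rolled JSON string are
illustrative assumptions, not a settled design:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PageWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "userdata");  // hypothetical table name
        // Row key = fixed-width user name + page number.
        Put put = new Put(Bytes.toBytes("user123:0"));
        // The whole page (next-page pointer plus its values) is one cell value.
        String page = "{\"nextPage\": 1, \"values\": [\"val1\", \"val2\", \"val3\"]}";
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("page"), Bytes.toBytes(page));
        table.put(put);
        table.close();
    }
}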

That said, I have also heard about the upcoming prefix compression work in
HBase 0.94.  It's one of the things pushing me towards the simpler format.
On Wed, Apr 4, 2012 at 11:15 AM, Ian Varley <ivarley@salesforce.com> wrote:

> Great, glad to help Derek! One clarification, on space usage: on disk, the
> row key (and column family) is repeated for every cell. So if you have rows
> like:
> user123:  columnfamily1: col1:val1, col2:val2, col3:val3
> user234:  columnfamily1: col1:val1, col2:val2, col3:val3
> What you really get on disk is:
> user123:columnfamily1:col1=val1
> user123:columnfamily1:col2=val2
> user123:columnfamily1:col3=val3
> user234:columnfamily1:col1=val1
> user234:columnfamily1:col2=val2
> user234:columnfamily1:col3=val3
> So, it really is roughly the same order of storage, whether you make it
> tall or wide. The key difference is that you get atomicity on a single row,
> so you get the benefits (and detriments) of that if you choose that
> approach.
> Note also that storefile compression *heavily* mitigates this repetition on
> disk (in my experience, you end up getting space usage on par with an RDBMS
> (or better) if you use compression). And prefix compression in memory is on
> its way, too; so it's important to try it out in practice and see how those
> things change the storage / cpu picture for you. :)
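> To illustrate, here is a minimal sketch of enabling storefile compression
> when creating a table through the Java admin API; the table name
> "userdata", the family "cf", and the choice of SNAPPY are illustrative
> assumptions:
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.HColumnDescriptor;
> import org.apache.hadoop.hbase.HTableDescriptor;
> import org.apache.hadoop.hbase.client.HBaseAdmin;
> import org.apache.hadoop.hbase.io.hfile.Compression;
>
> public class CompressedTableSetup {
>     public static void main(String[] args) throws Exception {
>         Configuration conf = HBaseConfiguration.create();
>         HBaseAdmin admin = new HBaseAdmin(conf);
>         HTableDescriptor desc = new HTableDescriptor("userdata"); // hypothetical
>         HColumnDescriptor cf = new HColumnDescriptor("cf");
>         cf.setCompressionType(Compression.Algorithm.SNAPPY); // compress storefiles
>         desc.addFamily(cf);
>         admin.createTable(desc);
>     }
> }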
> Ian
> On Apr 4, 2012, at 1:01 PM, Derek Wollenstein wrote:
> Ian --
>  Thanks for the summary -- I think this concurs with the kind of
> questions I had.  It does seem like we would still save *some* space with a
> paginated implementation (user123 is repeated fewer times, even if
> everything else is stored the same number of times), but I agree that the
> savings may not be that large.  I was just trying to see if there were any
> strong experiences either way.  It's true that we can start simple, and
> then optimize.  My main concern is that reloading everything if we make a
> bad choice is somewhat expensive as well.  Thanks for the excellent summary
> of the issue.  I'll need to go ahead and talk with our other engineers
> about reverting to a simpler implementation.
> --Derek
> On Tue, Apr 3, 2012 at 12:59 PM, Ian Varley <ivarley@salesforce.com
> <mailto:ivarley@salesforce.com>> wrote:
> Hi Derek,
> If I understand you correctly, you're ultimately trying to store triples
> in the form "user, valueid, value", right? E.g., something like:
> "user123, firstname, Paul",
> "user234, lastname, Smith"
> (But the usernames are fixed width, and the valueids are fixed width).
> And, your access pattern is along the lines of: "for user X, list the next
> 30 values, starting with valueid Y". Is that right? And these values should
> be returned sorted by valueid?
> The tl;dr version is that you should probably go with one row per
> user+value, and not build a complicated intra-row pagination scheme on your
> own unless you're really sure it is needed.
> Your two options mirror a common question people have when designing HBase
> schemas: should I go "tall" or "wide"? Your first schema is "tall": each
> row represents one value for one user, and so there are many rows in the
> table for each user; the row key is user + valueid, and there would be
> (presumably) a single column qualifier that means "the value". This is
> great if you want to scan over rows in sorted order by row key (thus my
> question above, about whether these ids are sorted correctly). You can
> start a scan at any user+valueid, read the next 30, and be done. What
> you're giving up is the ability to have transactional guarantees around all
> the rows for one user, but it doesn't sound like you need that. Doing it
> this way is generally recommended (see
> http://hbase.apache.org/book.html#schema.smackdown).
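> As a concrete sketch of that read path in the Java client (the table name
> "userdata" and the "user123"/"valueY" key parts are illustrative
> assumptions, and the stop condition is a simple prefix check):
>
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.ResultScanner;
> import org.apache.hadoop.hbase.client.Scan;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class TallScan {
>     // Read the next 30 values for one user, starting at a given valueid.
>     public static void main(String[] args) throws IOException {
>         Configuration conf = HBaseConfiguration.create();
>         HTable table = new HTable(conf, "userdata");  // hypothetical table
>         byte[] start = Bytes.add(Bytes.toBytes("user123"), Bytes.toBytes("valueY"));
>         Scan scan = new Scan(start);
>         scan.setCaching(30);  // fetch the whole page in one RPC
>         ResultScanner scanner = table.getScanner(scan);
>         int count = 0;
>         for (Result r : scanner) {
>             // Stop once we've walked past user123's rows.
>             if (!Bytes.toString(r.getRow()).startsWith("user123")) break;
>             System.out.println(Bytes.toString(r.getRow()));
>             if (++count == 30) break;
>         }
>         scanner.close();
>         table.close();
>     }
> }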
> Your second option is "wide": you store a bunch of values in one row,
> using different qualifiers (where the qualifier is the valueid). The simple
> way to do that would be to just store ALL values for one user in a single
> row. I'm guessing you jumped to the "paginated" version because you're
> assuming that storing millions of columns in a single row would be bad for
> performance, which may or may not be true; as long as you're not trying to
> do too much in a single request, or do things like scanning over and
> returning all of the cells in the row, it shouldn't be fundamentally worse.
> The client has methods that allow you to get specific slices of columns.
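> For example, here is a sketch of reading a 30-column slice out of a wide
> row using the client's column filters; the table name "userdata" and the
> starting qualifier "valueY" are illustrative assumptions:
>
> import java.util.Arrays;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
> import org.apache.hadoop.hbase.filter.ColumnRangeFilter;
> import org.apache.hadoop.hbase.filter.Filter;
> import org.apache.hadoop.hbase.filter.FilterList;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class WideSlice {
>     public static void main(String[] args) throws Exception {
>         Configuration conf = HBaseConfiguration.create();
>         HTable table = new HTable(conf, "userdata");  // hypothetical table
>         Get get = new Get(Bytes.toBytes("user123"));
>         get.setFilter(new FilterList(FilterList.Operator.MUST_PASS_ALL,
>             Arrays.<Filter>asList(
>                 // start the slice at valueid Y (null upper bound = open-ended)
>                 new ColumnRangeFilter(Bytes.toBytes("valueY"), true, null, true),
>                 // ...then return at most 30 columns from that point
>                 new ColumnPaginationFilter(30, 0))));
>         Result result = table.get(get);
>         System.out.println(result.size() + " cells returned");
>         table.close();
>     }
> }
>
> Note the filter order: with the range filter first, the pagination filter
> only counts columns that are already inside the requested range.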
> Note that neither case fundamentally uses more disk space than the other;
> you're just "shifting" part of the identifying information for a value
> either to the left (into the row key, in option one) or to the right (into
> the column qualifiers, in option 2). Under the covers, every key/value still
> stores the whole row key and column family name. (If this is a bit
> confusing, take an hour and watch Lars George's excellent video about
> understanding HBase schema design:
> http://www.youtube.com/watch?v=_HLoH_PgrLk).
> A manually paginated version has lots more complexities, as you note, like
> having to keep track of how many things are in each page, re-shuffling if
> new values are inserted, etc. That seems significantly more complex. It
> might have some slight speed advantages (or disadvantages!) at extremely
> high throughput, and the only way to really know that would be to try it
> out. If you don't have time to build it both ways and compare, my advice
> would be to start with the simplest option (one row per user+value). Start
> simple & iterate! :)
> (Let me know if I've misunderstood your situation.)
> Ian
> On Apr 2, 2012, at 3:10 PM, Derek Wollenstein wrote:
> We're looking at how to store a large amount of (per-user) list data in
> HBase, and we were trying to figure out what kind of access pattern made
> the most sense.  One option is to store the majority of the data in a key, so
> we could have something like
> <FixedWidthUserName><FixedWidthValueId1>:"" (no value)
> <FixedWidthUserName><FixedWidthValueId2>:"" (no value)
> <FixedWidthUserName><FixedWidthValueId3>:"" (no value)
> The other option we had was to do this entirely using
> <FixedWidthUserName><FixedWidthPageNum0>:<FixedWidthLength><FixedIdNextPageNum><ValueId1><ValueId2><ValueId3>...
> <FixedWidthUserName><FixedWidthPageNum1>:<FixedWidthLength><FixedIdNextPageNum><ValueId1><ValueId2><ValueId3>...
> where each row would contain multiple values.
> So in one case reading the first thirty values would be
> scan { STARTROW => 'FixedWidthUsername', LIMIT => 30 }
> And in the second case it would be
> get 'FixedWidthUserName\x00\x00\x00\x00'
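> For comparison, the same page-0 read as a sketch in the Java client,
> assuming a hypothetical "userdata" table and a 4-byte page-number suffix:
>
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class PageZeroGet {
>     public static void main(String[] args) throws Exception {
>         HTable table = new HTable(HBaseConfiguration.create(), "userdata");
>         // Row key = fixed-width user name + 4-byte page number 0
>         // (matches the \x00\x00\x00\x00 suffix above).
>         byte[] row = Bytes.add(Bytes.toBytes("user123"), Bytes.toBytes(0));
>         Result page0 = table.get(new Get(row));
>         System.out.println(page0.size() + " cells in page 0");
>         table.close();
>     }
> }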
> The general usage pattern would be to read only the first 30 values of
> these lists, with infrequent access reading deeper into the lists.  Some
> users would have <= 30 total values in these lists, and some users would
> have millions (i.e. a power-law distribution).
> The single-value format seems like it would take up more space in HBase,
> but would offer some improved retrieval / pagination flexibility.  Would
> there be any significant performance advantages to be able to paginate via
> gets vs paginating with scans?
>  My initial understanding was that doing a scan should be faster if our
> paging size is unknown (and caching is set appropriately), but that gets
> should be faster if we'll always need the same page size.  I've ended up
> hearing different people tell me opposite things about performance.  I
> assume the page sizes would be relatively consistent, so for most use cases
> we could guarantee that we only wanted one page of data in the
> fixed-page-length case.  I would also assume that we would have infrequent
> updates, but may have inserts into the middle of these lists (meaning we'd
> need to update all subsequent rows).
> Thanks for help / suggestions / followup questions
> --Derek
