hbase-user mailing list archives

From Kristoffer Sjögren <sto...@gmail.com>
Subject Re: Rowkey design question
Date Tue, 07 Apr 2015 21:22:11 GMT
Sorry, I should have explained my use case a bit more.

Yes, it's a pretty big row and it's "close" to worst case. Normally there
would be fewer qualifiers and the largest qualifiers would be smaller.
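
To put rough numbers on the worst case discussed in this thread (about 100,000 qualifiers of 1-5 KB plus about 5 of 1-5 MB), here is a back-of-the-envelope sketch. It assumes decimal units and ignores HBase's per-cell key overhead; the function name `row_size_bytes` is only for illustration.

```python
# Rough row-size estimate from the figures quoted in this thread.
# Assumption: decimal units (1 KB = 1000 bytes); the thread does not specify.
KB = 1000
MB = 1000 * KB


def row_size_bytes(n_small, small_size, n_large, large_size):
    """Total value payload of one row, ignoring key/cell overhead."""
    return n_small * small_size + n_large * large_size


best = row_size_bytes(100_000, 1 * KB, 5, 1 * MB)
worst = row_size_bytes(100_000, 5 * KB, 5, 5 * MB)

print(f"low end:  {best // MB} MB")   # low end:  105 MB
print(f"high end: {worst // MB} MB")  # high end: 525 MB
```

The small qualifiers dominate: the five large values add only about 25 MB to the roughly 500 MB worst case.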

The reason these rows get big is that they store aggregated data
in an indexed, compressed form. This format allows for extremely fast queries
(on local disk format) over billions of rows (not rows in HBase speak),
when touching smaller areas of the data. If I stored the data as regular
HBase rows, things would get very slow unless I had many, many region servers.
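
The actual value format is not described in this thread. As a toy illustration of why an indexed, compressed layout makes "touching smaller areas of the data" cheap, the sketch below packs independently compressed blocks behind an offset index, so a query decompresses only the block it needs. The layout and the names `pack_blocks` and `read_block` are hypothetical, not the format used here.

```python
import struct
import zlib


def pack_blocks(blocks):
    """Compress each block separately and prepend an offset index.

    Hypothetical layout: [count][offset_0 .. offset_count][compressed blocks]
    """
    compressed = [zlib.compress(b) for b in blocks]
    offsets = [0]
    for c in compressed:
        offsets.append(offsets[-1] + len(c))
    header = struct.pack(f">I{len(offsets)}I", len(blocks), *offsets)
    return header + b"".join(compressed)


def read_block(value, i):
    """Decompress only block i, using the index to find its byte range."""
    (count,) = struct.unpack_from(">I", value, 0)
    if not 0 <= i < count:
        raise IndexError(i)
    start, end = struct.unpack_from(">2I", value, 4 + 4 * i)
    data_start = 4 + 4 * (count + 1)  # skip the count and the offset table
    return zlib.decompress(value[data_start + start:data_start + end])


blocks = [b"a" * 1000, b"b" * 1000, b"c" * 1000]
value = pack_blocks(blocks)
assert read_block(value, 1) == blocks[1]
```

This only shows the access pattern inside one value once its bytes are available. It does not address how much of the row or value HBase reads before a coprocessor sees it.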

The coprocessor is used for doing custom queries on the indexed data inside
the region servers. These queries are not like a regular row scan, but very
specific as to how the data is formatted within each column qualifier.

Yes, this is not possible if HBase loads the whole 500MB each time I want
to perform this custom query on a row. Hence my question :-)

On Tue, Apr 7, 2015 at 11:03 PM, Michael Segel <michael_segel@hotmail.com> wrote:

> Sorry, but your initial problem statement doesn’t seem to parse …
> Are you saying that you have a single row with approximately 100,000 elements
> where each element is roughly 1-5KB in size and in addition there are ~5
> elements which will be between one and five MB in size?
> And you then mention a coprocessor?
> Just looking at the numbers… 100K * 5KB means that each row would end up
> being 500MB in size.
> That’s a pretty fat row.
> I would suggest rethinking your strategy.
> > On Apr 7, 2015, at 11:13 AM, Kristoffer Sjögren <stoffe@gmail.com> wrote:
> >
> > Hi
> >
> > I have a row with around 100,000 qualifiers, mostly small values around
> > 1-5 KB and maybe 5 larger ones around 1-5 MB. A coprocessor does random
> > access of 1-10 qualifiers per row.
> >
> > I would like to understand how HBase loads the data into memory. Will the
> > entire row be loaded or only the qualifiers I ask for (like pointer access
> > into a direct ByteBuffer)?
> >
> > Cheers,
> > -Kristoffer
> The opinions expressed here are mine, while they may reflect a cognitive
> thought, that is purely accidental.
> Use at your own risk.
> Michael Segel
> michael_segel (AT) hotmail.com
