hbase-user mailing list archives

From Software Dev <static.void....@gmail.com>
Subject Re: Help with row and column design
Date Wed, 30 Apr 2014 00:02:09 GMT
> Yes. See total_usa vs. total_female_usa above. Basically you have to pre-store every level of aggregation you care about.

OK, I think this makes sense. It gets a bit hairy when doing, say, a
shitload of gets though, no?
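
For a sense of how the read side scales, here is a minimal Python sketch of the "series of gets" described in the quoted message below. It is not HBase client code: the table is a plain dict, `batch_get` stands in for a single batched multi-get call, and the 4-character MD5 prefix used as the hash is an assumption, since the thread does not say which hash is used. The names `row_key`, `last_n_days`, `batch_get` and `total` are hypothetical.

```python
import hashlib
from datetime import date, timedelta

def row_key(day):
    # Assumption: the "hash" is the first 4 hex chars of MD5(date string).
    salt = hashlib.md5(day.encode()).hexdigest()[:4]
    return (salt + day).encode()

def last_n_days(end, n):
    # ISO dates for the n days ending at `end`, newest first.
    return [(end - timedelta(days=i)).isoformat() for i in range(n)]

def batch_get(table, keys):
    # Stand-in for one batched multi-get; a missing row yields an empty result.
    return [table.get(k, {}) for k in keys]

def total(table, end, n, qualifier):
    # One key per day, one batched call, then sum the pre-aggregated column.
    keys = [row_key(d) for d in last_n_days(end, n)]
    return sum(row.get(qualifier, 0) for row in batch_get(table, keys))

# In-memory stand-in for the table: row key -> {qualifier: counter}.
table = {
    row_key("2014-04-29"): {"c:usa": 10, "c:usa:sex:f": 4},
    row_key("2014-04-28"): {"c:usa": 7, "c:usa:sex:f": 3},
    row_key("2014-04-25"): {"c:usa": 5},
}

total_usa = total(table, date(2014, 4, 29), 5, "c:usa")
total_female_usa = total(table, date(2014, 4, 29), 5, "c:usa:sex:f")
```

The number of gets grows with the number of days in the window, not with the number of qualifiers, because each get returns the whole row and the columns of interest are picked out afterwards.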

On Tue, Apr 29, 2014 at 4:43 PM, Rendon, Carlos (KBB) <CRendon@kbb.com> wrote:
> You don't do a scan, you do a series of gets, which I believe you can batch into one call.
>
> last 5 days query in pseudocode
> res1 = Get( hash("2014-04-29") + "2014-04-29")
> res2 = Get( hash("2014-04-28") + "2014-04-28")
> res3 = Get( hash("2014-04-27") + "2014-04-27")
> res4 = Get( hash("2014-04-26") + "2014-04-26")
> res5 = Get( hash("2014-04-25") + "2014-04-25")
>
> For each result you look for the particular column or columns you are interested in
> Total_usa = res1.get("c:usa") + res2.get("c:usa") + res3.get("c:usa") + ...
> Total_female_usa = res1.get("c:usa:sex:f") + ...
>
> "What happens when we add more fields? Do we just keep adding in more column qualifiers? If so, how would we filter across columns to get an aggregate total?"
>
> Yes. See total_usa vs. total_female_usa above. Basically you have to pre-store every level of aggregation you care about.
>
> -----Original Message-----
> From: Software Dev [mailto:static.void.dev@gmail.com]
> Sent: Tuesday, April 29, 2014 4:36 PM
> To: user@hbase.apache.org
> Subject: Re: Help with row and column design
>
>> The downside is it still has a hotspot when inserting, but when
>> reading a range of time it does not
>
> How can you do a scan query between dates when you hash the date?
>
>> Column qualifiers are just the collection of items you are aggregating
>> on. Values are increments. In your case qualifiers might look like
>> c:usa, c:usa:sex:m, c:usa:sex:f, c:italy:sex:m, c:italy:sex:f,
>> c:italy,
>
> What happens when we add more fields? Do we just keep adding in more column qualifiers? If so, how would we filter across columns to get an aggregate total?
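
The write side of "pre-store every level of aggregation" from the quoted advice above can be sketched in the same way. This is an illustration only: nested dicts stand in for the table, `+= 1` stands in for a counter increment, the row key is the bare date with the hash prefix left out for brevity, and `qualifiers_for` and `record_event` are hypothetical names. The qualifier format follows the c:usa and c:usa:sex:f examples in the thread.

```python
from collections import defaultdict

def qualifiers_for(country, sex):
    # Every aggregation level this event contributes to,
    # using the qualifier scheme from the thread.
    base = "c:" + country
    return [base, base + ":sex:" + sex]

def record_event(table, day, country, sex):
    # Bump one counter per aggregation level in the row for that day.
    row = table[day]
    for qualifier in qualifiers_for(country, sex):
        row[qualifier] += 1

# In-memory stand-in: day -> {qualifier: counter}.
table = defaultdict(lambda: defaultdict(int))
record_event(table, "2014-04-29", "usa", "f")
record_event(table, "2014-04-29", "usa", "m")
record_event(table, "2014-04-29", "italy", "f")

usa_total = table["2014-04-29"]["c:usa"]
usa_female = table["2014-04-29"]["c:usa:sex:f"]
```

Under this scheme, adding a new field means extending `qualifiers_for` with the extra combinations that should be queryable, so each event touches more counters on write in exchange for reads that need no filtering across columns.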
