hbase-user mailing list archives

From Software Dev <static.void....@gmail.com>
Subject Re: Help with row and column design
Date Wed, 30 Apr 2014 02:45:06 GMT
Nothing against your code. I just meant that if we are doing a scan
say for hourly metrics across a 6 month period we are talking about
4K+ gets. Is that something that can easily be handled?
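To put a number on the "4K+ gets" above, here is a minimal sketch that enumerates one row key per hour across roughly six months and splits the keys into batches. The key layout (a two-character MD5 prefix followed by a `YYYY-MM-DD-HH` label), the date range, and the batch size of 500 are all assumptions made for illustration; the thread only says the date is hashed and prepended. No HBase calls are made here.

```python
import hashlib
from datetime import datetime, timedelta


def salted_key(label: str) -> str:
    """Prefix a time label with a short hash of itself (hypothetical scheme)."""
    prefix = hashlib.md5(label.encode("utf-8")).hexdigest()[:2]
    return prefix + label


def hourly_row_keys(start: datetime, end: datetime) -> list:
    """Return one salted key per hour in the half-open range [start, end)."""
    keys = []
    current = start
    while current < end:
        keys.append(salted_key(current.strftime("%Y-%m-%d-%H")))
        current += timedelta(hours=1)
    return keys


def chunk(items: list, size: int) -> list:
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# 2013-11-01 through 2014-04-30 is 181 days of hourly buckets.
keys = hourly_row_keys(datetime(2013, 11, 1), datetime(2014, 5, 1))
batches = chunk(keys, 500)  # batch size is an arbitrary assumption
print(len(keys), len(batches))  # prints "4344 9"
```

Each batch would then go out as one multi-get (the Java client accepts a list of `Get` objects in a single `get` call), so the client makes a handful of round trips rather than thousands. Whether that is fast enough depends on the cluster and is not something this sketch measures.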

On Tue, Apr 29, 2014 at 5:08 PM, Rendon, Carlos (KBB) <CRendon@kbb.com> wrote:
>> Gets a bit hairy when doing, say, a shitload of gets though, no?
>
> If by "hairy" you mean the code is ugly, it was written for maximal clarity.
> I think you'll find a few sensible loops make it fairly clean.
> Otherwise I'm not sure what you mean.
>
> -----Original Message-----
> From: Software Dev [mailto:static.void.dev@gmail.com]
> Sent: Tuesday, April 29, 2014 5:02 PM
> To: user@hbase.apache.org
> Subject: Re: Help with row and column design
>
>> Yes. See total_usa vs. total_female_usa above. Basically you have to
>> pre-store every level of aggregation you care about.
>
> OK, I think this makes sense. Gets a bit hairy when doing, say, a shitload
> of gets though, no?
>
> On Tue, Apr 29, 2014 at 4:43 PM, Rendon, Carlos (KBB) <CRendon@kbb.com> wrote:
>> You don't do a scan, you do a series of gets, which I believe you can
>> batch into one call.
>>
>> Last 5 days query in pseudocode:
>> res1 = Get( hash("2014-04-29") + "2014-04-29")
>> res2 = Get( hash("2014-04-28") + "2014-04-28")
>> res3 = Get( hash("2014-04-27") + "2014-04-27")
>> res4 = Get( hash("2014-04-26") + "2014-04-26")
>> res5 = Get( hash("2014-04-25") + "2014-04-25")
>>
>> For each result you look for the particular column or columns you are
>> interested in:
>> Total_usa = res1.get("c:usa") + res2.get("c:usa") + res3.get("c:usa") + ...
>> Total_female_usa = res1.get("c:usa:sex:f") + ...
>>
>> "What happens when we add more fields? Do we just keep adding in more
>> column qualifiers? If so, how would we filter across columns to get an
>> aggregate total?"
>>
>> Yes. See total_usa vs. total_female_usa above. Basically you have to
>> pre-store every level of aggregation you care about.
>>
>> -----Original Message-----
>> From: Software Dev [mailto:static.void.dev@gmail.com]
>> Sent: Tuesday, April 29, 2014 4:36 PM
>> To: user@hbase.apache.org
>> Subject: Re: Help with row and column design
>>
>>> The downside is it still has a hotspot when inserting, but when
>>> reading a range of time it does not
>>
>> How can you do a scan query between dates when you hash the date?
>>
>>> Column qualifiers are just the collection of items you are
>>> aggregating on. Values are increments. In your case qualifiers might
>>> look like c:usa, c:usa:sex:m, c:usa:sex:f, c:italy:sex:m,
>>> c:italy:sex:f, c:italy,
>>
>> What happens when we add more fields? Do we just keep adding in more
>> column qualifiers? If so, how would we filter across columns to get an
>> aggregate total?
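
The "pre-store every level of aggregation" answer quoted above can be sketched as follows. This is an in-memory stand-in, not HBase client code: the `table` dictionary, the `record` and `total` helpers, and the event fields are hypothetical, and row keys are left unhashed for brevity. Only the qualifier names (`c:usa`, `c:usa:sex:f`, and so on) come from the thread.

```python
from collections import Counter, defaultdict

# row key -> {column qualifier: count}; stands in for one column family
table = defaultdict(Counter)


def qualifiers_for(event: dict) -> list:
    """List every aggregation level this event contributes to."""
    country, sex = event["country"], event["sex"]
    return [f"c:{country}", f"c:{country}:sex:{sex}"]


def record(row_key: str, event: dict) -> None:
    """On write, bump one counter per aggregation level."""
    for qualifier in qualifiers_for(event):
        table[row_key][qualifier] += 1


def total(row_keys: list, qualifier: str) -> int:
    """On read, sum a single pre-stored qualifier across the fetched rows."""
    return sum(table[k][qualifier] for k in row_keys if k in table)


record("2014-04-29", {"country": "usa", "sex": "f"})
record("2014-04-29", {"country": "usa", "sex": "m"})
record("2014-04-28", {"country": "usa", "sex": "f"})
record("2014-04-28", {"country": "italy", "sex": "m"})

days = ["2014-04-29", "2014-04-28", "2014-04-27"]
total_usa = total(days, "c:usa")
total_female_usa = total(days, "c:usa:sex:f")
print(total_usa, total_female_usa)  # prints "3 2"
```

Adding a new field means extending `qualifiers_for` with the extra combinations you want to query later. Reads never filter across columns; they pick the one qualifier that already holds the aggregate and sum it over the rows.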
