hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vijay <>
Subject Re: Help with last 30 day unique user query
Date Fri, 15 Oct 2010 18:19:52 GMT
Thanks Alex! That is exactly what I thought was the limitation but wanted to
make sure I'm not missing anything.

On Fri, Oct 15, 2010 at 10:51 AM, Alex Boisvert <> wrote:

> As far as I know, Hive has no built-in support for sliding-window
> analytics.  There is an enhancement request here:
> <>Without such support, the
> brute force way of doing things is,
> SELECT COUNT(DISTINCT user_id) FROM events WHERE event_date > start_date
> and event_date <= end_date;
> (repeated N times to cover each day of your time window).
> alex
> On Thu, Oct 14, 2010 at 11:36 PM, Vijay <> wrote:
>> Hi, I need help with this scenario. We have a table of events which has
>> columns date, event (not important for this discussion), and user_id. It is
>> obviously easy to find number of unique users for each day. I also need to
>> find number of unique users in the last 30 days for each day. This is also
>> quite simple to do for one day. However, I cannot figure out how to do this
>> for a range of days. Something like this is pretty straightforward in most
>> RDBMS but with HiveQL has I'm finding this hard. I might be missing
>> something simple though. Any help is appreciated. Ideally the query should
>> also be as optimized as possible as this table could be huge.
>> Thanks,
>> Vijay

View raw message