hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Boisvert <>
Subject Re: Help with last 30 day unique user query
Date Fri, 15 Oct 2010 17:51:13 GMT
As far as I know, Hive has no built-in support for sliding-window analytics.
 There is an enhancement request here:

<>Without such support, the
brute force way of doing things is,

SELECT COUNT(DISTINCT user_id) FROM events WHERE event_date > start_date
and event_date <= end_date;

(repeated N times to cover each day of your time window).


On Thu, Oct 14, 2010 at 11:36 PM, Vijay <> wrote:

> Hi, I need help with this scenario. We have a table of events which has
> columns date, event (not important for this discussion), and user_id. It is
> obviously easy to find number of unique users for each day. I also need to
> find number of unique users in the last 30 days for each day. This is also
> quite simple to do for one day. However, I cannot figure out how to do this
> for a range of days. Something like this is pretty straightforward in most
> RDBMS but with HiveQL has I'm finding this hard. I might be missing
> something simple though. Any help is appreciated. Ideally the query should
> also be as optimized as possible as this table could be huge.
> Thanks,
> Vijay

View raw message