hbase-user mailing list archives

From Heng Chen <heng.chen.1...@gmail.com>
Subject Re: hbase rowkey design
Date Mon, 16 May 2016 10:17:19 GMT
At my company, we calculate UV/PV offline in batch jobs and update the results every day.

If you do it online, url + timestamp could be the rowkey.
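
A rough sketch of what I mean, not a real HBase client: it only simulates a sorted table in memory to show why url + timestamp keys make a URL-prefix scan possible. The helper names (make_rowkey, prefix_scan, uv_pv), the NUL separator, and the 8-byte big-endian timestamp are my assumptions. It also assumes the query can be rewritten as a leading URL prefix. In a real table, two events with the same url and timestamp would share a rowkey, so the visit id would have to go into the key or a column qualifier; here it is simply kept next to the key.

```python
import struct
from bisect import bisect_left

def make_rowkey(url, ts):
    # Hypothetical layout: url bytes + NUL separator + 8-byte big-endian timestamp.
    # Big-endian keeps byte order equal to numeric order for non-negative timestamps.
    return url.encode("utf-8") + b"\x00" + struct.pack(">Q", ts)

def prefix_scan(rows, url_prefix):
    # rows is a list of (rowkey, visit_id) sorted by rowkey, like a sorted table.
    prefix = url_prefix.encode("utf-8")
    keys = [key for key, _ in rows]
    i = bisect_left(keys, prefix)  # seek to the first key >= prefix
    while i < len(rows) and rows[i][0].startswith(prefix):
        yield rows[i]
        i += 1

def uv_pv(rows, url_prefix):
    # UV = number of distinct visit ids, PV = number of matching events.
    visitors = set()
    pv = 0
    for _, visit_id in prefix_scan(rows, url_prefix):
        visitors.add(visit_id)
        pv += 1
    return len(visitors), pv

# Sample events from the original question: (url, visitid, requesttime).
EVENTS = [
    ("http://www.aaa.com?a=b&c=d&e=f", 1, 1463387380),
    ("http://www.aaa.com?a=b&c=d&e=fa", 1, 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=fa", 2, 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=fab", 2, 1463387280),
    ("http://www.aaa.com?a=b&c=d&e=f", 1, 1463387380),
]
ROWS = sorted((make_rowkey(url, ts), visit_id) for url, visit_id, ts in EVENTS)

print(uv_pv(ROWS, "http://www.aaa.com?a=b&c=d&e=f"))   # (2, 5)
print(uv_pv(ROWS, "http://www.aaa.com?a=b&c=d&e=fa"))  # (2, 3)
```

Counting distinct visitors this way still reads every matching row, so at 50T per day the offline batch approach or pre-aggregated counters would probably still be needed.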



2016-05-16 18:13 GMT+08:00 齐忠 <centerqi@gmail.com>:

> Yes, like Google Analytics.
>
> 2016-05-16 17:48 GMT+08:00 Heng Chen <heng.chen.1986@gmail.com>:
> > You want to calculate UV/PV online?
> >
> > 2016-05-16 16:46 GMT+08:00 齐忠 <centerqi@gmail.com>:
> >
> > >> I have a very large log (50T per day).
> > >>
> > >> My log events look like this:
> >>
> >> url,visitid,requesttime
> >>
> >> http://www.aaa.com?a=b&c=d&e=f, 1, 1463387380
> >> http://www.aaa.com?a=b&c=d&e=fa, 1, 1463387280
> >> http://www.aaa.com?a=b&c=d&e=fa, 2, 1463387280
> >> http://www.aaa.com?a=b&c=d&e=fab, 2, 1463387280
> >> http://www.aaa.com?a=b&c=d&e=f, 1, 1463387380
> >>
> >>
> > >> When a user enters part of the url, the system should return the
> > >> UV (unique visitors) and PV (page views).
> >>
> > >> For example:
> > >>
> > >> input: e=f*
> > >>
> > >> output: uv=2,pv=5
> > >>
> > >> input: e=fa
> > >>
> > >> output: uv=2,pv=3
> >>
> > >> How should I design the rowkey?
> >>
> >> Thanks.
> >>
>
>
>
> --
> centerqi@gmail.com|齐忠
>
