hbase-user mailing list archives

From Li Pi ...@cloudera.com>
Subject Re: Generic Schema Question
Date Mon, 15 Aug 2011 20:33:36 GMT
You can do a range scan from 192.168.1.2/1313280451 to 192.168.1.2/1313281242.

Call setBatch(100) on the scan.
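
The range scan works because rows are stored sorted by key, so every row for one IP sits in one contiguous block. Below is a minimal sketch of that idea in plain Java. It is an in-memory stand-in built on a TreeMap, not the HBase client, and the class and method names (RowKeyScanSketch, rowKey, scan, scanIp) are made up for illustration. It treats the stop row as exclusive, which is also how an HBase Scan treats its stop row, so the literal range above would leave out the 1313281242 row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.NavigableMap;
import java.util.TreeMap;

/** In-memory stand-in for a table whose rows are kept sorted by row key. */
final class RowKeyScanSketch {
    // Row key -> (column -> value). TreeMap orders String keys lexicographically.
    private final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    /** Builds "ip/timestamp", zero-padded so that key order matches time order. */
    static String rowKey(String ip, long epochSeconds) {
        return ip + "/" + String.format(Locale.ROOT, "%010d", epochSeconds);
    }

    void put(String ip, long epochSeconds, String column, String value) {
        rows.computeIfAbsent(rowKey(ip, epochSeconds), k -> new TreeMap<>()).put(column, value);
    }

    /** Returns up to limit row keys in [startRow, stopRow); the stop row is exclusive. */
    List<String> scan(String startRow, String stopRow, int limit) {
        List<String> out = new ArrayList<>();
        for (String key : rows.subMap(startRow, true, stopRow, false).keySet()) {
            if (out.size() == limit) {
                break;
            }
            out.add(key);
        }
        return out;
    }

    /** Returns up to limit row keys for one IP, oldest first. */
    List<String> scanIp(String ip, int limit) {
        // '0' is the character right after '/', so ip + "0" is the first
        // possible key that no longer starts with ip + "/".
        return scan(ip + "/", ip + "0", limit);
    }
}
```

Scanning with the prefix bounds keeps rows for 192.168.1.20 out of the results for 192.168.1.2, which a plain "starts with the IP" check would not.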

Alternatively, you can use the IP alone as the key and let HBase keep
track of versions. Set the maximum number of versions to Integer.MAX_VALUE
when creating the column family, then do a Get of 192.168.1.2 with
setMaxVersions(int maxVersions)
<http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html#setMaxVersions(int)>
and maxVersions = 100.
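
To make the versions approach concrete, here is a small sketch in plain Java of one row per IP with many timestamped versions per column. It is a toy model, not the HBase client: VersionedColumnSketch, familyMaxVersions, and the put/get signatures are hypothetical names chosen for illustration. It assumes versions come back newest first and that the store keeps at most a configured number of versions per cell.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy model of a store that keeps several timestamped versions per cell. */
final class VersionedColumnSketch {
    // Retention limit, standing in for the column family's max-versions setting.
    private final int familyMaxVersions;
    // "row|column" -> (timestamp -> value), ordered newest first.
    private final Map<String, NavigableMap<Long, String>> cells = new HashMap<>();

    VersionedColumnSketch(int familyMaxVersions) {
        this.familyMaxVersions = familyMaxVersions;
    }

    void put(String row, String column, long timestamp, String value) {
        NavigableMap<Long, String> versions = cells.computeIfAbsent(
                row + "|" + column,
                k -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()));
        versions.put(timestamp, value); // same timestamp overwrites the old value
        while (versions.size() > familyMaxVersions) {
            versions.pollLastEntry(); // last entry in reverse order is the oldest
        }
    }

    /** Returns up to maxVersions values for one cell, newest first. */
    List<String> get(String row, String column, int maxVersions) {
        List<String> out = new ArrayList<>();
        NavigableMap<Long, String> versions = cells.get(row + "|" + column);
        if (versions == null) {
            return out;
        }
        for (String value : versions.values()) {
            if (out.size() == maxVersions) {
                break;
            }
            out.add(value);
        }
        return out;
    }
}
```

One consequence visible in the sketch: two events for the same IP and column with the same timestamp collapse into one version, so this layout only suits data where that cannot happen or does not matter.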


On Sat, Aug 13, 2011 at 5:16 PM, Mark <static.void.dev@gmail.com> wrote:

> Ok so something like this?
>
> row                     cf:qual        value
> ------------------------------------------------------------------------
> 192.168.1.2/1313280451  data:page      "/foo/bar"
> 192.168.1.2/1313280451  data:referrer  "google.com"
> 192.168.1.2/1313280451  data:session   "f306e5af69b48568323fdc3018e40e7e"
> ------------------------------------------------------------------------
> 192.168.1.2/1313281242  data:page      "/foo/baz"
> 192.168.1.2/1313281242  data:referrer  ""
> 192.168.1.2/1313281242  data:session   "f306e5af69b48568323fdc3018e40e7e"
> ....
>
> Will this allow me to query the last 100 rows for IP "192.168.1.2"? If so,
> how? Will it be efficient? Also, would you mind explaining an alternative
> way of accomplishing this? I'm still trying to figure out all the
> possibilities.
>
> Thanks again
>
>
>
> On 8/13/11 4:53 PM, Blake Lemoine wrote:
>
>> You need to have the IP address, followed by a slash, followed by the time
>> as the row key, or some other way of getting multiple rows per IP.
>> Then you could scan for the IP prefix.  Of course, that's just one possible
>> solution.
>> On Aug 13, 2011 1:01 PM, "Mark" <static.void.dev@gmail.com> wrote:
>>
>>> Hi all, I'm trying to wrap my head around HBase schema design and I am
>>> having trouble modeling the following use case:
>>>
>>> We store all our user behavior (clicks, searches, page views) in Hadoop
>>> and we would like to add this into HBase so we can interactively
>>> "explore" what our users are doing. For example, given an IP address,
>>> we would like to get back a list of all searches, page views, clicks,
>>> etc. that this user has attempted.
>>>
>>> My initial thought for something like this would be to create a table
>>> "Logs" with a CF "Data" that has qualifiers of "Search", "Click" and
>>> "View". Each row would use the IP as its key.
>>>
>>> Is this along the right lines, or am I missing something? It sure feels
>>> like I am. Would anyone please explain how I would accomplish what I am
>>> looking for?
>>>
>>> Thanks
>>>
>>
