hbase-user mailing list archives

From "Buttler, David" <buttl...@llnl.gov>
Subject RE: Generic Schema Question
Date Mon, 15 Aug 2011 20:12:51 GMT
If you are interested in the most recent 100 transactions, use
Long.MAX_VALUE - System.currentTimeMillis() as the time part of your key instead of
currentTimeMillis().  That way the newest entries sort first within each IP prefix.  Then you
can set the start row of your scan to "192.168.1.2", and the first result will be the most
recent entry for that IP.  You can then just scan 100 rows and get everything you want.
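
A minimal sketch of that key layout in plain Java, with no HBase dependency. Everything named here is an assumption for illustration: the class ReverseTimestampKeys, the helpers rowKey and scanMostRecent, and the TreeMap, which only stands in for a table whose rows are kept in sorted key order. It also assumes string keys, so the reversed timestamp is zero-padded to a fixed width to make lexicographic order match numeric order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of HBase.
class ReverseTimestampKeys {

    // Build "<ip>/<reversed timestamp>". Long.MAX_VALUE has 19 decimal digits,
    // so padding to 19 keeps string order identical to numeric order.
    static String rowKey(String ip, long epochMillis) {
        long reversed = Long.MAX_VALUE - epochMillis;
        return String.format(Locale.ROOT, "%s/%019d", ip, reversed);
    }

    // Stand-in for a scan: start at the IP prefix, stop at the first key that
    // no longer starts with it, or once `limit` rows have been read.
    static List<String> scanMostRecent(TreeMap<String, String> table, String ip, int limit) {
        String prefix = ip + "/";
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> e : table.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix) || values.size() >= limit) {
                break;
            }
            values.add(e.getValue());
        }
        return values;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put(rowKey("192.168.1.2", 1313280451000L), "/foo/bar");
        table.put(rowKey("192.168.1.2", 1313281242000L), "/foo/baz");
        table.put(rowKey("192.168.1.20", 1313281300000L), "/other");
        // Newest entry for 192.168.1.2 comes back first.
        System.out.println(scanMostRecent(table, "192.168.1.2", 100));
    }
}
```

Including the "/" delimiter in the prefix keeps rows for "192.168.1.20" out of a scan for "192.168.1.2".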

Dave

-----Original Message-----
From: Mark [mailto:static.void.dev@gmail.com] 
Sent: Saturday, August 13, 2011 5:16 PM
To: user@hbase.apache.org
Subject: Re: Generic Schema Question

Ok so something like this?

row                      cf:qual         value
------------------------------------------------------------------
192.168.1.2/1313280451   data:page       "/foo/bar"
192.168.1.2/1313280451   data:referrer   "google.com"
192.168.1.2/1313280451   data:session    "f306e5af69b48568323fdc3018e40e7e"
------------------------------------------------------------------
192.168.1.2/1313281242   data:page       "/foo/baz"
192.168.1.2/1313281242   data:referrer   ""
192.168.1.2/1313281242   data:session    "f306e5af69b48568323fdc3018e40e7e"
....

Will this allow me to query the last 100 rows for IP "192.168.1.2"? If 
so, how? Will it be efficient? Also, would you mind explaining an 
alternative way of accomplishing this? I'm still trying to figure out 
all the possibilities.

Thanks again


On 8/13/11 4:53 PM, Blake Lemoine wrote:
> You need to have the IP address followed by a slash followed by the time as
> the row key, or some other such way of getting multiple rows per IP.
> Then you could scan for the ip prefix.  Of course that's just one possible
> solution.
> On Aug 13, 2011 1:01 PM, "Mark"<static.void.dev@gmail.com>  wrote:
>> Hi all, I'm trying to wrap my head around HBase schema design and I am
>> having trouble modeling the following use case:
>>
>> We store all our user behavior (clicks, searches, page views) in Hadoop
>> and we would like to add this into HBase so we can interactively
>> "explore" what our users are doing. For example, we would like, given an
>> IP address, to get back a list of all searches, page views, clicks, etc. that
>> this user has attempted.
>>
>> My initial thought for something like this would be to create a table
>> "Logs" with a CF "Data" that has qualifiers of "Search", "Click" and
>> "View". Each column would have a row with the IP as its key.
>>
>> Is this along the right lines or am I missing something... sure feels
>> like I am. Would anyone please explain how I would accomplish what I am
>> looking for.
>>
>> Thanks
