hbase-user mailing list archives

From Amandeep Khurana <ama...@gmail.com>
Subject Re: HBase Schema Design for clickstream data
Date Wed, 27 Jun 2012 18:01:23 GMT

What would be your read patterns later on? Are you going to read per
session, or for a time period, or for a set of users, or process through
the entire dataset every time? That would play an important role in
defining your keys and columns.
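
To make that concrete, here is a minimal sketch of how each read pattern
could push the row key in a different direction. It is plain Python that
only builds key bytes and does not use the HBase client API. The function
names, the "|" separator, and the bucket count of 16 are all assumptions
made for illustration, not something taken from your application.

```python
import hashlib
import struct

MAX_TS = 2**63 - 1  # subtracting from this reverses timestamp order


def _ts(ts_millis: int) -> bytes:
    # Big-endian signed 64-bit: for non-negative values, byte order
    # matches numeric order, which is what a sorted key space needs.
    return struct.pack(">q", ts_millis)


def session_key(session_id: str, ts_millis: int) -> bytes:
    """Read per session: one session's events are contiguous, oldest first."""
    return session_id.encode() + b"|" + _ts(ts_millis)


def user_key(visitor_id: str, ts_millis: int, session_id: str) -> bytes:
    """Read per user: one visitor's events are contiguous, newest first."""
    return (visitor_id.encode() + b"|" + _ts(MAX_TS - ts_millis)
            + b"|" + session_id.encode())


def time_key(ts_millis: int, session_id: str, buckets: int = 16) -> bytes:
    """Read per time period: a salt byte spreads writes over `buckets`
    key ranges, so a time-range read needs one scan per bucket."""
    salt = hashlib.md5(session_id.encode()).digest()[0] % buckets
    return bytes([salt]) + _ts(ts_millis) + b"|" + session_id.encode()
```

A random session id as the key prefix spreads writes well but makes
time-range reads expensive, while a time-leading key does the opposite
unless it is salted. That trade-off is why the read pattern matters.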


On Tue, Jun 26, 2012 at 1:34 PM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

> I am starting out with a new application where I need to store users'
> clickstream data. I'll have a visitor ID and a session ID along with other
> page-related data. I am wondering if I should just key off a randomly
> generated session ID and store all the page-related data as columns inside
> that row, assuming that this would also give good distribution across region
> servers. In a session, a user could send hundreds of HTML requests and get
> responses. If someone is already doing this in HBase, I would like to learn
> more about how they have designed the schema.
