hbase-user mailing list archives

From Nick maillard <nicolas.maill...@fifty-five.com>
Subject Re: Hive Hbase 0.94 ClassNotFoundException com.google.protobuf.Message
Date Thu, 25 Oct 2012 16:27:21 GMT
Hi Jean-Daniel

We are trying different types of software to solve our size issue.
We aggregate data on website interactions, and this can easily reach a couple
of million lines, with a couple of dozen interaction types, per day. We would
like to keep a rolling two-year history. These figures are the current
situation, and we expect them to grow over time.

We have queries that must answer very rapidly on this set, for example users
who did interaction A and/or B in the last week, have a profile C, and are
currently on page Y. This should typically be realtime, served by either
separate HBase tables or fine-tuned indexes.

For statistical purposes we have longer queries where, a couple of times a day
or week, we will extract relevant subsets and apply different types of queries
depending on the need. Some are regular and some come from customer inquiries.
These subsets can be extracted and calculated to populate an index or serve as
raw data for open research and analysis.

From what I understand, HBase is not an "open query" system but must be designed
with the need and question in mind. For other purposes I'd like to use HBase as
a long-term data store and extract relevant subsets when needed to populate
other systems that will do the job needed.
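
To make "designed with the question in mind" concrete, here is a rough sketch
of a row key built for the "interaction A and/or B in the last week" question.
It only simulates HBase's sorted row keys with a sorted Python list; the key
layout (interaction type, then zero-padded timestamp, then user id) and the
names row_key and scan_users are assumptions for illustration, not anything
from an actual schema or from the HBase API.

```python
import bisect

def row_key(interaction: str, ts: int, user: str) -> bytes:
    # Hypothetical layout: interaction|timestamp|user.
    # The timestamp is zero-padded so that byte order matches time order.
    return f"{interaction}|{ts:010d}|{user}".encode()

def scan_users(sorted_keys, interaction, start_ts, end_ts):
    # Emulates a range scan [start_ts, end_ts) over lexicographically
    # sorted keys, which is how a store with sorted row keys would serve it.
    lo = bisect.bisect_left(sorted_keys, f"{interaction}|{start_ts:010d}|".encode())
    hi = bisect.bisect_left(sorted_keys, f"{interaction}|{end_ts:010d}|".encode())
    return {k.decode().split("|", 2)[2] for k in sorted_keys[lo:hi]}

# Toy data: (interaction, timestamp, user)
keys = sorted([
    row_key("A", 100, "u1"),
    row_key("A", 200, "u2"),
    row_key("B", 150, "u1"),
    row_key("A", 50, "u3"),
])

did_a = scan_users(keys, "A", 100, 201)
did_b = scan_users(keys, "B", 100, 201)
did_a_and_b = did_a & did_b   # "A and B"
did_a_or_b = did_a | did_b    # "A or B"
```

With this layout each interaction type and time window is one contiguous range,
which is why the key has to be chosen around the question. A different question
(say, everything one user did) would want a different key or a second table.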

Does this seem a rational approach?
