hadoop-common-user mailing list archives

From Benjamin Dageroth <Benjamin.Dager...@webtrekk.com>
Subject Web Analytics Use case?
Date Tue, 03 Nov 2009 13:28:00 GMT

I am currently evaluating whether Hadoop might be an alternative to our current system. We
provide a web analytics solution for very large websites and run every analysis on all
collected data; we do not aggregate the data. As a result, each query processes a very large
amount of data. We currently use an in-memory database by Exasol with a very large amount of
RAM, so that simple queries take no longer than a few seconds and more complicated queries no
longer than a minute to deliver results.

This solution, however, is quite expensive, and given the growth of our data I'd like to
explore alternatives. I have read about NoSQL datastores and about Hadoop, but I am not sure
whether either is actually an option for our web analytics solution. We collect data via a
tracking pixel, which sends data to a tracking server, which writes it to disk once a
visitor's session has ended. Our current solution has a large number of tables, and the
queries run against the data can be quite complex:

How many users who arrived via a given keyword and came from a given city actually bought the
advertised product? Of these users, which other pages did they look at? And so on.
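
To make the example query concrete, here is a minimal sketch of how it could be phrased as
a map step and a reduce step. It is plain Python, not Hadoop code, and the session record
layout (keyword, city, purchased, pages) is a hypothetical assumption for illustration, not
our actual schema.

```python
from collections import Counter

# Hypothetical session records; the field names are assumptions for illustration only.
SESSIONS = [
    {"keyword": "shoes", "city": "Berlin", "purchased": True,
     "pages": ["/home", "/shoes", "/checkout"]},
    {"keyword": "shoes", "city": "Berlin", "purchased": True,
     "pages": ["/home", "/sale", "/checkout"]},
    {"keyword": "shoes", "city": "Berlin", "purchased": False,
     "pages": ["/home"]},
    {"keyword": "shoes", "city": "Hamburg", "purchased": True,
     "pages": ["/home", "/checkout"]},
    {"keyword": "hats", "city": "Berlin", "purchased": True,
     "pages": ["/hats", "/checkout"]},
]


def map_session(session, keyword, city):
    """Map step: for a matching buyer session, emit one buyer marker
    and one (page, 1) pair per distinct page viewed."""
    matches = (
        session["keyword"] == keyword
        and session["city"] == city
        and session["purchased"]
    )
    if matches:
        yield ("__buyers__", 1)
        for page in set(session["pages"]):
            yield (page, 1)


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals


def run_query(sessions, keyword, city):
    """Return (number of buyers, {page: number of buyers who viewed it})."""
    pairs = (pair for s in sessions for pair in map_session(s, keyword, city))
    totals = reduce_counts(pairs)
    buyers = totals.pop("__buyers__", 0)
    return buyers, dict(totals)


if __name__ == "__main__":
    buyers, pages = run_query(SESSIONS, "shoes", "Berlin")
    print(buyers, pages)
```

Every query of this kind scans all session records, which is the part that would have to
stay fast at our data volumes.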

Would this be a good use case for HBase, Hadoop, MapReduce and perhaps Mahout?

Thanks for any thoughts,

Benjamin Dageroth, Business Development Manager
Webtrekk GmbH
Boxhagener Str. 76-78, 10245 Berlin
fon 030 - 755 415 - 360
fax 030 - 755 415 - 100
Amtsgericht Berlin, HRB 93435 B
Geschäftsführer Christian Sauer

