hadoop-hdfs-user mailing list archives

From Fengyun RAO <raofeng...@gmail.com>
Subject any suggestions on IIS log storage and analysis?
Date Mon, 30 Dec 2013 07:58:57 GMT
Hi,

HDFS splits files into blocks, and MapReduce runs a map task for each
block. However, the set of fields in an IIS log file can change partway
through the file, so the lines in one block may depend on a field
definition in another block, which makes the raw files unsuitable for a
MapReduce job. It seems some preprocessing is needed before storing and
analyzing the IIS log files. We plan to parse every line into the same
set of fields and store the result in compressed Avro files. Are there
other alternatives, such as HBase? Or any suggestions on analyzing IIS
log files?
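
To make the preprocessing step concrete, here is a minimal sketch in Python. It assumes the logs use the W3C extended format, where a "#Fields:" directive line names the columns of the data lines that follow it. TARGET_FIELDS and normalize() are hypothetical names chosen for illustration, and the field list is only an example, not our actual schema.

```python
# Sketch only: normalize W3C extended (IIS) log lines to one fixed schema.
# TARGET_FIELDS and normalize() are hypothetical names, not an existing API.

TARGET_FIELDS = ["date", "time", "cs-method", "cs-uri-stem", "sc-status"]


def normalize(lines, target_fields=TARGET_FIELDS):
    """Yield one dict per log entry, keyed by target_fields.

    The most recent '#Fields:' directive decides how each data line is
    read, so a file has to be scanned sequentially from its start.
    """
    current = None  # column names from the latest '#Fields:' directive
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            if line.startswith("#Fields:"):
                current = line[len("#Fields:"):].split()
            continue  # other directives (#Software, #Date, ...) are skipped
        if current is None:
            continue  # no field definition seen yet, line cannot be read
        values = line.split()
        if len(values) != len(current):
            continue  # malformed line, dropped in this sketch
        row = dict(zip(current, values))
        # Fields missing from the current definition become None.
        yield {name: row.get(name) for name in target_fields}


if __name__ == "__main__":
    sample = [
        "#Fields: date time cs-method cs-uri-stem sc-status",
        "2013-12-30 07:58:57 GET /index.html 200",
        "#Fields: date time sc-status cs-uri-stem",
        "2013-12-30 07:59:01 404 /missing",
    ]
    for record in normalize(sample):
        print(record)
```

Each yielded record has the same keys regardless of which "#Fields:" definition was in effect, so the output no longer depends on neighbouring lines and could be written to Avro and split across map tasks freely.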

thanks!
