hadoop-mapreduce-user mailing list archives

From George Kousiouris <gkous...@mail.ntua.gr>
Subject access patterns investigation to dynamically toggle the replication factor in a hadoop cluster
Date Wed, 05 Sep 2012 16:11:28 GMT

Hi all,

As part of the research for an ongoing project, we are interested in
investigating the ability to predict data access patterns on a Hadoop
cluster. The purpose is to study file access patterns as a time
series, so that data can be manipulated proactively. For example, this
may involve increasing or decreasing the replication factor of files
in an Apache Hadoop cluster (and accordingly in HDFS) ahead of a
predicted rise or fall in data accesses.
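
To make the idea concrete, here is a minimal sketch of the loop we have in mind. Everything in it is an assumption for illustration: the function names are hypothetical, the moving average merely stands in for a real time-series model, and the per-replica capacity and replication bounds are invented numbers. The only Hadoop-specific part is the `hadoop fs -setrep` shell command, which the sketch builds but does not execute.

```python
import math
from typing import List, Sequence


def forecast_next(accesses: Sequence[float], window: int = 3) -> float:
    """Naive forecast: mean of the last `window` observed access counts.

    Placeholder for a proper time-series predictor (assumption).
    """
    if not accesses:
        return 0.0
    recent = accesses[-window:]
    return sum(recent) / len(recent)


def choose_replication(predicted: float,
                       per_replica: float = 100.0,
                       min_rep: int = 3,
                       max_rep: int = 10) -> int:
    """Map a predicted access count to a replication factor.

    `per_replica` is an assumed number of accesses one replica can
    serve per interval; the result is clamped to [min_rep, max_rep].
    """
    needed = math.ceil(predicted / per_replica)
    return max(min_rep, min(max_rep, needed))


def setrep_command(path: str, replication: int) -> List[str]:
    """Build (but do not run) the shell command that sets replication."""
    return ["hadoop", "fs", "-setrep", str(replication), path]


if __name__ == "__main__":
    history = [100, 200, 300, 800]          # accesses per interval (made up)
    predicted = forecast_next(history)
    rep = choose_replication(predicted)
    print(" ".join(setrep_command("/data/logs", rep)))
```

The command list could be passed to `subprocess.run` on a node with a Hadoop client installed. Whether the resulting replication change pays off for an MR job is exactly the open question below.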

So we would like your advice on some issues:
1) is this the correct mailing list? :)
2) would a changed replication factor translate into better performance
of an MR job (either from experience you may have, or from a
report/paper etc. that has studied this)?
3) do you find this interesting in general and something we should pursue?
4) are you aware of any related work on the topic we could use as a 
starting point?

Thanks for your help,
