hadoop-common-user mailing list archives

From Ahmad Ali Iqbal <ahmad.ali.iq...@gmail.com>
Subject More access to nodes in a distributed cache
Date Wed, 16 Dec 2009 23:43:34 GMT
Hi All,

I would like to know whether Hadoop can be used for applications that need
more control over the data, where the application specifies which node
handles which part of the processing or storage. For instance, suppose I have
two data files (datasets 1 and 2) and set up Hadoop with two datanodes (A and
B) in a distributed cache. Can I specify that dataset 1 should load on node A
and dataset 2 on node B? Also, if I have two tasks (a and b), is it possible
to run task a on node A and task b on node B?

In other words, I want to supply Hadoop with a pattern-of-operation file
(specifying the storage and processing tasks) for it to carry out. Is this
possible? If so, I would appreciate a link that discusses this, or sample
code or an application that does it.

Thanks a lot,

