hadoop-common-user mailing list archives

From Sugandha Naolekar <sugandha....@gmail.com>
Subject Re: :!
Date Mon, 03 Aug 2009 05:32:38 GMT
Yes, you are right. Here are the details:

-> I have a Hadoop cluster of 7 nodes. There is also an 8th machine, which
is not part of the Hadoop cluster.
-> I want to place the data from that machine into HDFS. Before placing it
in HDFS, I want to compress it.
-> I have 4 datanodes in my cluster. Also, the data might grow to
terabytes.
-> I have set the replication factor to 2.
-> I guess I will have to run MapReduce for the compression, right? Please
tell me the complete approach that needs to be followed.
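
For reference, compressing a file before upload does not in itself require
MapReduce; it can be done as a plain stream copy on the 8th machine. Below is
a minimal sketch using only the JDK's java.util.zip classes. The class name
GzipRoundTrip and its method names are hypothetical, chosen for illustration.
It writes to the local filesystem; to write straight into HDFS, the output
stream could presumably be swapped for one returned by Hadoop's
FileSystem.create(Path), which is not shown here. Gzip is an assumption too,
and note that a gzip file cannot be split across map tasks if it is later
processed with MapReduce.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip a local file and restore it again.
public class GzipRoundTrip {

    // Copy everything from in to out through a fixed-size buffer,
    // so files of any size can be handled without loading them into memory.
    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[64 * 1024];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    // Compress src into dst using gzip.
    public static void compress(Path src, Path dst) throws IOException {
        try (InputStream in = Files.newInputStream(src);
             OutputStream out = new GZIPOutputStream(Files.newOutputStream(dst))) {
            copy(in, out);
        }
    }

    // Decompress a gzip file src into dst.
    public static void decompress(Path src, Path dst) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(src));
             OutputStream out = Files.newOutputStream(dst)) {
            copy(in, out);
        }
    }

    // Usage: java GzipRoundTrip <input> <compressed-output> <restored-output>
    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: GzipRoundTrip <input> <compressed> <restored>");
            return;
        }
        compress(Paths.get(args[0]), Paths.get(args[1]));
        decompress(Paths.get(args[1]), Paths.get(args[2]));
    }
}
```

The compressed file could then be copied into HDFS with the usual
"hadoop fs -put" command, and decompressed the same way after it is fetched
back.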

On Mon, Aug 3, 2009 at 10:48 AM, prashant ullegaddi <
prashullegaddi@gmail.com> wrote:

> By "I want to compress the data first and then place it in HDFS", do you
> mean you want to compress the data
> locally and then copy to DFS?
>
> What's the size of your data? What's the capacity of HDFS?
>
> On Mon, Aug 3, 2009 at 10:45 AM, Sugandha Naolekar
> <sugandha.n87@gmail.com> wrote:
>
> > I want to compress the data first and then place it in HDFS. Then, when
> > retrieving it, I want to uncompress it and place it at the desired
> > destination. Is this possible? How do I get started? Also, I want to
> > get started with the actual coding part of compression and MapReduce.
> > Please advise.
> >
> >
> >
> > --
> > Regards!
> > Sugandha
> >
>



-- 
Regards!
Sugandha
