hadoop-common-user mailing list archives

From Sugandha Naolekar <sugandha....@gmail.com>
Subject Re: :!
Date Mon, 03 Aug 2009 07:09:05 GMT
This is ridiculous. What do you mean by unsubscribe? I have a few queries,
and that's why I have logged in to the corresponding forum.

On Mon, Aug 3, 2009 at 12:33 PM, A BlueCoder <bluecoder008@gmail.com> wrote:

> unsubscribe
>
> On Mon, Aug 3, 2009 at 12:01 AM, Sugandha Naolekar <sugandha.n87@gmail.com> wrote:
>
> > That's fine. But if I place the data in HDFS and then run MapReduce code
> > to compress it, the data will get compressed into sequence files, yet the
> > original data will also still reside in storage, thereby causing a kind
> > of redundancy of data...
> >
> > Can you please suggest a way out?
> >
> > On Mon, Aug 3, 2009 at 12:07 PM, prashant ullegaddi <prashullegaddi@gmail.com> wrote:
> >
> > > I don't think you will be able to compress the data unless it's on HDFS.
> > > What you can do is:
> > > 1. Manually compress the data on the machine where it resides, then
> > >    copy it to HDFS; or
> > > 2. Copy the data to HDFS without compressing it, then run a job that
> > >    just emits the data as it reads it, as key/value pairs. You can set
> > >    FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class) so
> > >    that the output gets gzipped (see the sketch below).
> > >
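> > > A minimal, untested sketch of such a pass-through job, written against
> > > the old org.apache.hadoop.mapred API (current as of 2009) and assuming
> > > plain text input; the class and job names are placeholders:
> > >
> > > // IdentityCompress.java -- reads text from HDFS and writes the same
> > > // lines back out gzip-compressed. Sketch only; adjust input/output
> > > // formats for your actual data.
> > > import java.io.IOException;
> > >
> > > import org.apache.hadoop.fs.Path;
> > > import org.apache.hadoop.io.LongWritable;
> > > import org.apache.hadoop.io.NullWritable;
> > > import org.apache.hadoop.io.Text;
> > > import org.apache.hadoop.io.compress.GzipCodec;
> > > import org.apache.hadoop.mapred.FileInputFormat;
> > > import org.apache.hadoop.mapred.FileOutputFormat;
> > > import org.apache.hadoop.mapred.JobClient;
> > > import org.apache.hadoop.mapred.JobConf;
> > > import org.apache.hadoop.mapred.MapReduceBase;
> > > import org.apache.hadoop.mapred.Mapper;
> > > import org.apache.hadoop.mapred.OutputCollector;
> > > import org.apache.hadoop.mapred.Reporter;
> > >
> > > public class IdentityCompress {
> > >
> > >   // Emits each input line unchanged; the NullWritable key keeps
> > >   // TextOutputFormat from prefixing byte offsets to each line.
> > >   public static class PassThroughMapper extends MapReduceBase
> > >       implements Mapper<LongWritable, Text, NullWritable, Text> {
> > >     public void map(LongWritable offset, Text line,
> > >                     OutputCollector<NullWritable, Text> out,
> > >                     Reporter reporter) throws IOException {
> > >       out.collect(NullWritable.get(), line);
> > >     }
> > >   }
> > >
> > >   public static void main(String[] args) throws Exception {
> > >     JobConf job = new JobConf(IdentityCompress.class);
> > >     job.setJobName("identity-compress");
> > >     job.setMapperClass(PassThroughMapper.class);
> > >     job.setNumReduceTasks(0);  // map-only: no shuffle needed
> > >     job.setOutputKeyClass(NullWritable.class);
> > >     job.setOutputValueClass(Text.class);
> > >
> > >     FileInputFormat.setInputPaths(job, new Path(args[0]));
> > >     FileOutputFormat.setOutputPath(job, new Path(args[1]));
> > >
> > >     // The part that matters here: gzip the job output.
> > >     FileOutputFormat.setCompressOutput(job, true);
> > >     FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
> > >
> > >     JobClient.runJob(job);
> > >   }
> > > }
> > >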
> > > Does that solve your problem?
> > >
> > > By the way, you didn't specify your data size exactly (how many TBs).
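> > >
> > > For the retrieval side asked about earlier (uncompressing while copying
> > > back out of HDFS), here is a similar untested sketch using the FileSystem
> > > and CompressionCodecFactory APIs; paths are placeholders:
> > >
> > > // DecompressFromHdfs.java -- copies a gzipped HDFS file to the local
> > > // filesystem, decompressing it on the way.
> > > import java.io.IOException;
> > > import java.io.InputStream;
> > > import java.io.OutputStream;
> > >
> > > import org.apache.hadoop.conf.Configuration;
> > > import org.apache.hadoop.fs.FileSystem;
> > > import org.apache.hadoop.fs.Path;
> > > import org.apache.hadoop.io.IOUtils;
> > > import org.apache.hadoop.io.compress.CompressionCodec;
> > > import org.apache.hadoop.io.compress.CompressionCodecFactory;
> > >
> > > public class DecompressFromHdfs {
> > >   public static void main(String[] args) throws IOException {
> > >     Configuration conf = new Configuration();
> > >     FileSystem hdfs = FileSystem.get(conf);
> > >     Path in = new Path(args[0]);   // e.g. an HDFS part-00000.gz file
> > >     Path out = new Path(args[1]);  // local destination path
> > >
> > >     // Pick the codec from the file extension (.gz -> GzipCodec).
> > >     CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(in);
> > >     InputStream is = codec.createInputStream(hdfs.open(in));
> > >     OutputStream os = FileSystem.getLocal(conf).create(out);
> > >     IOUtils.copyBytes(is, os, conf, true);  // true: close both streams
> > >   }
> > > }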
> > >
> > > On Mon, Aug 3, 2009 at 11:02 AM, Sugandha Naolekar <sugandha.n87@gmail.com> wrote:
> > >
> > > > Yes, you are right. Here are the related details:
> > > >
> > > > -> I have a Hadoop cluster of 7 nodes. There is an 8th machine, which
> > > >    is not a part of the Hadoop cluster.
> > > > -> I want to place the data of that machine into HDFS. Thus, before
> > > >    placing it in HDFS, I want to compress it, and then dump it in HDFS.
> > > > -> I have 4 datanodes in my cluster. Also, the data might grow to
> > > >    terabytes.
> > > > -> Also, I have set the replication factor to 2.
> > > > -> I guess, for compression, I will have to run MapReduce...? Right?
> > > >    Please tell me the complete approach that needs to be followed.
> > > >
> > > > On Mon, Aug 3, 2009 at 10:48 AM, prashant ullegaddi <prashullegaddi@gmail.com> wrote:
> > > >
> > > > > By "I want to compress the data first and then place it in HDFS",
> do
> > > you
> > > > > mean you want to compress the data
> > > > > locally and then copy to DFS?
> > > > >
> > > > > What's the size of your data? What's the capacity of HDFS?
> > > > >
> > > > > On Mon, Aug 3, 2009 at 10:45 AM, Sugandha Naolekar <sugandha.n87@gmail.com> wrote:
> > > > >
> > > > > > I want to compress the data first and then place it in HDFS. Again,
> > > > > > while retrieving it, I want to uncompress it and place it at the
> > > > > > desired destination. Is this possible? How do I get started? Also, I
> > > > > > want to get started with the actual coding part of compression and
> > > > > > MapReduce. Please advise!
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Regards!
> > > > > > Sugandha
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards!
> > > > Sugandha
> > > >
> > >
> >
> >
> >
> > --
> > Regards!
> > Sugandha
> >
>



-- 
Regards!
Sugandha
