hadoop-common-user mailing list archives

From He Chen <air...@gmail.com>
Subject Re: HELP: I wanna store the output value into a list not write to the disk
Date Thu, 02 Apr 2009 18:59:16 GMT
It seems like the InMemoryFileSystem class has been deprecated in Hadoop
0.19.1. Why?

I want to reuse the reduce output as the input of the next map. Cascading
does not work for me, because each step depends on the data produced by the
previous one, so I run the MapReduce job for each timestep synchronously.
If InMemoryFileSystem is deprecated, how can I reduce the I/O of each
timestep's MapReduce job?
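
A minimal sketch of one workaround for the question quoted below (collecting
the word counts into a hashtable), assuming the job keeps the default
TextOutputFormat, which writes one "key<TAB>value" line per record. It does
not avoid the disk write; it only shows the read-back step in the driver.
The class name ReduceOutputLoader and the method loadCounts are hypothetical
names for illustration, not part of Hadoop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not a Hadoop class.
public class ReduceOutputLoader {

    // Parses "key<TAB>value" lines into a map, summing values of repeated keys.
    public static Map<String, Long> loadCounts(Reader source) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue; // skip blank or malformed lines
                }
                String key = line.substring(0, tab);
                long value = Long.parseLong(line.substring(tab + 1).trim());
                counts.merge(key, value, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the contents of one reduce output (part) file.
        String sample = "hadoop\t3\nreduce\t5\n";
        Map<String, Long> counts = loadCounts(new StringReader(sample));
        System.out.println(counts.get("hadoop") + " " + counts.get("reduce"));
    }
}
```

In a real driver the Reader would presumably wrap the stream returned by
FileSystem.open() for each part file in the job's output directory; that
wiring is left out here so the sketch runs without a cluster.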

2009/4/2 Farhan Husain <russoue@gmail.com>

> Is there a way to implement some OutputCollector that can do what Andy
> wants to do?
>
> On Thu, Apr 2, 2009 at 10:21 AM, Rasit OZDAS <rasitozdas@gmail.com> wrote:
>
> > Andy, I didn't try this feature. But I know that Yahoo had a
> > performance record with this file format.
> > I came across a file system included in hadoop code (probably that
> > one) when searching the source code.
> > Luckily I found it: org.apache.hadoop.fs.InMemoryFileSystem
> > But if you have a lot of big files, this approach won't be suitable I
> > think.
> >
> > Maybe someone can give further info.
> >
> > 2009/4/2 andy2005cst <andy2005cst@gmail.com>:
> > >
> > > Thanks for your reply. Let me explain more clearly: since MapReduce is
> > > just one step of my program, I need to use the reduce output for further
> > > computation. I do not want to write the output to disk; I want to get
> > > the collection or list of the output in RAM. If it is written directly
> > > to disk, I have to read it back into RAM again.
> > > You mentioned a special file format. Could you please show me what it
> > > is, and give an example if possible?
> > >
> > > thank you so much.
> > >
> > >
> > > Rasit OZDAS wrote:
> > >>
> > >> Hi, Hadoop is normally designed to write to disk. There is a special
> > >> file format which writes output to RAM instead of disk,
> > >> but I don't know whether it's what you're looking for.
> > >> For what you describe, there would need to be a mechanism that sends
> > >> output as objects rather than file content across computers; as far as
> > >> I know there is no such feature yet.
> > >>
> > >> Good luck.
> > >>
> > >> 2009/4/2 andy2005cst <andy2005cst@gmail.com>
> > >>
> > >>>
> > >>> I need to use the output of the reduce, but I don't know how to do it.
> > >>> Using the wordcount program as an example: if I want to collect the
> > >>> word counts into a hashtable for further use, how can I do that?
> > >>> The example only shows how to write the result to disk.
> > >>> My email is: andy2005cst@gmail.com
> > >>> Looking forward to your help. Thanks a lot.
> > >>> --
> > >>> View this message in context:
> > >>> http://www.nabble.com/HELP%3A-I-wanna-store-the-output-value-into-a-list-not-write-to-the-disk-tp22844277p22844277.html
> > >>> Sent from the Hadoop core-user mailing list archive at Nabble.com.
> > >>>
> > >>>
> > >>
> > >>
> > >> --
> > >> M. Raşit ÖZDAŞ
> > >>
> > >>
> > >
> > > --
> > > > View this message in context:
> > > > http://www.nabble.com/HELP%3A-I-wanna-store-the-output-value-into-a-list-not-write-to-the-disk-tp22844277p22848070.html
> > > Sent from the Hadoop core-user mailing list archive at Nabble.com.
> > >
> > >
> >
> >
> >
> > --
> > M. Raşit ÖZDAŞ
> >
>
>
>
> --
> Mohammad Farhan Husain
> Research Assistant
> Department of Computer Science
> Erik Jonsson School of Engineering and Computer Science
> University of Texas at Dallas
>



-- 
Chen He
RCF CSE Dept.
University of Nebraska-Lincoln
US
