hadoop-common-user mailing list archives

From "Zhou, Yunqing" <azure...@gmail.com>
Subject Re: A question about Mapper
Date Sat, 04 Oct 2008 08:20:42 GMT
Thanks a lot for such a detailed explanation. But I think the reducer is
unnecessary here, so I set the number of reducers to 0.
I'd like to do all of the work in the mappers,
which is how I ran into this problem.
Thanks anyway.
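
For reference, here is a minimal sketch of the map-only approach discussed in this thread: buffer lines until a "flag" appears, and emit the final group from close() through a collector reference saved during map(). It is written as plain Java so it runs without Hadoop. The `Collector` interface is a hypothetical stand-in for org.apache.hadoop.mapred.OutputCollector, and `FlagGroupingMapper`, `FlagGroupingDemo`, and the "groupN" output keys are illustrative names, not part of any Hadoop API. Whether a saved collector is still usable inside close() in a given Hadoop version is an assumption this sketch does not verify.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Hadoop's OutputCollector, so the sketch runs without Hadoop.
interface Collector {
    void collect(String key, String value);
}

// Groups the lines between "flag" markers and emits each group as one record.
class FlagGroupingMapper {
    private final List<String> buffer = new ArrayList<>();
    private Collector saved;     // reference kept so close() can still emit
    private int groupIndex = 0;  // illustrative counter used to build output keys

    // Called once per input line, in the role of map() in a Hadoop Mapper.
    void map(String line, Collector output) {
        saved = output;
        if (line.equals("flag")) {
            flush();
        } else {
            buffer.add(line);
        }
    }

    // Called once after the last input line, in the role of MapReduceBase.close().
    void close() {
        flush();
    }

    // Emits the buffered group, if there is one, and starts a new group.
    private void flush() {
        if (saved != null && !buffer.isEmpty()) {
            saved.collect("group" + groupIndex, String.join(",", buffer));
            groupIndex++;
        }
        buffer.clear();
    }
}

class FlagGroupingDemo {
    // Feeds the sample input from this thread to the mapper and returns what it emitted.
    static List<String> run() {
        List<String> out = new ArrayList<>();
        Collector collector = (key, value) -> out.add(key + "=" + value);
        FlagGroupingMapper mapper = new FlagGroupingMapper();
        String[] input = {"flag", "a", "b", "flag", "c", "d", "e", "flag", "f"};
        for (String line : input) {
            mapper.map(line, collector);
        }
        mapper.close();  // without this call the last group, {f}, would be lost
        return out;
    }
}
```

Calling FlagGroupingDemo.run() yields one record per set: {a,b}, {c,d,e}, and {f}. The last record only appears because close() flushes the buffer.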

On Sat, Oct 4, 2008 at 3:33 PM, Joman Chu <jomanc@andrew.cmu.edu> wrote:

> Hello,
>
> I assume you want to associate {a,b}, {c,d,e}, and {f} into sets.
>
> One way to do this is by associating some value with each flag and then
> emitting the data associated with that value. For example,
>
> flag
> a
> b
> flag
> c
> d
> e
> flag
> f
>
> I define flag,a,b,c,d,e,f to be the key while in the Mapper context.
>
> Whenever the mapper sees a key, it will emit <UID, Key>. UID is some unique
> identifier associated with a certain set, and Key is the key that was passed
> into the mapper. We are essentially inverting the association here.
>
> Let's step through this testcase.
>  1. Choose UID = mapper1flag1.
>  2. <flag, null> -> Mapper -> <mapper1flag1, flag>
>  3. We have reached a flag, so we change the UID = mapper1flag2.
>  4. <a, null> -> Mapper -> <mapper1flag2, a>
>  5. <b, null> -> Mapper -> <mapper1flag2, b>
>  6. <flag, null> -> Mapper -> <mapper1flag2, flag>
>  7. We have reached a flag, so we change the UID = mapper1flag3.
>  8. <c, null> -> Mapper -> <mapper1flag3, c>
>  9. <d, null> -> Mapper -> <mapper1flag3, d>
> 10. <e, null> -> Mapper -> <mapper1flag3, e>
> 11. <flag, null> -> Mapper -> <mapper1flag3, flag>
> 12. We have reached a flag, so we change the UID = mapper1flag4.
> 13. <f, null> -> Mapper -> <mapper1flag4, f>
> 14. EOF
>
> Then the reducers will collect all values with the same UID, so here is
> what we get:
>
> 1. <mapper1flag1, {flag}> -> Reducer -> <{}, null>
> 2. <mapper1flag2, {a,b,flag}> -> Reducer -> <{a,b}, null>
> 3. <mapper1flag3, {c,d,e,flag}> -> Reducer -> <{c,d,e}, null>
> 4. <mapper1flag4, {f}> -> Reducer -> <{f}, null>
>
> Hopefully this solves your problem.
>
> On Sat, October 4, 2008 2:48 am, Zhou, Yunqing said:
> > But the close() function doesn't give me a collector to emit pairs with.
> >
> > Is it reasonable for me to store a reference to the collector in advance?
> > I'm not sure whether the collector is still available at that point.
> >
> > On Sat, Oct 4, 2008 at 12:17 PM, Joman Chu <jomanc@andrew.cmu.edu> wrote:
> >
> >> Hello,
> >>
> >> Does MapReduceBase.close() fit your needs? Take a look at
> >> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/MapReduceBase.html#close()
> >>
> >> On Fri, October 3, 2008 11:36 pm, Zhou, Yunqing said:
> >>> The input is as follows: flag a b flag c d e flag f
> >>>
> >>> I used a mapper that first stores values and then emits them all when
> >>> it meets a line containing "flag". But when the file reaches its end, I
> >>> have no chance to emit the last record (in this case, f). So how can I
> >>> detect the end of the mapper's life, or how can I emit a last record
> >>> before a mapper exits?
> >>> Thanks
> >>>
> >>
> >> Have a good one,
> >> --
> >> Joman Chu
> >> Carnegie Mellon University
> >> School of Computer Science 2011
> >> AIM: ARcanUSNUMquam
> >>
> >>
> >
>
>
> --
> Joman Chu
> Carnegie Mellon University
> School of Computer Science 2011
> AIM: ARcanUSNUMquam
>
>
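
The UID walkthrough quoted above can be sketched in plain Java as a small simulation: a map phase that tags every key with the current UID, and a reduce phase that groups by UID and drops the "flag" markers. This is only an illustration under stated assumptions. `UidTaggingSketch`, `mapPhase`, and `reducePhase` are hypothetical names, the shuffle is simulated with an in-memory map, and nothing here uses the Hadoop API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class UidTaggingSketch {
    // Map phase: emit <UID, key> for every key; seeing a flag advances the UID
    // so that later keys belong to the next set.
    static List<String[]> mapPhase(String mapperName, List<String> keys) {
        List<String[]> pairs = new ArrayList<>();
        int flagCount = 1;
        for (String key : keys) {
            pairs.add(new String[] {mapperName + "flag" + flagCount, key});
            if (key.equals("flag")) {
                flagCount++;
            }
        }
        return pairs;
    }

    // Simulated shuffle and reduce: collect values per UID, discarding the flags.
    static Map<String, List<String>> reducePhase(List<String[]> pairs) {
        Map<String, List<String>> sets = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            List<String> set = sets.computeIfAbsent(pair[0], k -> new ArrayList<>());
            if (!pair[1].equals("flag")) {
                set.add(pair[1]);
            }
        }
        return sets;
    }
}
```

Running reducePhase(mapPhase("mapper1", ...)) on the sample input flag, a, b, flag, c, d, e, flag, f gives an empty set for mapper1flag1, then {a,b}, {c,d,e}, and {f}, matching the four reducer results listed in the walkthrough.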
