hadoop-mapreduce-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: One mapper/reducer runs on a single JVM
Date Tue, 06 Nov 2012 16:27:45 GMT
If you exceed the amount of physical memory available, memory pages are written out to disk
in a swap area. Moving pages between physical memory and disk like this is known as
'swapping'.

HBase is highly sensitive to the latency introduced by swapping pages between physical memory
and disk. You need to avoid swap when running HBase: a long pause caused by swapping can get
a region server declared dead, and ultimately you can end up with a cascading failure that
takes HBase down.
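On Linux the knob for this is vm.swappiness. A rough sketch for a node hosting an HBase
region server (the exact value is a judgment call; some people prefer a small nonzero value
over 0 on older kernels):

```shell
# Check how aggressively this node's kernel swaps (0 = avoid swap, 100 = swap freely):
#   cat /proc/sys/vm/swappiness
#
# For an HBase node, drop it to (or near) 0 so JVM heap pages are never
# paged out. As root:
#   sysctl vm.swappiness=0                         # takes effect immediately
#   echo 'vm.swappiness = 0' >> /etc/sysctl.conf   # persists across reboots
```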



On Nov 5, 2012, at 11:06 PM, Lin Ma <linlma@gmail.com> wrote:

> Thanks Michael,
> "If you are running just Hadoop, you could have a little swap. Running HBase, fuggit
about it." -- could you give a bit more information about what you mean by swap, and why it
must be avoided entirely for HBase?
> regards,
> Lin
> On Tue, Nov 6, 2012 at 12:46 PM, Michael Segel <michael_segel@hotmail.com> wrote:
> Mappers and Reducers are separate JVM processes.
> And yes, you need to take into account the amount of memory on the machine(s) when you
configure the number of slots.
> If you are running just Hadoop, you could have a little swap. Running HBase, fuggit about it.
> On Nov 5, 2012, at 7:12 PM, Lin Ma <linlma@gmail.com> wrote:
> > Hello Hadoop experts,
> >
> > I have had a question in my mind for a long time. Suppose I am developing an M-R program,
and it is Java based (a Java UDF implementing the mapper or reducer interface). My question
is: in this scenario, is a mapper or a reducer a separate JVM process? E.g. if there are 4
mappers on a machine, are they 4 individual processes? I am also wondering whether the
processes on a single machine will impact each other when each JVM wants more memory to run
faster.
> >
> > thanks in advance,
> > Lin
> >
> >
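PS -- to make the slot/memory relationship concrete, here is a hypothetical MRv1-era sizing
sketch in mapred-site.xml (the node size and all numbers below are made up for illustration):
8 map slots + 4 reduce slots, each child JVM capped at a 2 GB heap, means up to 12 * 2 GB =
24 GB for tasks, and the rest of the node's RAM must cover the DataNode, TaskTracker, OS page
cache, and (if co-located) the HBase region server.

```xml
<!-- mapred-site.xml: hypothetical sizing for one TaskTracker node -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <!-- JVM options for each spawned map/reduce child process -->
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>
```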
