hadoop-yarn-dev mailing list archives

From Azuryy Yu <azury...@gmail.com>
Subject Re: RM ha issuses
Date Tue, 01 Apr 2014 02:06:47 GMT
Hi Karthik,
I ran a plain MR job, and it does work well during RM failover.

job progress:
(the failover line, shown in red in the original message, is the ConfiguredRMFailoverProxyProvider entry)

14/04/01 10:01:38 INFO mapreduce.Job:  map 61% reduce 8%
14/04/01 10:01:40 INFO mapreduce.Job:  map 61% reduce 10%
14/04/01 10:01:41 INFO mapreduce.Job:  map 62% reduce 10%
14/04/01 10:01:44 INFO mapreduce.Job:  map 63% reduce 10%
14/04/01 10:01:47 INFO mapreduce.Job:  map 64% reduce 10%
14/04/01 10:02:36 INFO mapreduce.Job:  map 60% reduce 0%
14/04/01 10:02:40 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2
14/04/01 10:03:00 INFO mapreduce.Job:  map 63% reduce 0%
14/04/01 10:03:02 INFO mapreduce.Job:  map 66% reduce 2%
14/04/01 10:03:04 INFO mapreduce.Job:  map 67% reduce 2%
14/04/01 10:03:06 INFO mapreduce.Job:  map 69% reduce 2%
14/04/01 10:03:08 INFO mapreduce.Job:  map 71% reduce 2%
14/04/01 10:03:10 INFO mapreduce.Job:  map 72% reduce 2%

So it is the Hive job whose tasks are all restarted during failover, while the plain MR job keeps most of its progress. Please take a look.
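
To make the progress drop easier to spot in logs like these, here is a minimal sketch that scans progress lines and reports where the map or reduce percentage goes backwards. The function name `find_regressions` and the regular expression are my own illustration, not part of Hadoop or Hive; the pattern is only assumed to cover the two line formats quoted in this thread.

```python
import re

# Assumed pattern covering both formats seen in this thread:
#   "... INFO mapreduce.Job:  map 64% reduce 10%"
#   "... Stage-1 map = 68%,  reduce = 0%, Cumulative CPU"
PROGRESS = re.compile(r"map\s*=?\s*(\d+)%,?\s+reduce\s*=?\s*(\d+)%")


def find_regressions(lines):
    """Return (previous, current) progress pairs where map or reduce went backwards."""
    regressions = []
    previous = None
    for line in lines:
        match = PROGRESS.search(line)
        if match is None:
            continue  # not a progress line, e.g. the "Failing over to rm2" entry
        current = (int(match.group(1)), int(match.group(2)))
        if previous is not None and (
            current[0] < previous[0] or current[1] < previous[1]
        ):
            regressions.append((previous, current))
        previous = current
    return regressions


# A few lines copied from the MR job log above.
log = [
    "14/04/01 10:01:47 INFO mapreduce.Job:  map 64% reduce 10%",
    "14/04/01 10:02:36 INFO mapreduce.Job:  map 60% reduce 0%",
    "14/04/01 10:02:40 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to rm2",
    "14/04/01 10:03:00 INFO mapreduce.Job:  map 63% reduce 0%",
]

print(find_regressions(log))  # [((64, 10), (60, 0))]
```

On the MR lines this flags only the 64%/10% to 60%/0% step around the failover. Run over the Hive log quoted below, the same check would flag the fall from 86% to 82% and then to 0%.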



On Tue, Apr 1, 2014 at 7:20 AM, Azuryy <azuryyyu@gmail.com> wrote:

> I will run a MR job to verify it.
>
> By "stop RM" I mean running yarn-daemon.sh stop resourcemanager.
>
> Thanks
>
> > On April 1, 2014, at 0:38, Karthik Kambatla <kasha@cloudera.com> wrote:
> >
> > Thanks for reporting this, Azuryy. Indeed, this is surprising.
> >
> > I don't quite understand how Hive works; do you mind running a vanilla MR
> > job and verifying if this is indeed the case? Also, when you say you
> > stopped the Active RM, you mean only the RM process - correct?
> >
> >
> >> On Mon, Mar 31, 2014 at 3:46 AM, Azuryy Yu <azuryyyu@gmail.com> wrote:
> >>
> >> Hi,
> >>
> >> I built from trunk and configured RM HA, then submitted a Hive job with
> >> 11 maps in total. I stopped the active RM when 6 maps had finished.
> >>
> >> But Hive shows all map tasks restarting from scratch. This conflicts with
> >> the design description.
> >>
> >> job progress:
> >> 2014-03-31 18:44:14,088 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU
> >> 713.84 sec
> >> 2014-03-31 18:44:15,128 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU
> >> 722.83 sec
> >> 2014-03-31 18:44:16,160 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU
> >> 731.95 sec
> >> 2014-03-31 18:44:17,191 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU
> >> 744.17 sec
> >> 2014-03-31 18:44:18,220 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU
> >> 756.22 sec
> >> 2014-03-31 18:44:19,250 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU
> >> 762.4 sec
> >> 2014-03-31 18:44:20,281 Stage-1 map = 68%,  reduce = 0%, Cumulative CPU
> >> 774.64 sec
> >> 2014-03-31 18:44:21,306 Stage-1 map = 70%,  reduce = 0%, Cumulative CPU
> >> 786.49 sec
> >> 2014-03-31 18:44:22,334 Stage-1 map = 70%,  reduce = 0%, Cumulative CPU
> >> 792.59 sec
> >> 2014-03-31 18:44:23,363 Stage-1 map = 73%,  reduce = 0%, Cumulative CPU
> >> 807.58 sec
> >> 2014-03-31 18:44:24,392 Stage-1 map = 77%,  reduce = 0%, Cumulative CPU
> >> 815.96 sec
> >> 2014-03-31 18:44:25,416 Stage-1 map = 80%,  reduce = 0%, Cumulative CPU
> >> 823.83 sec
> >> 2014-03-31 18:44:26,443 Stage-1 map = 80%,  reduce = 0%, Cumulative CPU
> >> 826.84 sec
> >> 2014-03-31 18:44:27,472 Stage-1 map = 82%,  reduce = 0%, Cumulative CPU
> >> 832.16 sec
> >> 2014-03-31 18:44:28,501 Stage-1 map = 84%,  reduce = 0%, Cumulative CPU
> >> 839.73 sec
> >> 2014-03-31 18:44:29,531 Stage-1 map = 86%,  reduce = 0%, Cumulative CPU
> >> 844.45 sec
> >> 2014-03-31 18:44:30,564 Stage-1 map = 82%,  reduce = 0%, Cumulative CPU
> >> 760.34 sec
> >> 2014-03-31 18:44:31,728 Stage-1 map = 0%,  reduce = 0%
> >> 2014-03-31 18:45:06,918 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU
> >> 213.81 sec
> >> 2014-03-31 18:45:07,952 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU
> >> 216.83 sec
> >> 2014-03-31 18:45:08,979 Stage-1 map = 7%,  reduce = 0%, Cumulative CPU
> >> 229.15 sec
> >> 2014-03-31 18:45:10,007 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU
> >> 244.42 sec
> >> 2014-03-31 18:45:11,040 Stage-1 map = 14%,  reduce = 0%, Cumulative CPU
> >> 247.31 sec
> >> 2014-03-31 18:45:12,072 Stage-1 map = 18%,  reduce = 0%, Cumulative CPU
> >> 259.5 sec
> >> 2014-03-31 18:45:13,105 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 274.72 sec
> >> 2014-03-31 18:45:14,135 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 280.76 sec
> >> 2014-03-31 18:45:15,170 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 292.9 sec
> >> 2014-03-31 18:45:16,202 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 305.16 sec
> >> 2014-03-31 18:45:17,233 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 314.21 sec
> >> 2014-03-31 18:45:18,264 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 323.34 sec
> >> 2014-03-31 18:45:19,294 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 335.6 sec
> >> 2014-03-31 18:45:20,325 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 344.71 sec
> >> 2014-03-31 18:45:21,355 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 353.8 sec
> >> 2014-03-31 18:45:22,385 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 366.06 sec
> >> 2014-03-31 18:45:23,415 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 375.2 sec
> >> 2014-03-31 18:45:24,449 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 384.28 sec
> >> 2014-03-31 18:45:25,481 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU
> >> 396.54 sec
> >> 2014-03-31 18:45:26,512 Stage-1 map = 25%,  reduce = 0%, Cumulative CPU
> >> 408.72 sec
> >> 2014-03-31 18:45:27,549 Stage-1 map = 25%,  reduce = 0%, Cumulative CPU
> >> 414.69 sec
> >> 2014-03-31 18:45:28,582 Stage-1 map = 30%,  reduce = 0%, Cumulative CPU
> >> 426.99 sec
> >> 2014-03-31 18:45:29,614 Stage-1 map = 32%,  reduce = 0%, Cumulative CPU
> >> 439.25 sec
> >> 2014-03-31 18:45:30,653 Stage-1 map = 34%,  reduce = 0%, Cumulative CPU
> >> 448.25 sec
> >> 2014-03-31 18:45:31,683 Stage-1 map = 39%,  reduce = 0%, Cumulative CPU
> >> 460.5 sec
> >> 2014-03-31 18:45:32,723 Stage-1 map = 41%,  reduce = 0%, Cumulative CPU
> >> 469.63 sec
> >> 2014-03-31 18:45:33,754 Stage-1 map = 43%,  reduce = 0%, Cumulative CPU
> >> 478.67 sec
> >>
>
