From: "Bilwa S T (Jira)"
To: mapreduce-issues@hadoop.apache.org
Date: Sat, 9 May 2020 17:22:00 +0000 (UTC)
Subject: [jira] [Comment Edited] (MAPREDUCE-7169) Speculative attempts should not run on the same node

[ https://issues.apache.org/jira/browse/MAPREDUCE-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103381#comment-17103381 ]

Bilwa S T edited comment on MAPREDUCE-7169
at 5/9/20, 5:21 PM:
---------------------------------------------------------------

Hi [~ahussein]

What we are trying to achieve here is that a speculative attempt should not be launched on a faulty node. So even if the task gets killed, there is no point launching it on that node again, since it will be slow. This is the expected behaviour.

{quote}
* Assuming that a new speculative attempt is created. Following the implementation, the new attempt X will have blacklisted nodes and skipped racks relevant to the original taskAttempt Y.
* Assuming taskAttempt Y is killed before attempt X gets assigned.
* The RMContainerAllocator would still assign a host to attempt X based on the dated blacklists.

Is this the expected behavior? Or is it supposed to clear attempt X's blacklisted nodes?
{quote}

Yes, I think these two cases should be handled.

{quote}
* Should that object be synchronized? I believe there is more than one thread reading/writing to that object. Perhaps changing {{taskAttemptToEventMapping}} to {{ConcurrentHashMap}} would be sufficient. What do you think?
* In {{taskAttemptToEventMapping}}, the data is only removed when the taskAttempt is assigned. If the taskAttempt is killed before being assigned, {{taskAttemptToEventMapping}} would still hold the taskAttempt.
{quote}

Will update this.

{quote}
* Racks are going to be blacklisted too, not just nodes. I believe that the javadoc and the description in default.xml should emphasize that enabling the flag also avoids the local rack unless no other rack is available for scheduling.
{quote}

Actually, when a task attempt is killed, its Avataar is VIRGIN by default. This is a defect which needs to be addressed: if a speculative task attempt is killed, it is relaunched as a normal task attempt.

{quote}
* Why do we need {{mapTaskAttemptToAvataar}} when each taskAttempt has a field called {{avataar}}?
{quote}

How do you get the taskAttempt details in RMContainerAllocator?
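The thread-safety and cleanup concerns above can be sketched roughly as follows. This is not the actual {{TaskAttemptBlacklistManager}} code; the String ids and List<String> node lists are simplified stand-ins for the real Hadoop types, just to show the ConcurrentHashMap change and the extra removal path on kill:

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified stand-in for the mapping discussed above: a ConcurrentHashMap
// lets the allocator thread and the event-dispatch thread read/write the
// map without an external lock, and calling remove() on both the "assigned"
// and the "killed" paths avoids leaking entries for attempts that are
// killed before they are ever assigned.
public class BlacklistSketch {
    private final ConcurrentMap<String, List<String>> taskAttemptToEventMapping =
            new ConcurrentHashMap<>();

    void addBlacklist(String attemptId, List<String> blacklistedNodes) {
        taskAttemptToEventMapping.put(attemptId, blacklistedNodes);
    }

    // Called when the attempt is assigned a container: consume the entry.
    List<String> consumeOnAssign(String attemptId) {
        return taskAttemptToEventMapping.remove(attemptId);
    }

    // Also called when the attempt is killed before assignment,
    // so the stale blacklist entry does not linger.
    void cleanupOnKill(String attemptId) {
        taskAttemptToEventMapping.remove(attemptId);
    }

    int size() {
        return taskAttemptToEventMapping.size();
    }
}
```

ConcurrentHashMap alone makes individual get/put/remove calls safe across threads; any compound check-then-act sequence in the real code would still need compute()/remove(key, value) style atomic operations.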
{quote}
* That's a design issue. One would expect that RequestEvent's lifetime should not survive the {{handle()}} call. Therefore, the metadata should be consumed by the handlers. In the patch, {{ContainerRequestEvent.blacklistedNodes}} could be a field in taskAttempt. Then you won't need the {{TaskAttemptBlacklistManager}} class.
{quote}

Thanks

> Speculative attempts should not run on the same node
> ----------------------------------------------------
>
>                 Key: MAPREDUCE-7169
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7169
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: yarn
>    Affects Versions: 2.7.2
>            Reporter: Lee chen
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: MAPREDUCE-7169-001.patch, MAPREDUCE-7169-002.patch, MAPREDUCE-7169-003.patch, MAPREDUCE-7169.004.patch, MAPREDUCE-7169.005.patch, image-2018-12-03-09-54-07-859.png
>
>
> I found that in all versions of YARN, speculative execution may place the speculative task on the same node as the original task. What I have read only says that it will try to have one more task attempt; I have not seen any place mentioning that it should not be on the same node. This is unreasonable: if the node has some problem that makes task execution very slow, then placing the speculative task on the same node cannot help the problematic task.
> In our cluster (version 2.7.2, 2700 nodes), this phenomenon appears almost every day.
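The behaviour the issue asks for can be sketched as a simple host filter. This is not the patch's actual implementation; the helper name, the String host ids, and the fallback rule are illustrative assumptions, showing only the core idea of excluding the original attempt's node unless nothing else remains:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the requested behaviour: when choosing
// candidate hosts for a speculative attempt, drop the host that ran the
// slow original attempt, falling back to the full list only if filtering
// would leave no candidates at all.
public class SpeculativeHostFilter {
    static List<String> filterHosts(List<String> candidateHosts, String originalHost) {
        List<String> filtered = new ArrayList<>();
        for (String host : candidateHosts) {
            if (!host.equals(originalHost)) {
                filtered.add(host);
            }
        }
        // Degrade gracefully: a same-node speculative attempt is still
        // better than no speculative attempt.
        return filtered.isEmpty() ? candidateHosts : filtered;
    }
}
```

The same shape extends to racks, per the review comment above: filter the original attempt's rack first, and fall back to it only when no other rack can be scheduled.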
> !image-2018-12-03-09-54-07-859.png!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)