From: Enrico Olivelli
Date: Mon, 28 Oct 2019 08:00:16 +0100
Subject: Re: String inconsistency issue when running ZK with OpenJDK 10 on SKL machines
To: UserZooKeeper
Cc: DevZooKeeper

Fangmin,

On Mon, 28 Oct 2019, 02:23 Fangmin Lv wrote:

> Hey everyone,
>
> (Forgot to add subject in the previous email, resent with clear subject.)
>
> I'd like to share some weird inconsistency bugs we saw recently in prod,
> along with the root cause and potential fixes. It took us around a month
> to investigate, reproduce and find the root cause; hopefully the
> information here will help people avoid hitting the same issue.
>
> [Trigger conditions and behavior]
>
> The inconsistency issue only happened when running ZK with OpenJDK 10 on
> SKL machines. It is caused not by bugs inside ZK but by a macro-assembly
> bug inside the JDK.
>
> The symptoms may include:
>
> * NONODE returned by getData for a child that getChildren reports as
>   existing, with no delete issued
> * NONODE error returned when trying to create a child under a parent node
>   that was just successfully created, with no delete issued
> * No client is able to acquire the lock even though the previous session
>   that held the lock is already dead
>
> [Root cause]
>
> The direct cause of the misbehavior above is that a key/value put into
> the ZooKeeperServer.outstandingChangesForPath HashMap or the
> DataNode.children HashSet is not visible to a later get or remove. As a
> result, outstanding changes are not visible when the leader prepares the
> following txns, or a node is deleted but not removed from
> DataNode.children.
>
> The 'bad' HashMap/HashSet behavior is not due to concurrency bugs inside
> ZK, but to a bug in the macro assembly used to generate the
> String.equals intrinsic assembly code in JDK 9 and 10.
> The bug was introduced in JDK-8144771, which added AVX-512 instruction
> support to the JDK to optimize the String.equals intrinsic with 512-bit
> vector operations. Due to the bug, the String.equals method may return
> an incorrect result when using the high bank of CPU registers
> (xmm16 - xmm31) with a non-empty stack on SKL machines where AVX-512 is
> available.
>
> The macro-assembly bug we hit is in vptest, which is used in the
> string_compare macro assembly code
> <http://hg.openjdk.java.net/jdk/jdk10/file/b09e56145e11/src/hotspot/cpu/x86/macroAssembler_x86.cpp#l4933>.
> It uses add/sub instructions when temporarily saving and restoring
> register values on the stack, which overwrites the ZF (zero flag) in the
> FLAGS register set by the preceding test instruction.
>
> In our case, if the key exists in the DataNode.children HashSet, the
> test instruction result will be zero and the ZF bit will be set to 1. If
> the RSP value is not 0 (e.g. the stack is not empty) after the addptr
> code, the ZF bit will be changed to 0, so the String.equals comparison
> during removeNode will incorrectly return false, and the key won't be
> removed.
>
> A bug is reported in JDK-8207746; the behavior described there is
> different. We've confirmed the issue by adding assembly code to log it
> in JDK 10.
>
> [Solutions]
>
> The possible mitigations are:
>
> 1. Disabling AVX-512 with the JVM option -XX:UseAVX=2
> 2. Using an OpenJDK version higher than 10, which has fixed the issue in
>    JDK-8207746
>
> Upgrading to OpenJDK 11+ is the better option, since 10 is not well
> supported and AVX-512 does help improve performance.
>
> We use JDK 10 because of the SSL quorum socket close stall issue
> mentioned in ZOOKEEPER-3384, and the SO_LINGER option is not honored in
> JDK 11. We've unblocked JDK 11 by asynchronously closing the quorum
> socket, and we're upstreaming that in ZOOKEEPER-3574.
>
> Thanks,
> Fangmin
>

Thank you for sharing this.
Do you have any pointers to the JDK 11 bugs? Are they solved in 12+?
I am running with JDK 11-13 but without SSL, so I have never seen problems.

Enrico
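
For anyone who wants to check their own servers against the two mitigations quoted above, a startup check along these lines might help. This is only a sketch: the class Avx512Guard and its helper methods are hypothetical names (not part of ZooKeeper or the JDK), and it assumes that when -XX:UseAVX is given more than once the last occurrence wins.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of ZooKeeper or the JDK.
class Avx512Guard {
    private static final String FLAG = "-XX:UseAVX=";

    // Parses java.specification.version: "1.8" -> 8, "10" -> 10.
    static int featureVersion(String specVersion) {
        String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
        return Integer.parseInt(v);
    }

    // The thread reports the faulty String.equals intrinsic in JDK 9 and 10.
    static boolean isAffectedJdk(int feature) {
        return feature == 9 || feature == 10;
    }

    // True when the last -XX:UseAVX=N argument caps AVX at level 2 or lower.
    // Assumption: a later occurrence of the flag overrides an earlier one.
    static boolean avxCapped(List<String> jvmArgs) {
        boolean capped = false;
        for (String arg : jvmArgs) {
            if (arg.startsWith(FLAG)) {
                try {
                    capped = Integer.parseInt(arg.substring(FLAG.length())) <= 2;
                } catch (NumberFormatException e) {
                    capped = false; // malformed value: treat as not capped
                }
            }
        }
        return capped;
    }

    public static void main(String[] args) {
        int feature = featureVersion(System.getProperty("java.specification.version"));
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (isAffectedJdk(feature) && !avxCapped(jvmArgs)) {
            System.err.println("WARNING: JDK " + feature
                    + " without -XX:UseAVX=2; String.equals may misbehave on AVX-512 hardware");
        } else {
            System.out.println("JDK " + feature + ": no action suggested by this check");
        }
    }
}
```

The check is deliberately conservative: a missing flag only means no explicit cap was passed on the command line, and the machine may not support AVX-512 at all, so a warning here is a prompt to look closer rather than proof of exposure.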