From: Jason Heo
Date: Fri, 16 Jun 2017 17:59:36 +0900
Subject: Re: tserver died by clock unsync.
To: user@kudu.apache.org

Hi.

Congratulations on Apache Kudu 1.4.0.

To prevent the tserver from dying unexpectedly, I've changed LOG(FATAL) to LOG(WARNING).

I'd like to know whether it is safe to continue when ntp_gettime() in GetClockTime returns TIME_ERROR.

Could anyone help me?

Regards,

Jason

2017-06-15 12:40 GMT+09:00 Jason Heo <jason.heo.sde@gmail.com>:

> Hi,
>
> I'm using Apache Kudu 1.4.0.
>
> Yesterday, 6 tservers died at the same time. The following message was
> logged on each tserver.
>
> > F0614 14:58:32.868551 111454 hybrid_clock.cc:227]
> > Couldn't get the current time: Clock unsynchronized.
> > Status: Service unavailable:
> > Error reading clock. Clock considered unsynchronized
>
> We are already using ntpd, and the following ntpd-related message was
> logged in /var/log/messages.
>
> > Jun 14 14:58:38 hostname ntpdate[10231]: step time server ip_addr offset -0.000168 sec
>
> We use our own NTP service. I don't know the exact cause, but I suspect
> that our NTP service malfunctioned or that the network was temporarily
> unreliable.
>
> The problem is that this could happen again and again.
>
> So, I'm considering changing LOG(FATAL) to LOG(WARNING) in the Kudu source
> code so that the tserver does not exit when the clock is unsynchronized.
>
> >   uint64_t now_usec;
> >   uint64_t error_usec;
> >   Status s = WalltimeWithError(&now_usec, &error_usec);
> >   if (PREDICT_FALSE(!s.ok())) {
> >     LOG(FATAL) << Substitute("Couldn't get the current time: Clock unsynchronized. "
> >         "Status: $0", s.ToString());
> >   }
>
> So, my question is: is it OK to change LOG(FATAL) to LOG(WARNING) in the
> code above? And will this prevent the tserver from dying when the clock is
> unsynchronized?
>
> Thanks.
>
> Regards,
> Jason
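
For reference, a minimal sketch of the check under discussion, assuming Linux with glibc's ntp_gettime() from <sys/timex.h>. The names IsClockUnusable and ReadMaxErrorUsec are hypothetical and are not Kudu's; this is not the code in hybrid_clock.cc. It reports an unsynchronized clock to the caller instead of aborting, which is roughly what replacing LOG(FATAL) with LOG(WARNING) would amount to. Whether continuing in that state is safe for Kudu is exactly the open question in this thread.

```cpp
#include <sys/timex.h>  // ntp_gettime(), struct ntptimeval, TIME_ERROR (Linux/glibc)

#include <cstdint>

// Hypothetical helper, not part of Kudu: classifies the return code of
// ntp_gettime(). TIME_ERROR means the kernel considers the clock
// unsynchronized; a negative value means the call itself failed.
inline bool IsClockUnusable(int ntp_rc) {
  return ntp_rc < 0 || ntp_rc == TIME_ERROR;
}

// Hypothetical helper, not part of Kudu: reads the kernel's maximum clock
// error in microseconds. Returns false and leaves *max_error_usec untouched
// when the clock is unsynchronized, so the caller decides how to react
// (warn and retry, refuse requests, or abort).
inline bool ReadMaxErrorUsec(uint64_t* max_error_usec) {
  ntptimeval tv{};
  const int rc = ntp_gettime(&tv);
  if (IsClockUnusable(rc)) {
    return false;
  }
  // ntptimeval::maxerror is reported in microseconds.
  *max_error_usec = static_cast<uint64_t>(tv.maxerror);
  return true;
}
```

A caller could log a warning and retry when ReadMaxErrorUsec() returns false; what it must not do is treat the untouched output value as a valid error bound.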